Byte Size Series - Limiting CPU and Memory Usage with Cgroups

The rise of containers and their adoption across the industry with technologies like Docker and Kubernetes has been remarkable, and with the advent of AI, these technologies are poised to become fundamental not only for training and inference of AI models but also for hosting AI services themselves. Beneath these abstractions lies the fundamental idea that allows us to start a Linux process while limiting its CPU and memory resources: Control Groups, also known as Cgroups. In this entry of the byte size series, we'll explore how we can use systemd, one of the most widely used init systems across Linux distros, to configure and add a process to a Cgroup.

What is a Cgroup?

From the man pages, we get the following:

Control groups, usually referred to as cgroups, are a Linux kernel feature which allow processes to be organized into hierarchical groups whose usage of various types of resources can then be limited and monitored. The kernel’s cgroup interface is provided through a pseudo-filesystem called cgroupfs. Grouping is implemented in the core cgroup kernel code, while resource tracking and limits are implemented in a set of per-resource-type subsystems (memory, CPU, and so on).

From this definition, we see that interacting with the Cgroup interface is done via the pseudo-filesystem cgroupfs, usually mounted at /sys/fs/cgroup. To create a Cgroup manually, we would simply create a subdirectory, which then gets populated with files used to manipulate the Cgroup configuration.
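As a quick illustration (not specific to this article's setup), any Linux process can see which Cgroup it belongs to by reading `/proc/self/cgroup`:

```python
# Show which cgroup(s) the current process belongs to.
# On a cgroup v2 (unified hierarchy) system this file contains a single
# line of the form "0::/<path>", where <path> is relative to /sys/fs/cgroup.
with open("/proc/self/cgroup") as f:
    for line in f:
        hierarchy_id, controllers, path = line.strip().split(":", 2)
        print(f"hierarchy={hierarchy_id} controllers={controllers or 'v2'} path={path}")
```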

Let’s move one level above this low-level abstraction and use systemd to control a Cgroup, which makes the work much easier. Under the hood, systemd performs the same operations on the pseudo-filesystem that you would otherwise run by hand through shell commands; creating a Cgroup manually, for example, comes down to running mkdir inside /sys/fs/cgroup.
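As a sketch, the manual route looks like this. These commands require root and assume a cgroup v2 hierarchy mounted at /sys/fs/cgroup; `demo` is a hypothetical group name:

```bash
# Create a cgroup: the kernel populates the new directory with control files
sudo mkdir /sys/fs/cgroup/demo
ls /sys/fs/cgroup/demo                                 # memory.max, cpu.max, cgroup.procs, ...

# Limit memory to 1 GiB, and CPU to 20% (20ms of CPU time per 100ms period)
echo 1G | sudo tee /sys/fs/cgroup/demo/memory.max
echo "20000 100000" | sudo tee /sys/fs/cgroup/demo/cpu.max

# Move the current shell into the group by writing its PID
echo $$ | sudo tee /sys/fs/cgroup/demo/cgroup.procs
```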

Adding a Process to a Cgroup with Systemd (Transient Setup)

To demonstrate Cgroups, we will use spin_loop.py, a simple program that loops forever, allocating more memory on each iteration.

spin_loop.py
# Allocate memory indefinitely: each iteration appends another
# 100-element list, so memory usage grows without bound.
data = []
while True:
    data.append([0] * 100)

Let us now add this process to a Cgroup via systemd-run:

BASH
systemd-run -u eatmem -p CPUQuota=20% -p MemoryMax=1G python ~/spin_loop.py

In the command above:

- -u eatmem names the transient unit (eatmem.service), so we can refer to it later.
- -p CPUQuota=20% caps the process at 20% of a single CPU.
- -p MemoryMax=1G caps its memory usage at 1 GiB.

The CPU quota simply throttles the process, but whenever the program exceeds the memory limit (as it will in this example by consuming memory indefinitely), the out-of-memory killer is triggered. By running the following command, we can confirm the status of our unit:

BASH
systemctl status eatmem.service
PLAINTEXT
Loaded: loaded (/run/systemd/transient/eatmem.service; transient)
Transient: yes
Active: failed (Result: oom-kill)
...
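The OOM kill is also recorded in the unit's journal, which we can read with journalctl (this assumes the eatmem unit from above was run on this machine):

```bash
# Show the journal entries for the transient unit, oldest first
journalctl -u eatmem.service --no-pager
```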
Garbage collecting the unit

By default, systemd does not clean up the transient unit, so if you run the same systemd-run command once again it will fail, complaining that eatmem.service already exists.

This default behavior is what enables us to inspect logs and status afterwards. If you would like to clean up the unit manually, you can run:

BASH
sudo systemctl reset-failed eatmem.service

If you would like systemd to automatically garbage-collect the transient unit, run the command with the --collect flag:

BASH
systemd-run --collect -u eatmem -p CPUQuota=20% -p MemoryMax=1G python ~/spin_loop.py

Just like that, we have prevented the process from going wild. Very cool!

Configuring a Cgroup (Persistent Setup)

There is one more thing: if we ever wanted to persist this configuration, that is, a Cgroup that limits the CPU quota and maximum memory to the values we specify, we would need to define a slice; otherwise the Cgroup is tied to the lifetime of the process that was invoked in it. We can achieve that as follows.

sliceconfig
[Slice]
CPUQuota=20%
MemoryMax=1G
BASH
sudo cp sliceconfig /etc/systemd/system/eatmem.slice
sudo systemctl daemon-reload   # make systemd pick up the new unit file

Now we can place any process into our eatmem.slice as we did before:

BASH
systemd-run -u eatmem --slice=eatmem.slice python ~/spin_loop.py

If we add more processes, their cumulative CPU and memory usage will be limited according to our slice configuration.
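For instance, we can start a second instance under the same slice (the unit name eatmem2 is hypothetical) and then view the slice's cgroup tree and aggregate accounting, where both units share the 20% CPU and 1G memory budget:

```bash
# Start a second process in the same slice, under a different unit name
systemd-run -u eatmem2 --slice=eatmem.slice python ~/spin_loop.py

# Show the cgroup tree for the slice: both units appear under it
systemd-cgls /eatmem.slice

# Aggregate resource usage for the whole slice
systemctl status eatmem.slice
```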

Byte Bye

Cgroups underpin one of the most powerful and important technologies in containerization. Managing them via systemd is still a relatively low-level abstraction, but an important one to understand: container runtimes like containerd can use systemd as their cgroup driver to control container resource usage. This protects the system from resource-hungry processes while distributing compute fairly.

Did you know about Cgroups?
