The oom-killer generally has a bad reputation among Linux users. This may be part of the reason Linux invokes it only when it has absolutely no other choice. It will swap out the desktop environment, drop the whole page cache and empty every buffer before it will ultimately kill a process. At least that's what I think it will do. I have yet to be patient enough to wait for it, sitting in front of an unresponsive system.
This made me and other people wonder if the oom-killer could be configured to step in earlier: reddit r/linux, superuser.com, unix.stackexchange.com.
As it turns out, no, it can't, at least not with the in-kernel oom-killer. In user space, however, we can do whatever we want.
earlyoom wants to be simple and solid. It is written in pure C with no dependencies. An extensive test suite (unit and integration tests) is written in Go.
earlyoom checks the amount of available memory and free swap up to 10 times a second (less often if there is a lot of free memory). By default, if both are below 10%, it will kill the largest process (highest oom_score). The percentage value is configurable via command line arguments.
In the free -m output below, the available memory is 2170 MiB and the free swap is 231 MiB.
              total        used        free      shared  buff/cache   available
Mem:           7842        4523         137         841        3182        2170
Swap:          1023         792         231
Why is "available" memory checked as opposed to "free" memory? On a healthy Linux system, "free" memory is supposed to be close to zero, because Linux uses all available physical memory to cache disk access. These caches can be dropped any time the memory is needed for something else.
The "available" memory accounts for that. It sums up all memory that is unused or can be freed immediately.
Note that you need a recent version of free and Linux kernel 3.14+ to see the "available" column. If you have a recent kernel, but an old version of free, you can get the value from grep MemAvailable /proc/meminfo.
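To make the check concrete, here is a minimal shell sketch of how the two percentages could be derived from /proc/meminfo. The helper name mem_report and its optional file argument are made up for illustration. It divides by MemTotal for simplicity, so its memory percentage can differ slightly from what earlyoom reports.

```sh
#!/bin/sh
# Sketch: print available memory and free swap as percentages of their totals.
# mem_report is a hypothetical helper, not part of earlyoom.
# Usage: mem_report [FILE]   (FILE defaults to /proc/meminfo)
mem_report() {
    awk '
        $1 == "MemTotal:"     { mem_total = $2 }
        $1 == "MemAvailable:" { mem_avail = $2 }
        $1 == "SwapTotal:"    { swap_total = $2 }
        $1 == "SwapFree:"     { swap_free = $2 }
        END {
            # Guard against division by zero, e.g. on systems without swap.
            mem_pct  = (mem_total > 0)  ? 100 * mem_avail / mem_total  : 0
            swap_pct = (swap_total > 0) ? 100 * swap_free / swap_total : 0
            printf "mem avail: %.2f%%, swap free: %.2f%%\n", mem_pct, swap_pct
        }
    ' "${1:-/proc/meminfo}"
}
```

Called without arguments, mem_report reads the live /proc/meminfo. Fed the numbers from the free -m output above, it prints "mem avail: 27.67%, swap free: 22.58%", so neither value is below a 10% minimum.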
When both your available memory and free swap drop below 10% of the total memory available to userspace processes (=total-shared), earlyoom will send the SIGTERM signal to the process that uses the most memory in the opinion of the kernel (/proc/*/oom_score).
See also nohang, a project similar to earlyoom, written in Python and with additional features and configuration options.
See also Facebook's pressure stall information (psi) kernel patches and the accompanying oomd userspace helper. The patches were merged in Linux 4.20.
earlyoom does not use echo f > /proc/sysrq-trigger because:
In some kernel versions (tested on v4.0.5), triggering the kernel oom killer manually does not work at all. That is, it may only free some graphics memory (that will be allocated immediately again) and not actually kill any process. Here you can see what this looks like on my machine (Intel integrated graphics).
This problem has been fixed in Linux v5.17 (commit f530243a).
Like the Linux kernel would, earlyoom finds its victim by reading through /proc/*/oom_score.
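The scan described above can be sketched in a few lines of shell. This is only an illustration of the idea, not earlyoom's C implementation: find_victim is a hypothetical helper, its directory argument exists only so it can be pointed at a test tree, and it ignores refinements such as --prefer, --avoid and --ignore.

```sh
#!/bin/sh
# Sketch: print "PID SCORE" for the process with the highest oom_score.
# find_victim is a hypothetical helper, not part of earlyoom.
# Usage: find_victim [PROC_DIR]   (PROC_DIR defaults to /proc)
find_victim() {
    proc_dir=${1:-/proc}
    best_pid=""
    best_score=-1
    for f in "$proc_dir"/[0-9]*/oom_score; do
        # Processes can exit while we scan; skip entries we cannot read.
        score=$(cat "$f" 2>/dev/null) || continue
        [ -n "$score" ] || continue
        if [ "$score" -gt "$best_score" ]; then
            best_score=$score
            pid_dir=${f%/oom_score}     # strip the trailing file name
            best_pid=${pid_dir##*/}     # keep only the PID component
        fi
    done
    [ -n "$best_pid" ] && echo "$best_pid $best_score"
}
```

Running find_victim with no arguments prints the PID and score of the process the kernel currently considers the best candidate.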
earlyoom itself uses about 2 MiB (VmRSS), though only 220 KiB is private memory (RssAnon). The rest is the libc library (RssFile) that is shared with other processes.
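If you want to check these numbers on your own machine, the fields can be read from the process's status file. A minimal sketch follows; rss_fields is a hypothetical helper name, and the pidof call in the comment assumes a single running earlyoom instance.

```sh
#!/bin/sh
# Sketch: show the resident memory fields from a process status file.
# rss_fields is a hypothetical helper, not part of earlyoom.
# Usage: rss_fields STATUS_FILE
rss_fields() {
    grep -E '^(VmRSS|RssAnon|RssFile):' "$1"
}
# Example: rss_fields "/proc/$(pidof earlyoom)/status"
```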
All memory is locked using mlockall() to make sure earlyoom does not slow down in low memory situations.
Compiling yourself is easy:
git clone https://github.com/rfjakob/earlyoom.git
cd earlyoom
make
Optional: Run the integrated self-tests:
make test
Start earlyoom automatically by registering it as a service:
sudo make install              # systemd
sudo make install-initscript   # non-systemd
Note that for systems with SELinux disabled (Ubuntu 19.04, Debian 9 ...) chcon warnings reporting failure to set the context can be safely ignored.
For Debian 10+ and Ubuntu 18.04+, there's a Debian package:
sudo apt install earlyoom
For Fedora and RHEL 8 with EPEL, there's a Fedora package:
sudo dnf install earlyoom
sudo systemctl enable --now earlyoom
For Arch Linux, there's an Arch Linux package:
sudo pacman -S earlyoom
sudo systemctl enable --now earlyoom
Availability in other distributions: see repology page.
Just start the executable you have just compiled:
./earlyoom
It will inform you how much memory and swap you have, what the minimum is, how much memory is available and how much swap is free.
./earlyoom
earlyoom v1.8
mem total: 23890 MiB, user mem total: 21701 MiB, swap total: 8191 MiB
sending SIGTERM when mem avail <= 10.00% and swap free <= 10.00%,
        SIGKILL when mem avail <=  5.00% and swap free <=  5.00%
mem avail: 20012 of 21701 MiB (92.22%), swap free: 5251 of 8191 MiB (64.11%)
mem avail: 20031 of 21721 MiB (92.22%), swap free: 5251 of 8191 MiB (64.11%)
mem avail: 20033 of 21723 MiB (92.22%), swap free: 5251 of 8191 MiB (64.11%)
[...]
If the values drop below the minimum, processes are killed until the values are above the minimum again. Every action is logged to stderr. If you are running earlyoom as a systemd service, you can view the last 10 lines using
systemctl status earlyoom
In order to see earlyoom in action, create/simulate a memory leak and let earlyoom do what it does:
tail /dev/zero
If you need any further actions after a process is killed by earlyoom (such as sending emails), you can parse the logs with:
sudo journalctl -u earlyoom | grep sending
Example output for the above test command (tail /dev/zero) will look like:
Feb 20 10:59:34 debian earlyoom[10231]: sending SIGTERM to process 7378 uid 1000 "tail": oom_score 156, VmRSS 4962 MiB
For older versions of earlyoom, use:
sudo journalctl -u earlyoom | grep -iE "(sending|killing)"
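For follow-up actions you usually want the individual fields rather than the raw log line. The sketch below extracts them with sed. parse_kills is a hypothetical helper, and it assumes the message format of the example output above; other earlyoom versions may word the message differently.

```sh
#!/bin/sh
# Sketch: turn earlyoom "sending SIG..." log lines into "SIGNAL PID UID NAME".
# parse_kills is a hypothetical helper; it reads log lines on stdin and
# assumes the message format shown in the example output above.
parse_kills() {
    sed -n -E 's/.*sending (SIG[A-Z]+) to process ([0-9]+) uid ([0-9]+) "([^"]*)".*/\1 \2 \3 \4/p'
}
# Example: sudo journalctl -u earlyoom | parse_kills
```

For the example line above this prints "SIGTERM 7378 1000 tail". Lines that do not match, such as the periodic memory reports, are dropped.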
Since version 1.6, earlyoom can send notifications about killed processes via the system d-bus. Pass -n to enable them.
To actually see the notifications in your GUI session, you need to have systembus-notify running as your user.
Additionally, earlyoom can execute a script for each process killed, providing information about the process via the EARLYOOM_PID, EARLYOOM_UID and EARLYOOM_NAME environment variables. Pass -N /path/to/script to enable.
Warning: In dryrun mode, the script will be executed in rapid succession, so make sure you have some sort of rate limit implemented.
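As an illustration, a script for -N with a simple rate limit might look like the sketch below. Only the three EARLYOOM_* variables come from earlyoom. LOG_FILE, STAMP_FILE, MIN_INTERVAL and on_kill are made-up names, and the /tmp defaults are placeholders: pick paths that the user running earlyoom can write to.

```sh
#!/bin/sh
# Sketch of a script for "-N": append one line per kill to a log file,
# but act at most once every MIN_INTERVAL seconds.
# LOG_FILE, STAMP_FILE and MIN_INTERVAL are hypothetical settings.
LOG_FILE=${LOG_FILE:-/tmp/earlyoom-kills.log}
STAMP_FILE=${STAMP_FILE:-/tmp/earlyoom-hook.stamp}
MIN_INTERVAL=${MIN_INTERVAL:-10}

on_kill() {
    now=$(date +%s)
    last=$(cat "$STAMP_FILE" 2>/dev/null || true)
    last=${last:-0}
    # Rate limit: do nothing if the previous action was too recent.
    if [ $((now - last)) -lt "$MIN_INTERVAL" ]; then
        return 0
    fi
    echo "$now" > "$STAMP_FILE"
    echo "killed pid=$EARLYOOM_PID uid=$EARLYOOM_UID name=$EARLYOOM_NAME" >> "$LOG_FILE"
}

# Only act when earlyoom has provided the process information.
if [ -n "${EARLYOOM_PID:-}" ]; then
    on_kill
fi
```

Replace the echo into LOG_FILE with whatever action you need, for example sending an email.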
The command-line flag --prefer specifies processes to prefer killing; likewise, --avoid specifies processes to avoid killing. See https://github.com/rfjakob/earlyoom/blob/master/MANPAGE.md#--prefer-regex for details.
If you are running earlyoom as a system service (through systemd or init.d), you can adjust its configuration via the file provided in /etc/default/earlyoom. The file already contains some examples in the comments, which you can use to build your own configuration based on the supported command line options, for example:
EARLYOOM_ARGS="-m 5 -r 60 --avoid '(^|/)(init|Xorg|ssh)$' --prefer '(^|/)(java|chromium)$'"
After adjusting the file, simply restart the service to apply the changes. For example, for systemd:
systemctl restart earlyoom
Please note that this configuration file has no effect on earlyoom instances outside of systemd/init.d.
earlyoom v1.8
Usage: ./earlyoom [OPTION]...

  -m PERCENT[,KILL_PERCENT] set available memory minimum to PERCENT of total
                            (default 10 %).
                            earlyoom sends SIGTERM once below PERCENT, then
                            SIGKILL once below KILL_PERCENT (default PERCENT/2).
  -s PERCENT[,KILL_PERCENT] set free swap minimum to PERCENT of total (default
                            10 %).
                            Note: both memory and swap must be below minimum for
                            earlyoom to act.
  -M SIZE[,KILL_SIZE]       set available memory minimum to SIZE KiB
  -S SIZE[,KILL_SIZE]       set free swap minimum to SIZE KiB
  -n                        enable d-bus notifications
  -N /PATH/TO/SCRIPT        call script after oom kill
  -g                        kill all processes within a process group
  -d, --debug               enable debugging messages
  -v                        print version information and exit
  -r INTERVAL               memory report interval in seconds (default 1), set
                            to 0 to disable completely
  -p                        set niceness of earlyoom to -20 and oom_score_adj to
                            -100
  --ignore-root-user        do not kill processes owned by root
  --sort-by-rss             find process with the largest rss (default oom_score)
  --prefer REGEX            prefer to kill processes matching REGEX
  --avoid REGEX             avoid killing processes matching REGEX
  --ignore REGEX            ignore processes matching REGEX
  --dryrun                  dry run (do not kill any processes)
  --syslog                  use syslog instead of std streams
  -h, --help                this help text
See the man page for details.
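The PERCENT[,KILL_PERCENT] syntax of -m and -s is easy to misread, so here is a small sketch of how such an argument expands into the two thresholds according to the help text. thresholds is a hypothetical helper written for this example and is not earlyoom's own argument parser.

```sh
#!/bin/sh
# Sketch: expand a "-m"/"-s" style argument into its two thresholds.
# thresholds is a hypothetical helper, not earlyoom's own parser.
# Usage: thresholds PERCENT[,KILL_PERCENT]
thresholds() {
    term_pct=${1%%,*}
    case $1 in
        *,*) kill_pct=${1#*,} ;;
        # No explicit KILL_PERCENT: default to half of PERCENT.
        *)   kill_pct=$(awk -v p="$term_pct" 'BEGIN { printf "%g", p / 2 }') ;;
    esac
    printf 'SIGTERM at %s%%, SIGKILL at %s%%\n' "$term_pct" "$kill_pct"
}
```

thresholds 10 prints "SIGTERM at 10%, SIGKILL at 5%", and thresholds 10,3 prints "SIGTERM at 10%, SIGKILL at 3%".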
Bug reports and pull requests are welcome via GitHub. In particular, I am glad to accept use case reports and feedback.
We don't use procps/libproc2 because procps_pids_select(), for some reason, always parses /proc/$pid/status. This is relatively expensive, and we don't need it.
v1.8.2, 2024-05-07
Fixes in earlyoom.service systemd unit file: add process_mrelease to allowed syscalls (commit), fix IPAddressDeny syntax (commit), allow -p (commit)
v1.8.1, 2024-04-17
Fix trivial test failures caused by message rewording (commit)
v1.8, 2024-04-15
Introduce user mem total / meminfo_t.UserMemTotal and calculate MemAvailablePercent based on it (commit, more info in man page)
Use process_mrelease (#266)
Support NO_COLOR (https://no-color.org/)
Don't get confused by processes with a zombie main thread (commit)
Add --sort-by-rss, thanks @RanHuang! This will select a process to kill according to the largest RSS instead of largest oom_score.
The Gitlab CI testsuite now also runs on Amazon Linux 2 and Oracle Linux 7.
v1.7, 2022-03-05
Add -N flag to run a script every time a process is killed (commit, man page section)
Add -g flag to kill whole process group (#247)
Remove -i flag (ignored for compatibility), it does not work properly on Linux kernels 5.9+ (#234)
Hardening: Drop ambient capabilities on startup (#234)
v1.6.2, 2020-10-14
Double-check memory situation before killing victim (commit)
Never terminate ourselves (#205)
Dump buffer on /proc/meminfo conversion error (#214)
1.6.1, 2020-07-07
Clean up dbus-send zombie processes (#200)
Skip processes with oom_score_adj=-1000 (210)
1.6, 2020-04-11
Replace old notify-send GUI notification logic with dbus-send / systembus-notify (#183). -n/-N now enables the new logic. You need to have systembus-notify running in your GUI session for notifications to work.
Handle /proc mounted with hidepid gracefully (issue #184)
v1.5, 2020-03-22
-p: set oom_score_adj to -100 instead of -1000 (#170)
Allow using both -M and -m, and -S and -s. The lower value (converted to percentages) will be used.
Set memory report interval in earlyoom.default to 1 hour instead of 1 minute (#177)
v1.4, 2020-03-01
Make victim selection logic 50% faster by lazy-loading process attributes
Log the user id uid of killed processes in addition to pid and name
Color debug log in light grey
Code clean-up: use block-local variables where possible, introduce PATH_LEN to replace several hardcoded buffer lengths
Expand testsuite (make test): run cppcheck when available, add unit-test benchmarks (make bench)
Drop root privileges in systemd unit file earlyoom.service
v1.3.1, 2020-02-27
Fix spurious testsuite failure on systems with a lot of RAM (issue #156)
v1.3, 2019-05-26
Wait for processes to actually exit when sending a signal. This fixes the problem that earlyoom sometimes kills more than one process when one would be enough (issue #121)
Be more liberal in what limits to accept for SIGTERM and SIGKILL (issue #97): don't exit with a fatal error if SIGTERM limit < SIGKILL limit, allow zero SIGKILL limit
Reformat startup output to make it clear that BOTH swap and mem must be <= limit
Add notify_all_users.py helper script
Add CODE_OF_CONDUCT.md (Contributor Covenant 1.4) (#102)
Fix possibly truncated UTF8 app names in log output (#110)
v1.2, 2018-10-28
Implement adaptive sleep time (= adaptive poll rate) to lower CPU usage further (issue #61)
Remove option to use kernel oom-killer (-k, now ignored for compatibility) (issue #80)
Gracefully handle the case of swap being added or removed after earlyoom was started (issue 62, commit)
Implement staged kill: first SIGTERM, then SIGKILL, with configurable limits (issue #67)
v1.1, 2018-07-07
Fix possible shell code injection through GUI notifications (commit)
On failure to kill any process, only sleep 1 second instead of 10 (issue #74)
Send the GUI notification after killing, not before (issue #73)
Accept --help in addition to -h
Fix wrong process name in log and in kill notification (commit 1, commit 2, issue #52, issue #65, issue #194)
Fix possible division by zero with -S (commit)
v1.0, 2018-01-28
Add --prefer and --avoid options (@TomJohnZ)
Add support for GUI notifications, add options -n and -N
v0.12: Add -M and -S options (@nailgun); add man page, parameterize Makefile (@yangfl)
v0.11: Fix undefined behavior in get_entry_fatal (missing return, commit)
v0.10: Allow overriding the Makefile's VERSION variable to make packaging easier, add -v command-line option
v0.9: If oom_score of all processes is 0, use VmRSS to find a victim
v0.8: Use a guesstimate if the kernel does not provide MemAvailable
v0.7: Select victim by oom_score instead of VmRSS, add options -i and -d
v0.6: Add command-line options -m, -s, -k
v0.5: Add swap support
v0.4: Add SysV init script (thanks @joeytwiddle), use the new MemAvailable from /proc/meminfo (needs Linux 3.14+, commit)
v0.2: Add systemd unit file
v0.1: Initial release