mirror of https://github.com/systemd/systemd synced 2026-04-14 19:15:04 +02:00

Compare commits

10 Commits

Author SHA1 Message Date
Frantisek Sumsal
d3a1710bc2 sd-varlink: fix a potential connection count leak
With the old version there was a potential connection count leak if
either of the two hashmap operations in count_connection() failed. In
that case we'd return from sd_varlink_server_add_connection_pair()
_before_ attaching the sd_varlink_server object to an sd_varlink object,
and since varlink_detach_server() is the only place where the connection
counter is decremented (called through sd_varlink_close() in various
error paths later _if_ the "server" object is not null, i.e. attached to
the sd_varlink object) we'd "leak" a connection every time this
happened. However, the potential of abusing this is very theoretical,
as one would need to hit OOM every time either of the hashmap operations
was executed for a while before exhausting the connection limit.

Let's just increment the connection counter after any potential error
path, so we don't have to deal with potential rollbacks.
2026-04-09 22:05:25 +01:00
Milan Kyselica
4b32ab5a36 udev: fix bounds check in dev_if_packed_info()
The check compared bLength against (size - sizeof(descriptor)), which
is an absolute limit unrelated to the current buffer position. Since
bLength is uint8_t (max 255), this can never exceed size - 9 for any
realistic input, making the check dead code.

Use (size - pos) instead so the check actually catches descriptors
that extend past the end of the read data.

Fixes: https://github.com/systemd/systemd/issues/41570
2026-04-09 21:48:09 +01:00
Daan De Meyer
5ade3f6a01 docs: Fix window in PRESSURE.md 2026-04-09 22:47:10 +02:00
Daan De Meyer
158f2d50bf docs: Update MEMORY_PRESSURE.md => PRESSURE.md
Make the doc more generic and mention all pressure types, not just
memory.
2026-04-09 22:47:10 +02:00
Daan De Meyer
594659da06 core: Add I/O pressure support 2026-04-09 22:47:10 +02:00
Daan De Meyer
316d17fcbd core: Add support for CPU pressure notifications
Works the same way as memory pressure notifications. Code is refactored
to work on enum arrays to reduce duplication.
2026-04-09 22:47:10 +02:00
Daan De Meyer
b3bb4fefde test-mempress: Support unprivileged operation 2026-04-09 22:47:10 +02:00
Daan De Meyer
1121c77526 test-mempress: Migrate to new assertion macros 2026-04-09 22:47:10 +02:00
Daan De Meyer
f08796065d compress: consolidate all compression into compress.c with dlopen
Move the push-based streaming compression API from import-compress.c
into compress.c and delete import-compress.c/h. This consolidates all
compression code in one place and makes all compression libraries
(liblzma, liblz4, libzstd, libz, libbz2) runtime-loaded via dlopen
instead of directly linked.

Introduce opaque Compressor/Decompressor types backed by a heap-
allocated struct defined only in compress.c, keeping all third-party
library headers out of compress.h.

Rewrite the per-codec fd-to-fd stream functions as thin wrappers around
the push API via generic compress_stream()/decompress_stream() taking a
Compression type parameter. Integrate LZ4 into this framework using the
LZ4 Frame API, eliminating all LZ4 special-casing.

Extend the Compression enum with COMPRESSION_GZIP and COMPRESSION_BZIP2
and add the corresponding blob, startswith, and stream functions for
both.

Rename the ImportCompress types and functions: ImportCompressType becomes
the existing Compression enum, ImportCompress becomes Compressor (with
Decompressor typedef), and all import_compress_*/import_uncompress_*
become compressor_*/decompressor_*. Rename dlopen_lzma() to dlopen_xz()
for consistency. Make compression_to_string() return lowercase by
default.

Add INT_MAX/UINT_MAX overflow checks for LZ4, zlib, and bzip2 blob
functions where the codec API uses narrower integer types than our
uint64_t parameters.

Migrate test-compress.c and test-compress-benchmark.c to the TEST()
macro framework, new assertion macros, and codec-generic loops instead
of per-codec duplication.

Co-developed-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-09 22:39:11 +02:00
Frantisek Sumsal
44d0f273fa portablectl: fix swapped arguments for setns()
Follow-up for 824fcb95c9e66abe6b350ebab6e0593498ff7aa1.
2026-04-09 20:43:42 +01:00
81 changed files with 4697 additions and 3044 deletions

@ -1,241 +1,4 @@
--- ---
title: Memory Pressure Handling layout: forward
category: Interfaces target: /PRESSURE
layout: default
SPDX-License-Identifier: LGPL-2.1-or-later
--- ---
# Memory Pressure Handling in systemd
When the system is under memory pressure (i.e. some component of the OS
requires memory allocation but there is only very little or none available),
it can attempt various things to make more memory available again ("reclaim"):
* The kernel can flush out memory pages backed by files on disk, under the
knowledge that it can reread them from disk when needed again. Candidate
pages are the many memory mapped executable files and shared libraries on
disk, among others.
* The kernel can flush out memory pages not backed by files on disk
("anonymous" memory, i.e. memory allocated via `malloc()` and similar calls,
or `tmpfs` file system contents) if there's swap to write it to.
* Userspace can proactively release memory it allocated but doesn't immediately
require back to the kernel. This includes allocation caches, and other forms
of caches that are not required for normal operation to continue.
The latter is what we want to focus on in this document: how to ensure
userspace process can detect mounting memory pressure early and release memory
back to the kernel as it happens, relieving the memory pressure before it
becomes too critical.
The effects of memory pressure during runtime generally are growing latencies
during operation: when a program requires memory but the system is busy writing
out memory to (relatively slow) disks in order make some available, this
generally surfaces in scheduling latencies, and applications and services will
slow down until memory pressure is relieved. Hence, to ensure stable service
latencies it is essential to release unneeded memory back to the kernel early
on.
On Linux the [Pressure Stall Information
(PSI)](https://docs.kernel.org/accounting/psi.html) Linux kernel interface is
the primary way to determine the system or a part of it is under memory
pressure. PSI makes available to userspace a `poll()`-able file descriptor that
gets notifications whenever memory pressure latencies for the system or a
control group grow beyond some level.
`systemd` itself makes use of PSI, and helps applications to do so too.
Specifically:
* Most of systemd's long running components watch for PSI memory pressure
events, and release allocation caches and other resources once seen.
* systemd's service manager provides a protocol for asking services to monitor
PSI events and configure the appropriate pressure thresholds.
* systemd's `sd-event` event loop API provides a high-level call
`sd_event_add_memory_pressure()` enabling programs using it to efficiently
hook into the PSI memory pressure protocol provided by the service manager,
with very few lines of code.
## Memory Pressure Service Protocol
If memory pressure handling for a specific service is enabled via
`MemoryPressureWatch=` the memory pressure service protocol is used to tell the
service code about this. Specifically two environment variables are set by the
service manager, and typically consumed by the service:
* The `$MEMORY_PRESSURE_WATCH` environment variable will contain an absolute
path in the file system to the file to watch for memory pressure events. This
will usually point to a PSI file such as the `memory.pressure` file of the
service's cgroup. In order to make debugging easier, and allow later
extension it is recommended for applications to also allow this path to refer
to an `AF_UNIX` stream socket in the file system or a FIFO inode in the file
system. Regardless of which of the three types of inodes this absolute path
refers to, all three are `poll()`-able for memory pressure events. The
variable can also be set to the literal string `/dev/null`. If so the service
code should take this as indication that memory pressure monitoring is not
desired and should be turned off.
* The `$MEMORY_PRESSURE_WRITE` environment variable is optional. If set by the
service manager it contains Base64 encoded data (that may contain arbitrary
binary values, including NUL bytes) that should be written into the path
provided via `$MEMORY_PRESSURE_WATCH` right after opening it. Typically, if
talking directly to a PSI kernel file this will contain information about the
threshold settings configurable in the service manager.
When a service initializes it hence should look for
`$MEMORY_PRESSURE_WATCH`. If set, it should try to open the specified path. If
it detects the path to refer to a regular file it should assume it refers to a
PSI kernel file. If so, it should write the data from `$MEMORY_PRESSURE_WRITE`
into the file descriptor (after Base64-decoding it, and only if the variable is
set) and then watch for `POLLPRI` events on it. If it detects the paths refers
to a FIFO inode, it should open it, write the `$MEMORY_PRESSURE_WRITE` data
into it (as above) and then watch for `POLLIN` events on it. Whenever `POLLIN`
is seen it should read and discard any data queued in the FIFO. If the path
refers to an `AF_UNIX` socket in the file system, the application should
`connect()` a stream socket to it, write `$MEMORY_PRESSURE_WRITE` into it (as
above) and watch for `POLLIN`, discarding any data it might receive.
To summarize:
* If `$MEMORY_PRESSURE_WATCH` points to a regular file: open and watch for
`POLLPRI`, never read from the file descriptor.
* If `$MEMORY_PRESSURE_WATCH` points to a FIFO: open and watch for `POLLIN`,
read/discard any incoming data.
* If `$MEMORY_PRESSURE_WATCH` points to an `AF_UNIX` socket: connect and watch
for `POLLIN`, read/discard any incoming data.
* If `$MEMORY_PRESSURE_WATCH` contains the literal string `/dev/null`, turn off
memory pressure handling.
(And in each case, immediately after opening/connecting to the path, write the
decoded `$MEMORY_PRESSURE_WRITE` data into it.)
Whenever a `POLLPRI`/`POLLIN` event is seen the service is under memory
pressure. It should use this as hint to release suitable redundant resources,
for example:
* glibc's memory allocation cache, via
[`malloc_trim()`](https://man7.org/linux/man-pages/man3/malloc_trim.3.html). Similar,
allocation caches implemented in the service itself.
* Any other local caches, such DNS caches, or web caches (in particular if
service is a web browser).
* Terminate any idle worker threads or processes.
* Run a garbage collection (GC) cycle, if the runtime environment supports it.
* Terminate the process if idle, and can be automatically started when
needed next.
Which actions precisely to take depends on the service in question. Note that
the notifications are delivered when memory allocation latency already degraded
beyond some point. Hence when discussing which resources to keep and which to
discard, keep in mind it's typically acceptable that latencies incurred
recovering discarded resources at a later point are acceptable, given that
latencies *already* are affected negatively.
In case the path supplied via `$MEMORY_PRESSURE_WATCH` points to a PSI kernel
API file, or to an `AF_UNIX` opening it multiple times is safe and reliable,
and should deliver notifications to each of the opened file descriptors. This
is specifically useful for services that consist of multiple processes, and
where each of them shall be able to release resources on memory pressure.
The `POLLPRI`/`POLLIN` conditions will be triggered every time memory pressure
is detected, but not continuously. It is thus safe to keep `poll()`-ing on the
same file descriptor continuously, and executing resource release operations
whenever the file descriptor triggers without having to expect overloading the
process.
(Currently, the protocol defined here only allows configuration of a single
"degree" of memory pressure, there's no distinction made on how strong the
pressure is. In future, if it becomes apparent that there's clear need to
extend this we might eventually add different degrees, most likely by adding
additional environment variables such as `$MEMORY_PRESSURE_WRITE_LOW` and
`$MEMORY_PRESSURE_WRITE_HIGH` or similar, which may contain different settings
for lower or higher memory pressure thresholds.)
## Service Manager Settings
The service manager provides two per-service settings that control the memory
pressure handling:
* The
[`MemoryPressureWatch=`](https://www.freedesktop.org/software/systemd/man/latest/systemd.resource-control.html#MemoryPressureWatch=)
setting controls whether to enable the memory pressure protocol for the
service in question.
* The `MemoryPressureThresholdSec=` setting allows configuring the threshold
when to signal memory pressure to the services. It takes a time value
(usually in the millisecond range) that defines a threshold per 1s time
window: if memory allocation latencies grow beyond this threshold
notifications are generated towards the service, requesting it to release
resources.
The `/etc/systemd/system.conf` file provides two settings that may be used to
select the default values for the above settings. If the threshold isn't
configured via the per-service nor system-wide option, it defaults to 100ms.
When memory pressure monitoring is enabled for a service via
`MemoryPressureWatch=` this primarily does three things:
* It enables cgroup memory accounting for the service (this is a requirement
for per-cgroup PSI)
* It sets the aforementioned two environment variables for processes invoked
for the service, based on the control group of the service and provided
settings.
* The `memory.pressure` PSI control group file associated with the service's
cgroup is delegated to the service (i.e. permissions are relaxed so that
unprivileged service payload code can open the file for writing).
## Memory Pressure Events in `sd-event`
The
[`sd-event`](https://www.freedesktop.org/software/systemd/man/latest/sd-event.html)
event loop library provides two API calls that encapsulate the
functionality described above:
* The
[`sd_event_add_memory_pressure()`](https://www.freedesktop.org/software/systemd/man/latest/sd_event_add_memory_pressure.html)
call implements the service-side of the memory pressure protocol and
integrates it with an `sd-event` event loop. It reads the two environment
variables, connects/opens the specified file, writes the specified data to it,
then watches it for events.
* The `sd_event_trim_memory()` call may be called to trim the calling
processes' memory. It's a wrapper around glibc's `malloc_trim()`, but first
releases allocation caches maintained by libsystemd internally. This function
serves as the default when a NULL callback is supplied to
`sd_event_add_memory_pressure()`.
When implementing a service using `sd-event`, for automatic memory pressure
handling, it's typically sufficient to add a line such as:
```c
(void) sd_event_add_memory_pressure(event, NULL, NULL, NULL);
```
right after allocating the event loop object `event`.
## Other APIs
Other programming environments might have native APIs to watch memory
pressure/low memory events. Most notable is probably GLib's
[GMemoryMonitor](https://docs.gtk.org/gio/iface.MemoryMonitor.html). As of GLib
2.86.0, it uses the per-cgroup PSI kernel file to monitor for memory pressure,
but does not yet read the environment variables recommended above.
In older versions, it used the per-system Linux PSI interface as the backend, but operated
differently than the above: memory pressure events were picked up by a system
service, which then propagated this through D-Bus to the applications. This was
typically less than ideal, since this means each notification event had to
traverse three processes before being handled. This traversal created
additional latencies at a time where the system is already experiencing adverse
latencies. Moreover, it focused on system-wide PSI events, even though
service-local ones are generally the better approach.

255
docs/PRESSURE.md Normal file
@ -0,0 +1,255 @@
---
title: Resource Pressure Handling
category: Interfaces
layout: default
SPDX-License-Identifier: LGPL-2.1-or-later
---
# Resource Pressure Handling in systemd
On Linux, the [Pressure Stall Information
(PSI)](https://docs.kernel.org/accounting/psi.html) kernel interface provides
a way to monitor resource pressure, i.e. situations where tasks are stalled
waiting for a resource to become available. PSI covers three types of
resources:
* **Memory pressure**: tasks are stalled because the system is low on memory
and the kernel is busy reclaiming it (e.g. writing out pages to swap or
flushing file-backed pages).
* **CPU pressure**: tasks are stalled waiting for CPU time because the CPU is
oversubscribed.
* **IO pressure**: tasks are stalled waiting for IO operations to complete
because the IO subsystem is saturated.
PSI makes available to userspace a `poll()`-able file descriptor that gets
notifications whenever pressure latencies for the system or a control group
grow beyond some configured level.
When the system is under memory pressure, userspace can proactively release
memory it allocated but doesn't immediately require back to the kernel. This
includes allocation caches, and other forms of caches that are not required for
normal operation to continue. Similarly, when CPU or IO pressure is detected,
services can take appropriate action such as reducing parallelism, deferring
background work, or shedding load.
The effects of resource pressure during runtime generally are growing latencies
during operation: applications and services slow down until pressure is
relieved. Hence, to ensure stable service latencies it is essential to detect
pressure early and respond appropriately.
`systemd` itself makes use of PSI, and helps applications to do so too.
Specifically:
* Most of systemd's long running components watch for PSI memory pressure
events, and release allocation caches and other resources once seen.
* systemd's service manager provides a protocol for asking services to monitor
PSI events and configure the appropriate pressure thresholds, for memory, CPU,
and IO pressure independently.
* systemd's `sd-event` event loop API provides high-level calls
`sd_event_add_memory_pressure()`, `sd_event_add_cpu_pressure()`, and
`sd_event_add_io_pressure()` enabling programs using it to efficiently hook
into the PSI pressure protocol provided by the service manager, with very few
lines of code.
## Pressure Service Protocol
For each resource type, if pressure handling for a specific service is enabled
via the corresponding `*PressureWatch=` setting (i.e. `MemoryPressureWatch=`,
`CPUPressureWatch=`, or `IOPressureWatch=`), two environment variables are set
by the service manager:
* `$MEMORY_PRESSURE_WATCH` / `$CPU_PRESSURE_WATCH` / `$IO_PRESSURE_WATCH`
contains an absolute path in the file system to the file to watch for
pressure events. This will usually point to a PSI file such as the
`memory.pressure`, `cpu.pressure`, or `io.pressure` file of the service's
cgroup. In order to make debugging easier, and allow later extension it is
recommended for applications to also allow this path to refer to an `AF_UNIX`
stream socket in the file system or a FIFO inode in the file system.
Regardless of which of the three types of inodes this absolute path refers
to, all three are `poll()`-able for pressure events. The variable can also be
set to the literal string `/dev/null`. If so the service code should take this
as indication that pressure monitoring for this resource is not desired and
should be turned off.
* `$MEMORY_PRESSURE_WRITE` / `$CPU_PRESSURE_WRITE` / `$IO_PRESSURE_WRITE` is
optional. If set by the service manager it contains Base64 encoded data (that
may contain arbitrary binary values, including NUL bytes) that should be
written into the path provided via the corresponding `*_PRESSURE_WATCH`
variable right after opening it. Typically, if talking directly to a PSI
kernel file this will contain information about the threshold settings
configurable in the service manager.
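As an illustration of the decoding step, here is a minimal sketch of a standard (padded) Base64 decoder that a service could apply to the `*_PRESSURE_WRITE` value before writing it. The names `b64_value()` and `decode_pressure_write()` are hypothetical and not part of any systemd API; the sketch assumes the standard alphabet with `=` padding and no embedded whitespace.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Map one Base64 character to its 6-bit value, or -1 if it is not in the
 * standard alphabet. */
static int b64_value(char c) {
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

/* Hypothetical helper: decode padded Base64 into a malloc()ed buffer.
 * The result may contain NUL bytes, so the length is returned separately.
 * Returns NULL on empty or malformed input, or on allocation failure. */
static unsigned char *decode_pressure_write(const char *s, size_t *ret_size) {
    size_t len = strlen(s), n = 0;
    unsigned char *buf;

    if (len == 0 || len % 4 != 0)
        return NULL;

    buf = malloc(len / 4 * 3);
    if (!buf)
        return NULL;

    for (size_t i = 0; i < len; i += 4) {
        int v[4], pad = 0;

        for (int j = 0; j < 4; j++) {
            char c = s[i + j];

            /* Padding is only valid in the last two positions of the last group */
            if (c == '=' && i + 4 == len && j >= 2) {
                v[j] = 0;
                pad++;
                continue;
            }
            if (pad > 0 || (v[j] = b64_value(c)) < 0) {
                free(buf);
                return NULL;
            }
        }

        unsigned triple = ((unsigned) v[0] << 18) | ((unsigned) v[1] << 12) |
                          ((unsigned) v[2] << 6) | (unsigned) v[3];

        buf[n++] = (unsigned char) ((triple >> 16) & 0xFF);
        if (pad < 2)
            buf[n++] = (unsigned char) ((triple >> 8) & 0xFF);
        if (pad < 1)
            buf[n++] = (unsigned char) (triple & 0xFF);
    }

    *ret_size = n;
    return buf;
}
```

The decoded buffer would then be passed to `write()` together with the returned size, since `strlen()` would stop at the first NUL byte.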
The protocol works the same for all three resource types. The remainder of this
section uses memory pressure as the example, but the same logic applies to CPU
and IO pressure with the corresponding environment variable names.
When a service initializes it hence should look for
`$MEMORY_PRESSURE_WATCH`. If set, it should try to open the specified path. If
it detects the path to refer to a regular file it should assume it refers to a
PSI kernel file. If so, it should write the data from `$MEMORY_PRESSURE_WRITE`
into the file descriptor (after Base64-decoding it, and only if the variable is
set) and then watch for `POLLPRI` events on it. If it detects the path refers
to a FIFO inode, it should open it, write the `$MEMORY_PRESSURE_WRITE` data
into it (as above) and then watch for `POLLIN` events on it. Whenever `POLLIN`
is seen it should read and discard any data queued in the FIFO. If the path
refers to an `AF_UNIX` socket in the file system, the application should
`connect()` a stream socket to it, write `$MEMORY_PRESSURE_WRITE` into it (as
above) and watch for `POLLIN`, discarding any data it might receive.
To summarize:
* If `$MEMORY_PRESSURE_WATCH` points to a regular file: open and watch for
`POLLPRI`, never read from the file descriptor.
* If `$MEMORY_PRESSURE_WATCH` points to a FIFO: open and watch for `POLLIN`,
read/discard any incoming data.
* If `$MEMORY_PRESSURE_WATCH` points to an `AF_UNIX` socket: connect and watch
for `POLLIN`, read/discard any incoming data.
* If `$MEMORY_PRESSURE_WATCH` contains the literal string `/dev/null`, turn off
memory pressure handling.
(And in each case, immediately after opening/connecting to the path, write the
decoded `$MEMORY_PRESSURE_WRITE` data into it.)
Whenever a `POLLPRI`/`POLLIN` event is seen the service is under pressure. It
should use this as a hint to release suitable redundant resources, for example:
* glibc's memory allocation cache, via
[`malloc_trim()`](https://man7.org/linux/man-pages/man3/malloc_trim.3.html). Similarly,
allocation caches implemented in the service itself.
* Any other local caches, such as DNS caches, or web caches (in particular if
the service is a web browser).
* Terminate any idle worker threads or processes.
* Run a garbage collection (GC) cycle, if the runtime environment supports it.
* Terminate the process if it is idle and can be automatically started again
when it is needed next.
Which actions precisely to take depends on the service in question and the type
of pressure. Note that the notifications are delivered when resource latency
has already degraded beyond some point. Hence, when deciding which resources to
keep and which to discard, keep in mind that latencies incurred by recovering
discarded resources at a later point are typically acceptable, given that
latencies *already* are affected negatively.
In case the path supplied via `$MEMORY_PRESSURE_WATCH` points to a PSI kernel
API file, or to an `AF_UNIX` socket, opening it multiple times is safe and reliable,
and should deliver notifications to each of the opened file descriptors. This
is specifically useful for services that consist of multiple processes, and
where each of them shall be able to release resources on memory pressure.
The `POLLPRI`/`POLLIN` conditions will be triggered every time pressure is
detected, but not continuously. It is thus safe to keep `poll()`-ing on the
same file descriptor continuously, and executing resource release operations
whenever the file descriptor triggers without having to expect overloading the
process.
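A minimal sketch of one iteration of such a loop follows, assuming the descriptor was opened or connected as described above. The name `wait_for_pressure()` is hypothetical; a real service would typically call it repeatedly with an infinite timeout and release caches whenever it returns 1.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for one pressure notification.
 * 'events' is POLLPRI for a PSI file, POLLIN for a FIFO or socket.
 * Returns 1 if a notification arrived, 0 on timeout, -1 on error. */
static int wait_for_pressure(int fd, short events, int timeout_ms) {
    struct pollfd p = { .fd = fd, .events = events };
    int r;

    do
        r = poll(&p, 1, timeout_ms);
    while (r < 0 && errno == EINTR);

    if (r <= 0)
        return r;

    /* The event source is gone or broken, or only a hangup was reported */
    if ((p.revents & (POLLERR | POLLNVAL)) || !(p.revents & events)) {
        errno = EIO;
        return -1;
    }

    if (p.revents & POLLIN) {
        /* FIFO/socket case: discard queued data. One read() cannot block
         * here because poll() just reported the descriptor as readable. */
        char discard[4096];

        if (read(fd, discard, sizeof discard) < 0 &&
            errno != EAGAIN && errno != EINTR)
            return -1;
    }

    return 1;
}
```

For the `POLLPRI` case nothing is read, matching the rule that a PSI file descriptor is never read from.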
(Currently, the protocol defined here only allows configuration of a single
"degree" of pressure per resource type; no distinction is made on how strong
the pressure is. In the future, if it becomes apparent that there's a clear
need to extend this, we might eventually add different degrees, most likely by
adding additional environment variables such as `$MEMORY_PRESSURE_WRITE_LOW`
and `$MEMORY_PRESSURE_WRITE_HIGH` or similar, which may contain different
settings for lower or higher pressure thresholds.)
## Service Manager Settings
The service manager provides two per-service settings for each resource type
that control pressure handling:
* `MemoryPressureWatch=` / `CPUPressureWatch=` / `IOPressureWatch=` controls
whether to enable the pressure protocol for the respective resource type for
the service in question. See
[`systemd.resource-control(5)`](https://www.freedesktop.org/software/systemd/man/latest/systemd.resource-control.html#MemoryPressureWatch=)
for details.
* `MemoryPressureThresholdSec=` / `CPUPressureThresholdSec=` /
`IOPressureThresholdSec=` allows configuring the threshold when to signal
pressure to the services. It takes a time value (usually in the millisecond
range) that defines a threshold per 2s time window: if resource latencies grow
beyond this threshold notifications are generated towards the service,
requesting it to release resources.
The `/etc/systemd/system.conf` file provides two settings for each resource
type that may be used to select the default values for the above settings. If
the threshold is configured via neither the per-service nor the system-wide
option, it defaults to 200ms.
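To make the threshold and window concrete: the kernel PSI documentation describes a trigger as a line of the form `<some|full> <stall amount in us> <time window in us>`. The sketch below formats such a line for the default 200ms threshold and 2s window. That the decoded `*_PRESSURE_WRITE` data has exactly this shape, and whether it uses `some` or `full`, is an assumption here and not stated in this document; `format_psi_trigger()` is a hypothetical name.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format a PSI trigger line as described in the kernel
 * PSI documentation. 'kind' is "some" or "full"; both times are given in
 * microseconds. Returns the string length, or -1 if the buffer is too small. */
static int format_psi_trigger(char *buf, size_t size, const char *kind,
                              uint64_t threshold_usec, uint64_t window_usec) {
    int n = snprintf(buf, size, "%s %" PRIu64 " %" PRIu64,
                     kind, threshold_usec, window_usec);

    return n >= 0 && (size_t) n < size ? n : -1;
}
```

With a 200ms threshold and a 2s window, `format_psi_trigger(buf, sizeof buf, "some", 200000, 2000000)` produces the string `some 200000 2000000`.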
When pressure monitoring is enabled for a service this primarily does three
things:
* It enables the corresponding cgroup accounting for the service (this is a
requirement for per-cgroup PSI).
* It sets the aforementioned two environment variables for processes invoked
for the service, based on the control group of the service and provided
settings.
* The corresponding PSI control group file (`memory.pressure`, `cpu.pressure`,
or `io.pressure`) associated with the service's cgroup is delegated to the
service (i.e. permissions are relaxed so that unprivileged service payload
code can open the file for writing).
## Pressure Events in `sd-event`
The
[`sd-event`](https://www.freedesktop.org/software/systemd/man/latest/sd-event.html)
event loop library provides API calls that encapsulate the functionality
described above:
* [`sd_event_add_memory_pressure()`](https://www.freedesktop.org/software/systemd/man/latest/sd_event_add_memory_pressure.html),
`sd_event_add_cpu_pressure()`, and `sd_event_add_io_pressure()` implement the
service-side of the pressure protocol for each resource type and integrate it
with an `sd-event` event loop. Each reads the corresponding two environment
variables, connects/opens the specified file, writes the specified data to it,
then watches it for events.
* The `sd_event_trim_memory()` call may be called to trim the calling
process's memory. It's a wrapper around glibc's `malloc_trim()`, but first
releases allocation caches maintained by libsystemd internally. This function
serves as the default when a NULL callback is supplied to
`sd_event_add_memory_pressure()`. Note that the default handler for
`sd_event_add_cpu_pressure()` and `sd_event_add_io_pressure()` is a no-op;
a custom callback should be provided for CPU and IO pressure to take
meaningful action.
When implementing a service using `sd-event`, for automatic memory pressure
handling, it's typically sufficient to add a line such as:
```c
(void) sd_event_add_memory_pressure(event, NULL, NULL, NULL);
```
right after allocating the event loop object `event`. For CPU and IO pressure,
a custom handler should be provided to take appropriate action:
```c
(void) sd_event_add_cpu_pressure(event, NULL, my_cpu_pressure_handler, userdata);
(void) sd_event_add_io_pressure(event, NULL, my_io_pressure_handler, userdata);
```
## Other APIs
Other programming environments might have native APIs to watch memory
pressure/low memory events. Most notable is probably GLib's
[GMemoryMonitor](https://docs.gtk.org/gio/iface.MemoryMonitor.html). As of GLib
2.86.0, it uses the per-cgroup PSI kernel file to monitor for memory pressure,
but does not yet read the environment variables recommended above.
In older versions, it used the per-system Linux PSI interface as the backend,
but operated differently than the above: memory pressure events were picked up
by a system service, which then propagated them through D-Bus to the
applications. This was typically less than ideal, since each notification event
had to traverse three processes before being handled. This traversal created
additional latencies at a time when the system was already experiencing adverse
latencies. Moreover, it focused on system-wide PSI events, even though
service-local ones are generally the better approach.

@ -62,7 +62,7 @@
Note that this is a privileged option as, even if it has a timeout, is synchronous and delays the kill, Note that this is a privileged option as, even if it has a timeout, is synchronous and delays the kill,
so use with care. so use with care.
The typically preferable mechanism to process memory pressure is to do what The typically preferable mechanism to process memory pressure is to do what
<ulink url="https://systemd.io/MEMORY_PRESSURE/">MEMORY_PRESSURE</ulink> describes which is unprivileged, <ulink url="https://systemd.io/PRESSURE">Resource Pressure Handling</ulink> describes which is unprivileged,
asynchronous and does not delay the kill. asynchronous and does not delay the kill.
</para> </para>

@ -552,6 +552,14 @@ node /org/freedesktop/systemd1 {
readonly t DefaultMemoryPressureThresholdUSec = ...; readonly t DefaultMemoryPressureThresholdUSec = ...;
@org.freedesktop.DBus.Property.EmitsChangedSignal("false") @org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly s DefaultMemoryPressureWatch = '...'; readonly s DefaultMemoryPressureWatch = '...';
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly t DefaultCPUPressureThresholdUSec = ...;
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly s DefaultCPUPressureWatch = '...';
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly t DefaultIOPressureThresholdUSec = ...;
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly s DefaultIOPressureWatch = '...';
@org.freedesktop.DBus.Property.EmitsChangedSignal("const") @org.freedesktop.DBus.Property.EmitsChangedSignal("const")
readonly t TimerSlackNSec = ...; readonly t TimerSlackNSec = ...;
@org.freedesktop.DBus.Property.EmitsChangedSignal("const") @org.freedesktop.DBus.Property.EmitsChangedSignal("const")
@ -793,6 +801,14 @@ node /org/freedesktop/systemd1 {
<!--property DefaultMemoryPressureWatch is not documented!--> <!--property DefaultMemoryPressureWatch is not documented!-->
<!--property DefaultCPUPressureThresholdUSec is not documented!-->
<!--property DefaultCPUPressureWatch is not documented!-->
<!--property DefaultIOPressureThresholdUSec is not documented!-->
<!--property DefaultIOPressureWatch is not documented!-->
<!--property TimerSlackNSec is not documented!--> <!--property TimerSlackNSec is not documented!-->
<!--property DefaultOOMPolicy is not documented!--> <!--property DefaultOOMPolicy is not documented!-->
@ -1243,6 +1259,14 @@ node /org/freedesktop/systemd1 {
<variablelist class="dbus-property" generated="True" extra-ref="DefaultMemoryPressureWatch"/> <variablelist class="dbus-property" generated="True" extra-ref="DefaultMemoryPressureWatch"/>
<variablelist class="dbus-property" generated="True" extra-ref="DefaultCPUPressureThresholdUSec"/>
<variablelist class="dbus-property" generated="True" extra-ref="DefaultCPUPressureWatch"/>
<variablelist class="dbus-property" generated="True" extra-ref="DefaultIOPressureThresholdUSec"/>
<variablelist class="dbus-property" generated="True" extra-ref="DefaultIOPressureWatch"/>
<variablelist class="dbus-property" generated="True" extra-ref="TimerSlackNSec"/> <variablelist class="dbus-property" generated="True" extra-ref="TimerSlackNSec"/>
<variablelist class="dbus-property" generated="True" extra-ref="DefaultOOMPolicy"/> <variablelist class="dbus-property" generated="True" extra-ref="DefaultOOMPolicy"/>
@ -3066,6 +3090,14 @@ node /org/freedesktop/systemd1/unit/avahi_2ddaemon_2eservice {
@org.freedesktop.DBus.Property.EmitsChangedSignal("false") @org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly t MemoryPressureThresholdUSec = ...; readonly t MemoryPressureThresholdUSec = ...;
@org.freedesktop.DBus.Property.EmitsChangedSignal("false") @org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly s CPUPressureWatch = '...';
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly t CPUPressureThresholdUSec = ...;
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly s IOPressureWatch = '...';
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly t IOPressureThresholdUSec = ...;
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly a(iiss) NFTSet = [...]; readonly a(iiss) NFTSet = [...];
@org.freedesktop.DBus.Property.EmitsChangedSignal("false") @org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly b CoredumpReceive = ...; readonly b CoredumpReceive = ...;
@ -3735,6 +3767,14 @@ node /org/freedesktop/systemd1/unit/avahi_2ddaemon_2eservice {
<!--property MemoryPressureThresholdUSec is not documented!--> <!--property MemoryPressureThresholdUSec is not documented!-->
<!--property CPUPressureWatch is not documented!-->
<!--property CPUPressureThresholdUSec is not documented!-->
<!--property IOPressureWatch is not documented!-->
<!--property IOPressureThresholdUSec is not documented!-->
<!--property NFTSet is not documented!--> <!--property NFTSet is not documented!-->
<!--property CoredumpReceive is not documented!--> <!--property CoredumpReceive is not documented!-->
@ -4427,6 +4467,14 @@ node /org/freedesktop/systemd1/unit/avahi_2ddaemon_2eservice {
<variablelist class="dbus-property" generated="True" extra-ref="MemoryPressureThresholdUSec"/> <variablelist class="dbus-property" generated="True" extra-ref="MemoryPressureThresholdUSec"/>
<variablelist class="dbus-property" generated="True" extra-ref="CPUPressureWatch"/>
<variablelist class="dbus-property" generated="True" extra-ref="CPUPressureThresholdUSec"/>
<variablelist class="dbus-property" generated="True" extra-ref="IOPressureWatch"/>
<variablelist class="dbus-property" generated="True" extra-ref="IOPressureThresholdUSec"/>
<variablelist class="dbus-property" generated="True" extra-ref="NFTSet"/> <variablelist class="dbus-property" generated="True" extra-ref="NFTSet"/>
<variablelist class="dbus-property" generated="True" extra-ref="CoredumpReceive"/> <variablelist class="dbus-property" generated="True" extra-ref="CoredumpReceive"/>
@ -5326,6 +5374,14 @@ node /org/freedesktop/systemd1/unit/avahi_2ddaemon_2esocket {
@org.freedesktop.DBus.Property.EmitsChangedSignal("false") @org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly t MemoryPressureThresholdUSec = ...; readonly t MemoryPressureThresholdUSec = ...;
@org.freedesktop.DBus.Property.EmitsChangedSignal("false") @org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly s CPUPressureWatch = '...';
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly t CPUPressureThresholdUSec = ...;
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly s IOPressureWatch = '...';
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly t IOPressureThresholdUSec = ...;
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly a(iiss) NFTSet = [...]; readonly a(iiss) NFTSet = [...];
@org.freedesktop.DBus.Property.EmitsChangedSignal("false") @org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly b CoredumpReceive = ...; readonly b CoredumpReceive = ...;
@ -6011,6 +6067,14 @@ node /org/freedesktop/systemd1/unit/avahi_2ddaemon_2esocket {
<!--property MemoryPressureThresholdUSec is not documented!--> <!--property MemoryPressureThresholdUSec is not documented!-->
<!--property CPUPressureWatch is not documented!-->
<!--property CPUPressureThresholdUSec is not documented!-->
<!--property IOPressureWatch is not documented!-->
<!--property IOPressureThresholdUSec is not documented!-->
<!--property NFTSet is not documented!--> <!--property NFTSet is not documented!-->
<!--property CoredumpReceive is not documented!--> <!--property CoredumpReceive is not documented!-->
@ -6677,6 +6741,14 @@ node /org/freedesktop/systemd1/unit/avahi_2ddaemon_2esocket {
<variablelist class="dbus-property" generated="True" extra-ref="MemoryPressureThresholdUSec"/> <variablelist class="dbus-property" generated="True" extra-ref="MemoryPressureThresholdUSec"/>
<variablelist class="dbus-property" generated="True" extra-ref="CPUPressureWatch"/>
<variablelist class="dbus-property" generated="True" extra-ref="CPUPressureThresholdUSec"/>
<variablelist class="dbus-property" generated="True" extra-ref="IOPressureWatch"/>
<variablelist class="dbus-property" generated="True" extra-ref="IOPressureThresholdUSec"/>
<variablelist class="dbus-property" generated="True" extra-ref="NFTSet"/> <variablelist class="dbus-property" generated="True" extra-ref="NFTSet"/>
<variablelist class="dbus-property" generated="True" extra-ref="CoredumpReceive"/> <variablelist class="dbus-property" generated="True" extra-ref="CoredumpReceive"/>
@ -7399,6 +7471,14 @@ node /org/freedesktop/systemd1/unit/home_2emount {
@org.freedesktop.DBus.Property.EmitsChangedSignal("false") @org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly t MemoryPressureThresholdUSec = ...; readonly t MemoryPressureThresholdUSec = ...;
@org.freedesktop.DBus.Property.EmitsChangedSignal("false") @org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly s CPUPressureWatch = '...';
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly t CPUPressureThresholdUSec = ...;
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly s IOPressureWatch = '...';
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly t IOPressureThresholdUSec = ...;
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly a(iiss) NFTSet = [...]; readonly a(iiss) NFTSet = [...];
@org.freedesktop.DBus.Property.EmitsChangedSignal("false") @org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly b CoredumpReceive = ...; readonly b CoredumpReceive = ...;
@ -8008,6 +8088,14 @@ node /org/freedesktop/systemd1/unit/home_2emount {
<!--property MemoryPressureThresholdUSec is not documented!--> <!--property MemoryPressureThresholdUSec is not documented!-->
<!--property CPUPressureWatch is not documented!-->
<!--property CPUPressureThresholdUSec is not documented!-->
<!--property IOPressureWatch is not documented!-->
<!--property IOPressureThresholdUSec is not documented!-->
<!--property NFTSet is not documented!--> <!--property NFTSet is not documented!-->
<!--property CoredumpReceive is not documented!--> <!--property CoredumpReceive is not documented!-->
@ -8582,6 +8670,14 @@ node /org/freedesktop/systemd1/unit/home_2emount {
<variablelist class="dbus-property" generated="True" extra-ref="MemoryPressureThresholdUSec"/> <variablelist class="dbus-property" generated="True" extra-ref="MemoryPressureThresholdUSec"/>
<variablelist class="dbus-property" generated="True" extra-ref="CPUPressureWatch"/>
<variablelist class="dbus-property" generated="True" extra-ref="CPUPressureThresholdUSec"/>
<variablelist class="dbus-property" generated="True" extra-ref="IOPressureWatch"/>
<variablelist class="dbus-property" generated="True" extra-ref="IOPressureThresholdUSec"/>
<variablelist class="dbus-property" generated="True" extra-ref="NFTSet"/> <variablelist class="dbus-property" generated="True" extra-ref="NFTSet"/>
<variablelist class="dbus-property" generated="True" extra-ref="CoredumpReceive"/> <variablelist class="dbus-property" generated="True" extra-ref="CoredumpReceive"/>
@ -9437,6 +9533,14 @@ node /org/freedesktop/systemd1/unit/dev_2dsda3_2eswap {
@org.freedesktop.DBus.Property.EmitsChangedSignal("false") @org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly t MemoryPressureThresholdUSec = ...; readonly t MemoryPressureThresholdUSec = ...;
@org.freedesktop.DBus.Property.EmitsChangedSignal("false") @org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly s CPUPressureWatch = '...';
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly t CPUPressureThresholdUSec = ...;
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly s IOPressureWatch = '...';
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly t IOPressureThresholdUSec = ...;
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly a(iiss) NFTSet = [...]; readonly a(iiss) NFTSet = [...];
@org.freedesktop.DBus.Property.EmitsChangedSignal("false") @org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly b CoredumpReceive = ...; readonly b CoredumpReceive = ...;
@ -10028,6 +10132,14 @@ node /org/freedesktop/systemd1/unit/dev_2dsda3_2eswap {
<!--property MemoryPressureThresholdUSec is not documented!--> <!--property MemoryPressureThresholdUSec is not documented!-->
<!--property CPUPressureWatch is not documented!-->
<!--property CPUPressureThresholdUSec is not documented!-->
<!--property IOPressureWatch is not documented!-->
<!--property IOPressureThresholdUSec is not documented!-->
<!--property NFTSet is not documented!--> <!--property NFTSet is not documented!-->
<!--property CoredumpReceive is not documented!--> <!--property CoredumpReceive is not documented!-->
@ -10584,6 +10696,14 @@ node /org/freedesktop/systemd1/unit/dev_2dsda3_2eswap {
<variablelist class="dbus-property" generated="True" extra-ref="MemoryPressureThresholdUSec"/> <variablelist class="dbus-property" generated="True" extra-ref="MemoryPressureThresholdUSec"/>
<variablelist class="dbus-property" generated="True" extra-ref="CPUPressureWatch"/>
<variablelist class="dbus-property" generated="True" extra-ref="CPUPressureThresholdUSec"/>
<variablelist class="dbus-property" generated="True" extra-ref="IOPressureWatch"/>
<variablelist class="dbus-property" generated="True" extra-ref="IOPressureThresholdUSec"/>
<variablelist class="dbus-property" generated="True" extra-ref="NFTSet"/> <variablelist class="dbus-property" generated="True" extra-ref="NFTSet"/>
<variablelist class="dbus-property" generated="True" extra-ref="CoredumpReceive"/> <variablelist class="dbus-property" generated="True" extra-ref="CoredumpReceive"/>
@ -11292,6 +11412,14 @@ node /org/freedesktop/systemd1/unit/system_2eslice {
@org.freedesktop.DBus.Property.EmitsChangedSignal("false") @org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly t MemoryPressureThresholdUSec = ...; readonly t MemoryPressureThresholdUSec = ...;
@org.freedesktop.DBus.Property.EmitsChangedSignal("false") @org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly s CPUPressureWatch = '...';
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly t CPUPressureThresholdUSec = ...;
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly s IOPressureWatch = '...';
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly t IOPressureThresholdUSec = ...;
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly a(iiss) NFTSet = [...]; readonly a(iiss) NFTSet = [...];
@org.freedesktop.DBus.Property.EmitsChangedSignal("false") @org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly b CoredumpReceive = ...; readonly b CoredumpReceive = ...;
@ -11465,6 +11593,14 @@ node /org/freedesktop/systemd1/unit/system_2eslice {
<!--property MemoryPressureThresholdUSec is not documented!--> <!--property MemoryPressureThresholdUSec is not documented!-->
<!--property CPUPressureWatch is not documented!-->
<!--property CPUPressureThresholdUSec is not documented!-->
<!--property IOPressureWatch is not documented!-->
<!--property IOPressureThresholdUSec is not documented!-->
<!--property NFTSet is not documented!--> <!--property NFTSet is not documented!-->
<!--property CoredumpReceive is not documented!--> <!--property CoredumpReceive is not documented!-->
@ -11653,6 +11789,14 @@ node /org/freedesktop/systemd1/unit/system_2eslice {
<variablelist class="dbus-property" generated="True" extra-ref="MemoryPressureThresholdUSec"/> <variablelist class="dbus-property" generated="True" extra-ref="MemoryPressureThresholdUSec"/>
<variablelist class="dbus-property" generated="True" extra-ref="CPUPressureWatch"/>
<variablelist class="dbus-property" generated="True" extra-ref="CPUPressureThresholdUSec"/>
<variablelist class="dbus-property" generated="True" extra-ref="IOPressureWatch"/>
<variablelist class="dbus-property" generated="True" extra-ref="IOPressureThresholdUSec"/>
<variablelist class="dbus-property" generated="True" extra-ref="NFTSet"/> <variablelist class="dbus-property" generated="True" extra-ref="NFTSet"/>
<variablelist class="dbus-property" generated="True" extra-ref="CoredumpReceive"/> <variablelist class="dbus-property" generated="True" extra-ref="CoredumpReceive"/>
@ -11864,6 +12008,14 @@ node /org/freedesktop/systemd1/unit/session_2d1_2escope {
@org.freedesktop.DBus.Property.EmitsChangedSignal("false") @org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly t MemoryPressureThresholdUSec = ...; readonly t MemoryPressureThresholdUSec = ...;
@org.freedesktop.DBus.Property.EmitsChangedSignal("false") @org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly s CPUPressureWatch = '...';
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly t CPUPressureThresholdUSec = ...;
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly s IOPressureWatch = '...';
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly t IOPressureThresholdUSec = ...;
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly a(iiss) NFTSet = [...]; readonly a(iiss) NFTSet = [...];
@org.freedesktop.DBus.Property.EmitsChangedSignal("false") @org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly b CoredumpReceive = ...; readonly b CoredumpReceive = ...;
@ -12051,6 +12203,14 @@ node /org/freedesktop/systemd1/unit/session_2d1_2escope {
<!--property MemoryPressureThresholdUSec is not documented!--> <!--property MemoryPressureThresholdUSec is not documented!-->
<!--property CPUPressureWatch is not documented!-->
<!--property CPUPressureThresholdUSec is not documented!-->
<!--property IOPressureWatch is not documented!-->
<!--property IOPressureThresholdUSec is not documented!-->
<!--property NFTSet is not documented!--> <!--property NFTSet is not documented!-->
<!--property CoredumpReceive is not documented!--> <!--property CoredumpReceive is not documented!-->
@ -12263,6 +12423,14 @@ node /org/freedesktop/systemd1/unit/session_2d1_2escope {
<variablelist class="dbus-property" generated="True" extra-ref="MemoryPressureThresholdUSec"/> <variablelist class="dbus-property" generated="True" extra-ref="MemoryPressureThresholdUSec"/>
<variablelist class="dbus-property" generated="True" extra-ref="CPUPressureWatch"/>
<variablelist class="dbus-property" generated="True" extra-ref="CPUPressureThresholdUSec"/>
<variablelist class="dbus-property" generated="True" extra-ref="IOPressureWatch"/>
<variablelist class="dbus-property" generated="True" extra-ref="IOPressureThresholdUSec"/>
<variablelist class="dbus-property" generated="True" extra-ref="NFTSet"/> <variablelist class="dbus-property" generated="True" extra-ref="NFTSet"/>
<variablelist class="dbus-property" generated="True" extra-ref="CoredumpReceive"/> <variablelist class="dbus-property" generated="True" extra-ref="CoredumpReceive"/>
@ -12475,7 +12643,11 @@ $ gdbus introspect --system --dest org.freedesktop.systemd1 \
<function>RemoveSubgroupFromUnit()</function>, and <function>RemoveSubgroupFromUnit()</function>, and
<function>KillUnitSubgroup()</function> were added in version 258.</para> <function>KillUnitSubgroup()</function> were added in version 258.</para>
<para><varname>TransactionsWithOrderingCycle</varname> was added in version 259.</para> <para><varname>TransactionsWithOrderingCycle</varname> was added in version 259.</para>
<para><varname>DefaultMemoryZSwapWriteback</varname> was added in version 261.</para> <para><varname>DefaultMemoryZSwapWriteback</varname>,
<varname>DefaultCPUPressureThresholdUSec</varname>,
<varname>DefaultCPUPressureWatch</varname>,
<varname>DefaultIOPressureThresholdUSec</varname>, and
<varname>DefaultIOPressureWatch</varname> were added in version 261.</para>
</refsect2> </refsect2>
<refsect2> <refsect2>
<title>Unit Objects</title> <title>Unit Objects</title>
@ -12567,6 +12739,10 @@ $ gdbus introspect --system --dest org.freedesktop.systemd1 \
<varname>ExecReloadPostEx</varname> were added in version 259.</para> <varname>ExecReloadPostEx</varname> were added in version 259.</para>
<para><varname>BindNetworkInterface</varname>, <varname>MemoryTHP</varname>, <para><varname>BindNetworkInterface</varname>, <varname>MemoryTHP</varname>,
<varname>RefreshOnReload</varname>, and <varname>RootMStack</varname> were added in version 260.</para> <varname>RefreshOnReload</varname>, and <varname>RootMStack</varname> were added in version 260.</para>
<para><varname>CPUPressureThresholdUSec</varname>,
<varname>CPUPressureWatch</varname>,
<varname>IOPressureThresholdUSec</varname>, and
<varname>IOPressureWatch</varname> were added in version 261.</para>
</refsect2> </refsect2>
<refsect2> <refsect2>
<title>Socket Unit Objects</title> <title>Socket Unit Objects</title>
@ -12637,6 +12813,10 @@ $ gdbus introspect --system --dest org.freedesktop.systemd1 \
<varname>ManagedOOMKills</varname> were added in 259.</para> <varname>ManagedOOMKills</varname> were added in 259.</para>
<para><varname>BindNetworkInterface</varname> <varname>MemoryTHP</varname>, and <para><varname>BindNetworkInterface</varname> <varname>MemoryTHP</varname>, and
<varname>RootMStack</varname> were added in version 260.</para> <varname>RootMStack</varname> were added in version 260.</para>
<para><varname>CPUPressureThresholdUSec</varname>,
<varname>CPUPressureWatch</varname>,
<varname>IOPressureThresholdUSec</varname>, and
<varname>IOPressureWatch</varname> were added in version 261.</para>
</refsect2> </refsect2>
<refsect2> <refsect2>
<title>Mount Unit Objects</title> <title>Mount Unit Objects</title>
@ -12702,6 +12882,10 @@ $ gdbus introspect --system --dest org.freedesktop.systemd1 \
<varname>ManagedOOMKills</varname> were added in 259.</para> <varname>ManagedOOMKills</varname> were added in 259.</para>
<para><varname>BindNetworkInterface</varname> <varname>MemoryTHP</varname>, and <para><varname>BindNetworkInterface</varname> <varname>MemoryTHP</varname>, and
<varname>RootMStack</varname> were added in version 260.</para> <varname>RootMStack</varname> were added in version 260.</para>
<para><varname>CPUPressureThresholdUSec</varname>,
<varname>CPUPressureWatch</varname>,
<varname>IOPressureThresholdUSec</varname>, and
<varname>IOPressureWatch</varname> were added in version 261.</para>
</refsect2> </refsect2>
<refsect2> <refsect2>
<title>Swap Unit Objects</title> <title>Swap Unit Objects</title>
@ -12765,6 +12949,10 @@ $ gdbus introspect --system --dest org.freedesktop.systemd1 \
<varname>ManagedOOMKills</varname> were added in 259.</para> <varname>ManagedOOMKills</varname> were added in 259.</para>
<para><varname>BindNetworkInterface</varname>, <varname>MemoryTHP</varname>, and <para><varname>BindNetworkInterface</varname>, <varname>MemoryTHP</varname>, and
<varname>RootMStack</varname> were added in version 260.</para> <varname>RootMStack</varname> were added in version 260.</para>
<para><varname>CPUPressureThresholdUSec</varname>,
<varname>CPUPressureWatch</varname>,
<varname>IOPressureThresholdUSec</varname>, and
<varname>IOPressureWatch</varname> were added in version 261.</para>
</refsect2> </refsect2>
<refsect2> <refsect2>
<title>Slice Unit Objects</title> <title>Slice Unit Objects</title>
@ -12798,6 +12986,10 @@ $ gdbus introspect --system --dest org.freedesktop.systemd1 \
<para><varname>OOMKills</varname>, and <para><varname>OOMKills</varname>, and
<varname>ManagedOOMKills</varname> were added in 259.</para> <varname>ManagedOOMKills</varname> were added in 259.</para>
<para><varname>BindNetworkInterface</varname> was added in version 260.</para> <para><varname>BindNetworkInterface</varname> was added in version 260.</para>
<para><varname>CPUPressureThresholdUSec</varname>,
<varname>CPUPressureWatch</varname>,
<varname>IOPressureThresholdUSec</varname>, and
<varname>IOPressureWatch</varname> were added in version 261.</para>
</refsect2> </refsect2>
<refsect2> <refsect2>
<title>Scope Unit Objects</title> <title>Scope Unit Objects</title>
@ -12829,6 +13021,10 @@ $ gdbus introspect --system --dest org.freedesktop.systemd1 \
<para><varname>OOMKills</varname>, and <para><varname>OOMKills</varname>, and
<varname>ManagedOOMKills</varname> were added in 259.</para> <varname>ManagedOOMKills</varname> were added in 259.</para>
<para><varname>BindNetworkInterface</varname> was added in version 260.</para> <para><varname>BindNetworkInterface</varname> was added in version 260.</para>
<para><varname>CPUPressureThresholdUSec</varname>,
<varname>CPUPressureWatch</varname>,
<varname>IOPressureThresholdUSec</varname>, and
<varname>IOPressureWatch</varname> were added in version 261.</para>
</refsect2> </refsect2>
<refsect2> <refsect2>
<title>Job Objects</title> <title>Job Objects</title>


@ -608,7 +608,13 @@ manpages = [
''], ''],
['sd_event_add_memory_pressure', ['sd_event_add_memory_pressure',
'3', '3',
['sd_event_source_set_memory_pressure_period', ['sd_event_add_cpu_pressure',
'sd_event_add_io_pressure',
'sd_event_source_set_cpu_pressure_period',
'sd_event_source_set_cpu_pressure_type',
'sd_event_source_set_io_pressure_period',
'sd_event_source_set_io_pressure_type',
'sd_event_source_set_memory_pressure_period',
'sd_event_source_set_memory_pressure_type', 'sd_event_source_set_memory_pressure_type',
'sd_event_trim_memory'], 'sd_event_trim_memory'],
''], ''],


@ -21,7 +21,15 @@
<refname>sd_event_source_set_memory_pressure_period</refname> <refname>sd_event_source_set_memory_pressure_period</refname>
<refname>sd_event_trim_memory</refname> <refname>sd_event_trim_memory</refname>
<refpurpose>Add and configure an event source run as result of memory pressure</refpurpose> <refname>sd_event_add_cpu_pressure</refname>
<refname>sd_event_source_set_cpu_pressure_type</refname>
<refname>sd_event_source_set_cpu_pressure_period</refname>
<refname>sd_event_add_io_pressure</refname>
<refname>sd_event_source_set_io_pressure_type</refname>
<refname>sd_event_source_set_io_pressure_period</refname>
<refpurpose>Add and configure an event source for memory, CPU, or IO pressure notifications</refpurpose>
</refnamediv> </refnamediv>
<refsynopsisdiv> <refsynopsisdiv>
@ -51,6 +59,48 @@
<paramdef>uint64_t <parameter>window_usec</parameter></paramdef> <paramdef>uint64_t <parameter>window_usec</parameter></paramdef>
</funcprototype> </funcprototype>
<funcprototype>
<funcdef>int <function>sd_event_add_cpu_pressure</function></funcdef>
<paramdef>sd_event *<parameter>event</parameter></paramdef>
<paramdef>sd_event_source **<parameter>ret_source</parameter></paramdef>
<paramdef>sd_event_handler_t <parameter>handler</parameter></paramdef>
<paramdef>void *<parameter>userdata</parameter></paramdef>
</funcprototype>
<funcprototype>
<funcdef>int <function>sd_event_source_set_cpu_pressure_type</function></funcdef>
<paramdef>sd_event_source *<parameter>source</parameter></paramdef>
<paramdef>const char *<parameter>type</parameter></paramdef>
</funcprototype>
<funcprototype>
<funcdef>int <function>sd_event_source_set_cpu_pressure_period</function></funcdef>
<paramdef>sd_event_source *<parameter>source</parameter></paramdef>
<paramdef>uint64_t <parameter>threshold_usec</parameter></paramdef>
<paramdef>uint64_t <parameter>window_usec</parameter></paramdef>
</funcprototype>
<funcprototype>
<funcdef>int <function>sd_event_add_io_pressure</function></funcdef>
<paramdef>sd_event *<parameter>event</parameter></paramdef>
<paramdef>sd_event_source **<parameter>ret_source</parameter></paramdef>
<paramdef>sd_event_handler_t <parameter>handler</parameter></paramdef>
<paramdef>void *<parameter>userdata</parameter></paramdef>
</funcprototype>
<funcprototype>
<funcdef>int <function>sd_event_source_set_io_pressure_type</function></funcdef>
<paramdef>sd_event_source *<parameter>source</parameter></paramdef>
<paramdef>const char *<parameter>type</parameter></paramdef>
</funcprototype>
<funcprototype>
<funcdef>int <function>sd_event_source_set_io_pressure_period</function></funcdef>
<paramdef>sd_event_source *<parameter>source</parameter></paramdef>
<paramdef>uint64_t <parameter>threshold_usec</parameter></paramdef>
<paramdef>uint64_t <parameter>window_usec</parameter></paramdef>
</funcprototype>
<funcprototype> <funcprototype>
<funcdef>int <function>sd_event_trim_memory</function></funcdef> <funcdef>int <function>sd_event_trim_memory</function></funcdef>
<paramdef>void</paramdef> <paramdef>void</paramdef>
@ -62,18 +112,25 @@
<title>Description</title> <title>Description</title>
<para><function>sd_event_add_memory_pressure()</function> adds a new event source that is triggered <para><function>sd_event_add_memory_pressure()</function> adds a new event source that is triggered
whenever memory pressure is seen. This functionality is built around the Linux kernel's <ulink whenever memory pressure is seen. Similarly,
<function>sd_event_add_cpu_pressure()</function> and <function>sd_event_add_io_pressure()</function> add
new event sources that are triggered whenever CPU or IO pressure is seen, respectively. This functionality
is built around the Linux kernel's <ulink
url="https://docs.kernel.org/accounting/psi.html">Pressure Stall Information (PSI)</ulink> logic.</para> url="https://docs.kernel.org/accounting/psi.html">Pressure Stall Information (PSI)</ulink> logic.</para>
<para>Expects an event loop object as first parameter, and returns the allocated event source object in <para>These functions expect an event loop object as first parameter, and return the allocated event source
the second parameter, on success. The <parameter>handler</parameter> parameter is a function to call when object in the second parameter, on success. The <parameter>handler</parameter> parameter is a function to
memory pressure is seen, or <constant>NULL</constant>. The handler function will be passed the call when pressure is seen, or <constant>NULL</constant>. The handler function will be passed the
<parameter>userdata</parameter> pointer, which may be chosen freely by the caller. The handler may return <parameter>userdata</parameter> pointer, which may be chosen freely by the caller. The handler may return
negative to signal an error (see below), other return values are ignored. If negative to signal an error (see below), other return values are ignored. If
<parameter>handler</parameter> is <constant>NULL</constant>, a default handler that compacts allocation <parameter>handler</parameter> is <constant>NULL</constant>, a default handler is used. For
caches maintained by <filename>libsystemd</filename> as well as glibc (via <citerefentry <function>sd_event_add_memory_pressure()</function>, the default handler compacts allocation caches
project='man-pages'><refentrytitle>malloc_trim</refentrytitle><manvolnum>3</manvolnum></citerefentry>) maintained by <filename>libsystemd</filename> as well as glibc (via <citerefentry
will be used.</para> project='man-pages'><refentrytitle>malloc_trim</refentrytitle><manvolnum>3</manvolnum></citerefentry>).
For <function>sd_event_add_cpu_pressure()</function> and
<function>sd_event_add_io_pressure()</function>, the default handler is a no-op. It is recommended to
pass a custom handler for CPU and IO pressure to take meaningful action when pressure is
detected.</para>
<para>To destroy an event source object use <para>To destroy an event source object use
<citerefentry><refentrytitle>sd_event_source_unref</refentrytitle><manvolnum>3</manvolnum></citerefentry>, <citerefentry><refentrytitle>sd_event_source_unref</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
@ -83,12 +140,13 @@
<citerefentry><refentrytitle>sd_event_source_set_enabled</refentrytitle><manvolnum>3</manvolnum></citerefentry> <citerefentry><refentrytitle>sd_event_source_set_enabled</refentrytitle><manvolnum>3</manvolnum></citerefentry>
with <constant>SD_EVENT_OFF</constant>.</para> with <constant>SD_EVENT_OFF</constant>.</para>
<para>If the second parameter of <function>sd_event_add_memory_pressure()</function> is <para>If the second parameter of <function>sd_event_add_memory_pressure()</function>,
<function>sd_event_add_cpu_pressure()</function>, or <function>sd_event_add_io_pressure()</function> is
<constant>NULL</constant> no reference to the event source object is returned. In this case, the event <constant>NULL</constant> no reference to the event source object is returned. In this case, the event
source is considered "floating", and will be destroyed implicitly when the event loop itself is source is considered "floating", and will be destroyed implicitly when the event loop itself is
destroyed.</para> destroyed.</para>
<para>The event source will fire according to the following logic:</para> <para>The memory pressure event source will fire according to the following logic:</para>
<orderedlist> <orderedlist>
<listitem><para>If the <listitem><para>If the
@ -111,6 +169,18 @@
<filename>/proc/pressure/memory</filename> is watched instead.</para></listitem> <filename>/proc/pressure/memory</filename> is watched instead.</para></listitem>
</orderedlist> </orderedlist>
<para>The CPU pressure event source follows the same logic, but uses the
<varname>$CPU_PRESSURE_WATCH</varname>/<varname>$CPU_PRESSURE_WRITE</varname> environment variables,
the <filename>cpu.pressure</filename> cgroup file, and the system-wide PSI interface file
<filename>/proc/pressure/cpu</filename> instead. Note that <filename>/proc/pressure/cpu</filename> only
provides the <literal>some</literal> line, not the <literal>full</literal> line, so only
<literal>some</literal> is valid when watching at the system level.</para>
<para>The IO pressure event source follows the same logic, but uses the
<varname>$IO_PRESSURE_WATCH</varname>/<varname>$IO_PRESSURE_WRITE</varname> environment variables,
the <filename>io.pressure</filename> cgroup file, and the system-wide PSI interface file
<filename>/proc/pressure/io</filename> instead.</para>
<para>Or in other words: preferably any explicit configuration passed in by an invoking service manager <para>Or in other words: preferably any explicit configuration passed in by an invoking service manager
(or similar) is used as notification source, before falling back to local notifications of the service, (or similar) is used as notification source, before falling back to local notifications of the service,
and finally to global notifications of the system.</para> and finally to global notifications of the system.</para>
@ -143,7 +213,7 @@
<para>The <function>sd_event_source_set_memory_pressure_type()</function> and <para>The <function>sd_event_source_set_memory_pressure_type()</function> and
<function>sd_event_source_set_memory_pressure_period()</function> functions can be used to fine-tune the <function>sd_event_source_set_memory_pressure_period()</function> functions can be used to fine-tune the
PSI parameters for pressure notifications. The former takes either <literal>some</literal>, PSI parameters for memory pressure notifications. The former takes either <literal>some</literal>,
<literal>full</literal> as second parameter, the latter takes threshold and period times in microseconds <literal>full</literal> as second parameter, the latter takes threshold and period times in microseconds
as parameters. For details about these three parameters see the PSI documentation. Note that these two as parameters. For details about these three parameters see the PSI documentation. Note that these two
calls must be invoked immediately after allocating the event source, as they must be configured before calls must be invoked immediately after allocating the event source, as they must be configured before
@ -152,6 +222,19 @@
environment variables (or in other words: configuration supplied by a service manager wins over internal environment variables (or in other words: configuration supplied by a service manager wins over internal
settings).</para> settings).</para>
<para>Similarly, <function>sd_event_source_set_cpu_pressure_type()</function> and
<function>sd_event_source_set_cpu_pressure_period()</function> can be used to fine-tune the PSI
parameters for CPU pressure notifications, and
<function>sd_event_source_set_io_pressure_type()</function> and
<function>sd_event_source_set_io_pressure_period()</function> can be used to fine-tune the PSI
parameters for IO pressure notifications. They work identically to their memory pressure counterparts.
The type parameter takes either <literal>some</literal> or <literal>full</literal>, and the period
function takes threshold and period times in microseconds. The same constraints apply: these calls must
be invoked immediately after allocating the event source, and will fail if pressure parameterization
has been passed in via the corresponding
<varname>$*_PRESSURE_WATCH</varname>/<varname>$*_PRESSURE_WRITE</varname> environment
variables.</para>
<para>The <function>sd_event_trim_memory()</function> function releases various internal allocation <para>The <function>sd_event_trim_memory()</function> function releases various internal allocation
caches maintained by <filename>libsystemd</filename> and then invokes glibc's <citerefentry caches maintained by <filename>libsystemd</filename> and then invokes glibc's <citerefentry
project='man-pages'><refentrytitle>malloc_trim</refentrytitle><manvolnum>3</manvolnum></citerefentry>. This project='man-pages'><refentrytitle>malloc_trim</refentrytitle><manvolnum>3</manvolnum></citerefentry>. This
@ -161,7 +244,7 @@
<constant>LOG_DEBUG</constant> level (with message ID f9b0be465ad540d0850ad32172d57c21) about the memory <constant>LOG_DEBUG</constant> level (with message ID f9b0be465ad540d0850ad32172d57c21) about the memory
pressure operation.</para> pressure operation.</para>
<para>For further details see <ulink url="https://systemd.io/MEMORY_PRESSURE">Memory Pressure Handling in <para>For further details see <ulink url="https://systemd.io/PRESSURE">Resource Pressure Handling in
systemd</ulink>.</para> systemd</ulink>.</para>
</refsect1> </refsect1>
@ -197,8 +280,9 @@
<varlistentry> <varlistentry>
<term><constant>-EHOSTDOWN</constant></term> <term><constant>-EHOSTDOWN</constant></term>
<listitem><para>The <varname>$MEMORY_PRESSURE_WATCH</varname> variable has been set to the literal <listitem><para>The <varname>$MEMORY_PRESSURE_WATCH</varname>,
string <filename>/dev/null</filename>, in order to explicitly disable memory pressure <varname>$CPU_PRESSURE_WATCH</varname>, or <varname>$IO_PRESSURE_WATCH</varname> variable has been
set to the literal string <filename>/dev/null</filename>, in order to explicitly disable pressure
handling.</para> handling.</para>
<xi:include href="version-info.xml" xpointer="v254"/></listitem> <xi:include href="version-info.xml" xpointer="v254"/></listitem>
@ -207,8 +291,9 @@
<varlistentry> <varlistentry>
<term><constant>-EBADMSG</constant></term> <term><constant>-EBADMSG</constant></term>
<listitem><para>The <varname>$MEMORY_PRESSURE_WATCH</varname> variable has been set to an invalid <listitem><para>The <varname>$MEMORY_PRESSURE_WATCH</varname>,
string, for example a relative rather than an absolute path.</para> <varname>$CPU_PRESSURE_WATCH</varname>, or <varname>$IO_PRESSURE_WATCH</varname> variable has been
set to an invalid string, for example a relative rather than an absolute path.</para>
<xi:include href="version-info.xml" xpointer="v254"/></listitem> <xi:include href="version-info.xml" xpointer="v254"/></listitem>
</varlistentry> </varlistentry>
@ -216,8 +301,9 @@
<varlistentry> <varlistentry>
<term><constant>-ENOTTY</constant></term> <term><constant>-ENOTTY</constant></term>
<listitem><para>The <varname>$MEMORY_PRESSURE_WATCH</varname> variable points to a regular file <listitem><para>The <varname>$MEMORY_PRESSURE_WATCH</varname>,
outside of the procfs or cgroupfs file systems.</para> <varname>$CPU_PRESSURE_WATCH</varname>, or <varname>$IO_PRESSURE_WATCH</varname> variable points
to a regular file outside of the procfs or cgroupfs file systems.</para>
<xi:include href="version-info.xml" xpointer="v254"/></listitem> <xi:include href="version-info.xml" xpointer="v254"/></listitem>
</varlistentry> </varlistentry>
@ -225,7 +311,8 @@
<varlistentry> <varlistentry>
<term><constant>-EOPNOTSUPP</constant></term> <term><constant>-EOPNOTSUPP</constant></term>
<listitem><para>No configuration via <varname>$MEMORY_PRESSURE_WATCH</varname> has been specified <listitem><para>No configuration via <varname>$MEMORY_PRESSURE_WATCH</varname>,
<varname>$CPU_PRESSURE_WATCH</varname>, or <varname>$IO_PRESSURE_WATCH</varname> has been specified
and the local kernel does not support the PSI interface.</para> and the local kernel does not support the PSI interface.</para>
<xi:include href="version-info.xml" xpointer="v254"/></listitem> <xi:include href="version-info.xml" xpointer="v254"/></listitem>
@ -234,8 +321,12 @@
<varlistentry> <varlistentry>
<term><constant>-EBUSY</constant></term> <term><constant>-EBUSY</constant></term>
<listitem><para>This is returned by <function>sd_event_source_set_memory_pressure_type()</function> <listitem><para>This is returned by <function>sd_event_source_set_memory_pressure_type()</function>,
and <function>sd_event_source_set_memory_pressure_period()</function> if invoked on event sources <function>sd_event_source_set_memory_pressure_period()</function>,
<function>sd_event_source_set_cpu_pressure_type()</function>,
<function>sd_event_source_set_cpu_pressure_period()</function>,
<function>sd_event_source_set_io_pressure_type()</function>,
and <function>sd_event_source_set_io_pressure_period()</function> if invoked on event sources
at a time later than immediately after allocating them.</para> at a time later than immediately after allocating them.</para>
<xi:include href="version-info.xml" xpointer="v254"/></listitem> <xi:include href="version-info.xml" xpointer="v254"/></listitem>
@ -277,6 +368,12 @@
<function>sd_event_source_set_memory_pressure_type()</function>, <function>sd_event_source_set_memory_pressure_type()</function>,
<function>sd_event_source_set_memory_pressure_period()</function>, and <function>sd_event_source_set_memory_pressure_period()</function>, and
<function>sd_event_trim_memory()</function> were added in version 254.</para> <function>sd_event_trim_memory()</function> were added in version 254.</para>
<para><function>sd_event_add_cpu_pressure()</function>,
<function>sd_event_source_set_cpu_pressure_type()</function>,
<function>sd_event_source_set_cpu_pressure_period()</function>,
<function>sd_event_add_io_pressure()</function>,
<function>sd_event_source_set_io_pressure_type()</function>, and
<function>sd_event_source_set_io_pressure_period()</function> were added in version 261.</para>
</refsect1> </refsect1>
<refsect1> <refsect1>

@ -326,6 +326,34 @@
<xi:include href="version-info.xml" xpointer="v254"/></listitem> <xi:include href="version-info.xml" xpointer="v254"/></listitem>
</varlistentry> </varlistentry>
<varlistentry>
<term><varname>DefaultCPUPressureWatch=</varname></term>
<term><varname>DefaultCPUPressureThresholdSec=</varname></term>
<listitem><para>Configures the default settings for the per-unit
<varname>CPUPressureWatch=</varname> and <varname>CPUPressureThresholdSec=</varname>
settings. See
<citerefentry><refentrytitle>systemd.resource-control</refentrytitle><manvolnum>5</manvolnum></citerefentry>
for details. Defaults to <literal>auto</literal> and <literal>200ms</literal>, respectively. This
also sets the CPU pressure monitoring threshold for the service manager itself.</para>
<xi:include href="version-info.xml" xpointer="v261"/></listitem>
</varlistentry>
<varlistentry>
<term><varname>DefaultIOPressureWatch=</varname></term>
<term><varname>DefaultIOPressureThresholdSec=</varname></term>
<listitem><para>Configures the default settings for the per-unit
<varname>IOPressureWatch=</varname> and <varname>IOPressureThresholdSec=</varname>
settings. See
<citerefentry><refentrytitle>systemd.resource-control</refentrytitle><manvolnum>5</manvolnum></citerefentry>
for details. Defaults to <literal>auto</literal> and <literal>200ms</literal>, respectively. This
also sets the IO pressure monitoring threshold for the service manager itself.</para>
<xi:include href="version-info.xml" xpointer="v261"/></listitem>
</varlistentry>
</variablelist> </variablelist>
</refsect1> </refsect1>

@ -4698,13 +4698,37 @@ StandardInputData=V2XigLJyZSBubyBzdHJhbmdlcnMgdG8gbG92ZQpZb3Uga25vdyB0aGUgcnVsZX
<term><varname>$MEMORY_PRESSURE_WRITE</varname></term> <term><varname>$MEMORY_PRESSURE_WRITE</varname></term>
<listitem><para>If memory pressure monitoring is enabled for this service unit, the path to watch <listitem><para>If memory pressure monitoring is enabled for this service unit, the path to watch
and the data to write into it. See <ulink url="https://systemd.io/MEMORY_PRESSURE">Memory Pressure and the data to write into it. See <ulink url="https://systemd.io/PRESSURE">Resource Pressure
Handling</ulink> for details about these variables and the service protocol data they Handling</ulink> for details about these variables and the service protocol data they
convey.</para> convey.</para>
<xi:include href="version-info.xml" xpointer="v254"/></listitem> <xi:include href="version-info.xml" xpointer="v254"/></listitem>
</varlistentry> </varlistentry>
<varlistentry>
<term><varname>$CPU_PRESSURE_WATCH</varname></term>
<term><varname>$CPU_PRESSURE_WRITE</varname></term>
<listitem><para>If CPU pressure monitoring is enabled for this service unit, the path to watch
and the data to write into it. See <ulink url="https://systemd.io/PRESSURE">Resource Pressure
Handling</ulink> for details about these variables and the service protocol data they
convey.</para>
<xi:include href="version-info.xml" xpointer="v261"/></listitem>
</varlistentry>
<varlistentry>
<term><varname>$IO_PRESSURE_WATCH</varname></term>
<term><varname>$IO_PRESSURE_WRITE</varname></term>
<listitem><para>If IO pressure monitoring is enabled for this service unit, the path to watch
and the data to write into it. See <ulink url="https://systemd.io/PRESSURE">Resource Pressure
Handling</ulink> for details about these variables and the service protocol data they
convey.</para>
<xi:include href="version-info.xml" xpointer="v261"/></listitem>
</varlistentry>
<varlistentry> <varlistentry>
<term><varname>$FDSTORE</varname></term> <term><varname>$FDSTORE</varname></term>

@ -1628,7 +1628,7 @@ DeviceAllow=/dev/loop-control
<para>Note that services are free to use the two environment variables, but it is unproblematic if <para>Note that services are free to use the two environment variables, but it is unproblematic if
they ignore them. Memory pressure handling must be implemented individually in each service, and they ignore them. Memory pressure handling must be implemented individually in each service, and
usually means different things for different software. For further details on memory pressure usually means different things for different software. For further details on memory pressure
handling see <ulink url="https://systemd.io/MEMORY_PRESSURE">Memory Pressure Handling in handling see <ulink url="https://systemd.io/PRESSURE">Resource Pressure Handling in
systemd</ulink>.</para> systemd</ulink>.</para>
<para>Services implemented using <para>Services implemented using
@ -1657,6 +1657,104 @@ DeviceAllow=/dev/loop-control
<xi:include href="version-info.xml" xpointer="v254"/></listitem> <xi:include href="version-info.xml" xpointer="v254"/></listitem>
</varlistentry> </varlistentry>
<varlistentry>
<term><varname>CPUPressureWatch=</varname></term>
<listitem><para>Controls CPU pressure monitoring for invoked processes. Takes a boolean or one of
<literal>auto</literal> and <literal>skip</literal>. If <literal>no</literal>, tells the service not
to watch for CPU pressure events, by setting the <varname>$CPU_PRESSURE_WATCH</varname>
environment variable to the literal string <filename>/dev/null</filename>. If <literal>yes</literal>,
tells the service to watch for CPU pressure events. This ensures the
<filename>cpu.pressure</filename> cgroup attribute file is accessible for
reading and writing by the service's user. It then sets the <varname>$CPU_PRESSURE_WATCH</varname>
environment variable for processes invoked by the unit to the file system path to this file. The
threshold information configured with <varname>CPUPressureThresholdSec=</varname> is encoded in
the <varname>$CPU_PRESSURE_WRITE</varname> environment variable. If the <literal>auto</literal>
value is set the protocol is enabled if CPU resource controls are configured for the unit (e.g. because
<varname>CPUWeight=</varname> or <varname>CPUQuota=</varname> is set), and
disabled otherwise. If set to <literal>skip</literal> the logic is neither enabled, nor disabled and
the two environment variables are not set.</para>
<para>Note that services are free to use the two environment variables, but it is unproblematic if
they ignore them. CPU pressure handling must be implemented individually in each service, and
usually means different things for different software.</para>
<para>Services implemented using
<citerefentry><refentrytitle>sd-event</refentrytitle><manvolnum>3</manvolnum></citerefentry> may use
<citerefentry><refentrytitle>sd_event_add_cpu_pressure</refentrytitle><manvolnum>3</manvolnum></citerefentry>
to watch for and handle CPU pressure events.</para>
<para>If not explicitly set, defaults to the <varname>DefaultCPUPressureWatch=</varname> setting in
<citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>.</para>
<xi:include href="version-info.xml" xpointer="v261"/></listitem>
</varlistentry>
<varlistentry>
<term><varname>CPUPressureThresholdSec=</varname></term>
<listitem><para>Sets the CPU pressure threshold time for the CPU pressure monitor configured via
<varname>CPUPressureWatch=</varname>. Specifies the maximum CPU stall time before a CPU
pressure event is signalled to the service, per 2s window. If not specified, defaults to the
<varname>DefaultCPUPressureThresholdSec=</varname> setting in
<citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>
(which in turn defaults to 200ms). The specified value expects a time unit such as
<literal>ms</literal> or <literal>μs</literal>, see
<citerefentry><refentrytitle>systemd.time</refentrytitle><manvolnum>7</manvolnum></citerefentry> for
details on the permitted syntax.</para>
<xi:include href="version-info.xml" xpointer="v261"/></listitem>
</varlistentry>
<varlistentry>
<term><varname>IOPressureWatch=</varname></term>
<listitem><para>Controls IO pressure monitoring for invoked processes. Takes a boolean or one of
<literal>auto</literal> and <literal>skip</literal>. If <literal>no</literal>, tells the service not
to watch for IO pressure events, by setting the <varname>$IO_PRESSURE_WATCH</varname>
environment variable to the literal string <filename>/dev/null</filename>. If <literal>yes</literal>,
tells the service to watch for IO pressure events. This enables IO accounting for the
service, and ensures the <filename>io.pressure</filename> cgroup attribute file is accessible for
reading and writing by the service's user. It then sets the <varname>$IO_PRESSURE_WATCH</varname>
environment variable for processes invoked by the unit to the file system path to this file. The
threshold information configured with <varname>IOPressureThresholdSec=</varname> is encoded in
the <varname>$IO_PRESSURE_WRITE</varname> environment variable. If the <literal>auto</literal>
value is set the protocol is enabled if IO accounting is already enabled for the unit (e.g. because
<varname>IOWeight=</varname> or <varname>IODeviceWeight=</varname> is set), and
disabled otherwise. If set to <literal>skip</literal> the logic is neither enabled, nor disabled and
the two environment variables are not set.</para>
<para>Note that services are free to use the two environment variables, but it is unproblematic if
they ignore them. IO pressure handling must be implemented individually in each service, and
usually means different things for different software.</para>
<para>Services implemented using
<citerefentry><refentrytitle>sd-event</refentrytitle><manvolnum>3</manvolnum></citerefentry> may use
<citerefentry><refentrytitle>sd_event_add_io_pressure</refentrytitle><manvolnum>3</manvolnum></citerefentry>
to watch for and handle IO pressure events.</para>
<para>If not explicitly set, defaults to the <varname>DefaultIOPressureWatch=</varname> setting in
<citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>.</para>
<xi:include href="version-info.xml" xpointer="v261"/></listitem>
</varlistentry>
<varlistentry>
<term><varname>IOPressureThresholdSec=</varname></term>
<listitem><para>Sets the IO pressure threshold time for the IO pressure monitor configured via
<varname>IOPressureWatch=</varname>. Specifies the maximum IO stall time before an IO
pressure event is signalled to the service, per 2s window. If not specified, defaults to the
<varname>DefaultIOPressureThresholdSec=</varname> setting in
<citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>
(which in turn defaults to 200ms). The specified value expects a time unit such as
<literal>ms</literal> or <literal>μs</literal>, see
<citerefentry><refentrytitle>systemd.time</refentrytitle><manvolnum>7</manvolnum></citerefentry> for
details on the permitted syntax.</para>
<xi:include href="version-info.xml" xpointer="v261"/></listitem>
</varlistentry>
</variablelist> </variablelist>
</refsect2><refsect2><title>Coredump Control</title> </refsect2><refsect2><title>Coredump Control</title>

@ -1373,6 +1373,7 @@ conf.set10('HAVE_DWFL_SET_SYSROOT',
libz = dependency('zlib', libz = dependency('zlib',
required : get_option('zlib')) required : get_option('zlib'))
conf.set10('HAVE_ZLIB', libz.found()) conf.set10('HAVE_ZLIB', libz.found())
libz_cflags = libz.partial_dependency(includes: true, compile_args: true)
feature = get_option('bzip2') feature = get_option('bzip2')
libbzip2 = dependency('bzip2', libbzip2 = dependency('bzip2',
@ -1382,6 +1383,7 @@ if not libbzip2.found()
libbzip2 = cc.find_library('bz2', required : feature) libbzip2 = cc.find_library('bz2', required : feature)
endif endif
conf.set10('HAVE_BZIP2', libbzip2.found()) conf.set10('HAVE_BZIP2', libbzip2.found())
libbzip2_cflags = libbzip2.partial_dependency(includes: true, compile_args: true)
libxz = dependency('liblzma', libxz = dependency('liblzma',
required : get_option('xz')) required : get_option('xz'))

File diff suppressed because it is too large

@ -8,103 +8,81 @@ typedef enum Compression {
COMPRESSION_XZ, COMPRESSION_XZ,
COMPRESSION_LZ4, COMPRESSION_LZ4,
COMPRESSION_ZSTD, COMPRESSION_ZSTD,
COMPRESSION_GZIP,
COMPRESSION_BZIP2,
_COMPRESSION_MAX, _COMPRESSION_MAX,
_COMPRESSION_INVALID = -EINVAL, _COMPRESSION_INVALID = -EINVAL,
} Compression; } Compression;
DECLARE_STRING_TABLE_LOOKUP(compression, Compression); DECLARE_STRING_TABLE_LOOKUP(compression, Compression);
DECLARE_STRING_TABLE_LOOKUP(compression_lowercase, Compression); DECLARE_STRING_TABLE_LOOKUP(compression_uppercase, Compression);
DECLARE_STRING_TABLE_LOOKUP(compression_extension, Compression);
/* Try the lowercase string table first, fall back to the uppercase one. Useful for parsing user input
* where both forms (e.g. "xz" and "XZ") have historically been accepted. */
Compression compression_from_string_harder(const char *s);
/* Derives the compression type from a filename's extension, defaulting to COMPRESSION_NONE if the
* filename does not carry a recognized compression suffix. */
Compression compression_from_filename(const char *filename);
bool compression_supported(Compression c); bool compression_supported(Compression c);
int compress_blob_xz(const void *src, uint64_t src_size, /* Buffer size used by streaming compression APIs and pipeline stages that feed into them. Sized to
void *dst, size_t dst_alloc_size, size_t *dst_size, int level); * match the typical Linux pipe buffer so that pipeline stages don't lose throughput due to small
int compress_blob_lz4(const void *src, uint64_t src_size, * intermediate buffers. */
void *dst, size_t dst_alloc_size, size_t *dst_size, int level); #define COMPRESS_PIPE_BUFFER_SIZE (128U*1024U)
int compress_blob_zstd(const void *src, uint64_t src_size,
void *dst, size_t dst_alloc_size, size_t *dst_size, int level);
int decompress_blob_xz(const void *src, uint64_t src_size, /* Compressor / Decompressor — opaque push-based streaming compression context */
void **dst, size_t* dst_size, size_t dst_max);
int decompress_blob_lz4(const void *src, uint64_t src_size, typedef struct Compressor Compressor;
void **dst, size_t* dst_size, size_t dst_max); typedef Compressor Decompressor;
int decompress_blob_zstd(const void *src, uint64_t src_size,
void **dst, size_t* dst_size, size_t dst_max); typedef int (*DecompressorCallback)(const void *data, size_t size, void *userdata);
Compressor* compressor_free(Compressor *c);
DEFINE_TRIVIAL_CLEANUP_FUNC(Compressor*, compressor_free);
int compressor_new(Compressor **ret, Compression type);
int compressor_start(Compressor *c, const void *data, size_t size, void **buffer, size_t *buffer_size, size_t *buffer_allocated);
int compressor_finish(Compressor *c, void **buffer, size_t *buffer_size, size_t *buffer_allocated);
int decompressor_detect(Decompressor **ret, const void *data, size_t size);
int decompressor_force_off(Decompressor **ret);
int decompressor_push(Decompressor *c, const void *data, size_t size, DecompressorCallback callback, void *userdata);
Compression compressor_type(const Compressor *c);
/* Blob compression/decompression */
int compress_blob(Compression compression,
const void *src, uint64_t src_size,
void *dst, size_t dst_alloc_size, size_t *dst_size, int level);
int decompress_blob(Compression compression, int decompress_blob(Compression compression,
const void *src, uint64_t src_size, const void *src, uint64_t src_size,
void **dst, size_t* dst_size, size_t dst_max); void **dst, size_t *dst_size, size_t dst_max);
int decompress_zlib_raw(const void *src, uint64_t src_size,
void *dst, size_t dst_size, int wbits);
int decompress_startswith_xz(const void *src, uint64_t src_size,
void **buffer,
const void *prefix, size_t prefix_len,
uint8_t extra);
int decompress_startswith_lz4(const void *src, uint64_t src_size,
void **buffer,
const void *prefix, size_t prefix_len,
uint8_t extra);
int decompress_startswith_zstd(const void *src, uint64_t src_size,
void **buffer,
const void *prefix, size_t prefix_len,
uint8_t extra);
int decompress_startswith(Compression compression, int decompress_startswith(Compression compression,
const void *src, uint64_t src_size, const void *src, uint64_t src_size,
void **buffer, void **buffer,
const void *prefix, size_t prefix_len, const void *prefix, size_t prefix_len,
uint8_t extra); uint8_t extra);
int compress_stream_xz(int fdf, int fdt, uint64_t max_bytes, uint64_t *ret_uncompressed_size); /* Stream compression/decompression (fd-to-fd) */
int compress_stream_lz4(int fdf, int fdt, uint64_t max_bytes, uint64_t *ret_uncompressed_size);
int compress_stream_zstd(int fdf, int fdt, uint64_t max_bytes, uint64_t *ret_uncompressed_size);
int decompress_stream_xz(int fdf, int fdt, uint64_t max_bytes); int compress_stream(Compression type, int fdf, int fdt, uint64_t max_bytes, uint64_t *ret_uncompressed_size);
int decompress_stream_lz4(int fdf, int fdt, uint64_t max_bytes); int decompress_stream(Compression type, int fdf, int fdt, uint64_t max_bytes);
int decompress_stream_zstd(int fdf, int fdt, uint64_t max_bytes); int decompress_stream_by_filename(const char *filename, int fdf, int fdt, uint64_t max_bytes);
int dlopen_xz(void);
int dlopen_lz4(void); int dlopen_lz4(void);
int dlopen_zstd(void); int dlopen_zstd(void);
int dlopen_lzma(void); int dlopen_zlib(void);
int dlopen_bzip2(void);
static inline int compress_blob(
Compression compression,
const void *src, uint64_t src_size,
void *dst, size_t dst_alloc_size, size_t *dst_size, int level) {
switch (compression) {
case COMPRESSION_ZSTD:
return compress_blob_zstd(src, src_size, dst, dst_alloc_size, dst_size, level);
case COMPRESSION_LZ4:
return compress_blob_lz4(src, src_size, dst, dst_alloc_size, dst_size, level);
case COMPRESSION_XZ:
return compress_blob_xz(src, src_size, dst, dst_alloc_size, dst_size, level);
default:
return -EOPNOTSUPP;
}
}
static inline int compress_stream(int fdf, int fdt, uint64_t max_bytes, uint64_t *ret_uncompressed_size) {
switch (DEFAULT_COMPRESSION) {
case COMPRESSION_ZSTD:
return compress_stream_zstd(fdf, fdt, max_bytes, ret_uncompressed_size);
case COMPRESSION_LZ4:
return compress_stream_lz4(fdf, fdt, max_bytes, ret_uncompressed_size);
case COMPRESSION_XZ:
return compress_stream_xz(fdf, fdt, max_bytes, ret_uncompressed_size);
default:
return -EOPNOTSUPP;
}
}
static inline const char* default_compression_extension(void) { static inline const char* default_compression_extension(void) {
switch (DEFAULT_COMPRESSION) { return compression_extension_to_string(DEFAULT_COMPRESSION) ?: "";
case COMPRESSION_ZSTD:
return ".zst";
case COMPRESSION_LZ4:
return ".lz4";
case COMPRESSION_XZ:
return ".xz";
default:
return "";
}
} }
int decompress_stream(const char *filename, int fdf, int fdt, uint64_t max_bytes);
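The header above declares `compression_from_filename()` without showing its body. A minimal sketch of suffix-based detection might look as follows. It is not the systemd implementation: the `Sketch*` names are hypothetical, and only the three suffixes visible in the removed `default_compression_extension()` switch are covered, since the extensions used for the new GZIP and BZIP2 types do not appear in this diff.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the Compression enum; only types whose suffix is visible
 * in the diff are listed. */
typedef enum SketchCompression {
        SKETCH_COMPRESSION_NONE,
        SKETCH_COMPRESSION_XZ,
        SKETCH_COMPRESSION_LZ4,
        SKETCH_COMPRESSION_ZSTD,
} SketchCompression;

static const struct {
        const char *suffix;
        SketchCompression type;
} sketch_suffixes[] = {
        { ".xz",  SKETCH_COMPRESSION_XZ   },
        { ".lz4", SKETCH_COMPRESSION_LZ4  },
        { ".zst", SKETCH_COMPRESSION_ZSTD },
};

/* Returns the type matching the filename's suffix, or SKETCH_COMPRESSION_NONE if no
 * known suffix is present. Requiring a non-empty stem is a choice of this sketch. */
static SketchCompression sketch_compression_from_filename(const char *filename) {
        size_t n = strlen(filename);

        for (size_t i = 0; i < sizeof(sketch_suffixes) / sizeof(sketch_suffixes[0]); i++) {
                size_t m = strlen(sketch_suffixes[i].suffix);

                if (n > m && strcmp(filename + n - m, sketch_suffixes[i].suffix) == 0)
                        return sketch_suffixes[i].type;
        }

        return SKETCH_COMPRESSION_NONE;
}
```

Driving the lookup from a table keeps the mapping in one place, which is the same idea as replacing the old switch with `compression_extension_to_string()`.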

@ -211,12 +211,14 @@ libbasic_static = static_library(
fundamental_sources, fundamental_sources,
include_directories : basic_includes, include_directories : basic_includes,
implicit_include_directories : false, implicit_include_directories : false,
dependencies : [libdl, dependencies : [libbzip2_cflags,
libdl,
libgcrypt_cflags, libgcrypt_cflags,
liblz4_cflags, liblz4_cflags,
libm, libm,
librt, librt,
libxz_cflags, libxz_cflags,
libz_cflags,
libzstd_cflags, libzstd_cflags,
threads, threads,
userspace], userspace],

@ -10,6 +10,7 @@
#include "fileio.h" #include "fileio.h"
#include "parse-util.h" #include "parse-util.h"
#include "psi-util.h" #include "psi-util.h"
#include "string-table.h"
#include "string-util.h" #include "string-util.h"
#include "strv.h" #include "strv.h"
@ -104,6 +105,32 @@ int read_resource_pressure(const char *path, PressureType type, ResourcePressure
return 0; return 0;
} }
const PressureResourceInfo pressure_resource_info[_PRESSURE_RESOURCE_MAX] = {
[PRESSURE_MEMORY] = {
.name = "memory",
.env_watch = "MEMORY_PRESSURE_WATCH",
.env_write = "MEMORY_PRESSURE_WRITE",
},
[PRESSURE_CPU] = {
.name = "cpu",
.env_watch = "CPU_PRESSURE_WATCH",
.env_write = "CPU_PRESSURE_WRITE",
},
[PRESSURE_IO] = {
.name = "io",
.env_watch = "IO_PRESSURE_WATCH",
.env_write = "IO_PRESSURE_WRITE",
},
};
static const char* const pressure_resource_table[_PRESSURE_RESOURCE_MAX] = {
[PRESSURE_MEMORY] = "memory",
[PRESSURE_CPU] = "cpu",
[PRESSURE_IO] = "io",
};
DEFINE_STRING_TABLE_LOOKUP(pressure_resource, PressureResource);
int is_pressure_supported(void) { int is_pressure_supported(void) {
static thread_local int cached = -1; static thread_local int cached = -1;
int r; int r;
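For readers unfamiliar with the string-table pattern used in this hunk, the following sketch shows roughly the pair of lookup functions that a `DEFINE_STRING_TABLE_LOOKUP()` invocation provides. It is a simplified, standalone approximation: the `sketch_*` names are hypothetical, and the real macro lives in systemd's `string-table.h` and may differ in detail.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the PressureResource enum from the diff. */
typedef enum SketchResource {
        SKETCH_PRESSURE_MEMORY,
        SKETCH_PRESSURE_CPU,
        SKETCH_PRESSURE_IO,
        _SKETCH_PRESSURE_MAX,
        _SKETCH_PRESSURE_INVALID = -EINVAL,
} SketchResource;

static const char *const sketch_resource_table[_SKETCH_PRESSURE_MAX] = {
        [SKETCH_PRESSURE_MEMORY] = "memory",
        [SKETCH_PRESSURE_CPU]    = "cpu",
        [SKETCH_PRESSURE_IO]     = "io",
};

/* Enum to name; NULL for out-of-range values. */
static const char *sketch_resource_to_string(SketchResource r) {
        if (r < 0 || r >= _SKETCH_PRESSURE_MAX)
                return NULL;
        return sketch_resource_table[r];
}

/* Name to enum; the negative "invalid" value for unknown or NULL input. */
static SketchResource sketch_resource_from_string(const char *s) {
        if (!s)
                return _SKETCH_PRESSURE_INVALID;

        for (int i = 0; i < _SKETCH_PRESSURE_MAX; i++)
                if (strcmp(sketch_resource_table[i], s) == 0)
                        return (SketchResource) i;

        return _SKETCH_PRESSURE_INVALID;
}
```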

@ -9,6 +9,14 @@ typedef enum PressureType {
PRESSURE_TYPE_FULL, PRESSURE_TYPE_FULL,
} PressureType; } PressureType;
typedef enum PressureResource {
PRESSURE_MEMORY,
PRESSURE_CPU,
PRESSURE_IO,
_PRESSURE_RESOURCE_MAX,
_PRESSURE_RESOURCE_INVALID = -EINVAL,
} PressureResource;
/* Averages are stored in fixed-point with 11 bit fractions */ /* Averages are stored in fixed-point with 11 bit fractions */
typedef struct ResourcePressure { typedef struct ResourcePressure {
loadavg_t avg10; loadavg_t avg10;
@ -27,7 +35,23 @@ int read_resource_pressure(const char *path, PressureType type, ResourcePressure
/* Was the kernel compiled with CONFIG_PSI=y? 1 if yes, 0 if not, negative on error. */ /* Was the kernel compiled with CONFIG_PSI=y? 1 if yes, 0 if not, negative on error. */
int is_pressure_supported(void); int is_pressure_supported(void);
/* Default parameters for memory pressure watch logic in sd-event and PID 1 */ /* Metadata for each pressure resource type, for use in sd-event and PID 1 */
#define MEMORY_PRESSURE_DEFAULT_TYPE "some" typedef struct PressureResourceInfo {
#define MEMORY_PRESSURE_DEFAULT_THRESHOLD_USEC (200 * USEC_PER_MSEC) const char *name; /* "memory", "cpu", "io" */
#define MEMORY_PRESSURE_DEFAULT_WINDOW_USEC (2 * USEC_PER_SEC) const char *env_watch; /* "MEMORY_PRESSURE_WATCH", etc. */
const char *env_write; /* "MEMORY_PRESSURE_WRITE", etc. */
} PressureResourceInfo;
extern const PressureResourceInfo pressure_resource_info[_PRESSURE_RESOURCE_MAX];
static inline const PressureResourceInfo* pressure_resource_get_info(PressureResource resource) {
assert(resource >= 0 && resource < _PRESSURE_RESOURCE_MAX);
return &pressure_resource_info[resource];
}
DECLARE_STRING_TABLE_LOOKUP(pressure_resource, PressureResource);
/* Default parameters for pressure watch logic in sd-event and PID 1 */
#define PRESSURE_DEFAULT_TYPE "some"
#define PRESSURE_DEFAULT_THRESHOLD_USEC (200 * USEC_PER_MSEC)
#define PRESSURE_DEFAULT_WINDOW_USEC (2 * USEC_PER_SEC)
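To make the three defaults above concrete, here is a sketch that formats them as a kernel PSI trigger line of the form `<type> <stall threshold in µs> <window in µs>`. The helper and the `SKETCH_*` constants are hypothetical; how sd-event and PID 1 actually assemble and write this string is not shown in this diff.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Local stand-ins for systemd's time constants (assumed values: microseconds). */
#define SKETCH_USEC_PER_MSEC UINT64_C(1000)
#define SKETCH_USEC_PER_SEC  UINT64_C(1000000)

/* Formats a PSI trigger line such as "some 200000 2000000" into buf.
 * Returns 0 on success, -1 if the buffer is too small or formatting fails. */
static int sketch_format_psi_trigger(
                char *buf, size_t size,
                const char *type,           /* "some" or "full" */
                uint64_t threshold_usec,
                uint64_t window_usec) {

        int n = snprintf(buf, size, "%s %" PRIu64 " %" PRIu64, type, threshold_usec, window_usec);
        if (n < 0 || (size_t) n >= size)
                return -1;

        return 0;
}
```

With the defaults from the header (type `some`, 200 ms threshold, 2 s window) the call `sketch_format_psi_trigger(buf, sizeof(buf), "some", 200 * SKETCH_USEC_PER_MSEC, 2 * SKETCH_USEC_PER_SEC)` fills `buf` with `some 200000 2000000`.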

@ -17,7 +17,7 @@ static void load_bcd(const char *path, void **ret_bcd, size_t *ret_bcd_len) {
assert_se(get_testdata_dir(path, &fn) >= 0); assert_se(get_testdata_dir(path, &fn) >= 0);
assert_se(read_full_file_full(AT_FDCWD, fn, UINT64_MAX, SIZE_MAX, 0, NULL, &compressed, &len) >= 0); assert_se(read_full_file_full(AT_FDCWD, fn, UINT64_MAX, SIZE_MAX, 0, NULL, &compressed, &len) >= 0);
assert_se(decompress_blob_zstd(compressed, len, ret_bcd, ret_bcd_len, SIZE_MAX) >= 0); assert_se(decompress_blob(COMPRESSION_ZSTD, compressed, len, ret_bcd, ret_bcd_len, SIZE_MAX) >= 0);
} }
static void test_get_bcd_title_one( static void test_get_bcd_title_one(

View File

@ -185,8 +185,11 @@ void cgroup_context_init(CGroupContext *c) {
* moom_mem_pressure_duration_usec is set to infinity. */ * moom_mem_pressure_duration_usec is set to infinity. */
.moom_mem_pressure_duration_usec = USEC_INFINITY, .moom_mem_pressure_duration_usec = USEC_INFINITY,
.memory_pressure_watch = _CGROUP_PRESSURE_WATCH_INVALID, .pressure = {
.memory_pressure_threshold_usec = USEC_INFINITY, [PRESSURE_MEMORY] = { .watch = _CGROUP_PRESSURE_WATCH_INVALID, .threshold_usec = USEC_INFINITY },
[PRESSURE_CPU] = { .watch = _CGROUP_PRESSURE_WATCH_INVALID, .threshold_usec = USEC_INFINITY },
[PRESSURE_IO] = { .watch = _CGROUP_PRESSURE_WATCH_INVALID, .threshold_usec = USEC_INFINITY },
},
}; };
} }
@ -528,6 +531,8 @@ void cgroup_context_dump(Unit *u, FILE* f, const char *prefix) {
"%sManagedOOMMemoryPressureLimit: " PERMYRIAD_AS_PERCENT_FORMAT_STR "\n" "%sManagedOOMMemoryPressureLimit: " PERMYRIAD_AS_PERCENT_FORMAT_STR "\n"
"%sManagedOOMPreference: %s\n" "%sManagedOOMPreference: %s\n"
"%sMemoryPressureWatch: %s\n" "%sMemoryPressureWatch: %s\n"
"%sCPUPressureWatch: %s\n"
"%sIOPressureWatch: %s\n"
"%sCoredumpReceive: %s\n", "%sCoredumpReceive: %s\n",
prefix, yes_no(c->io_accounting), prefix, yes_no(c->io_accounting),
prefix, yes_no(c->memory_accounting), prefix, yes_no(c->memory_accounting),
@ -563,7 +568,9 @@ void cgroup_context_dump(Unit *u, FILE* f, const char *prefix) {
prefix, managed_oom_mode_to_string(c->moom_mem_pressure), prefix, managed_oom_mode_to_string(c->moom_mem_pressure),
prefix, PERMYRIAD_AS_PERCENT_FORMAT_VAL(UINT32_SCALE_TO_PERMYRIAD(c->moom_mem_pressure_limit)), prefix, PERMYRIAD_AS_PERCENT_FORMAT_VAL(UINT32_SCALE_TO_PERMYRIAD(c->moom_mem_pressure_limit)),
prefix, managed_oom_preference_to_string(c->moom_preference), prefix, managed_oom_preference_to_string(c->moom_preference),
prefix, cgroup_pressure_watch_to_string(c->memory_pressure_watch), prefix, cgroup_pressure_watch_to_string(c->pressure[PRESSURE_MEMORY].watch),
prefix, cgroup_pressure_watch_to_string(c->pressure[PRESSURE_CPU].watch),
prefix, cgroup_pressure_watch_to_string(c->pressure[PRESSURE_IO].watch),
prefix, yes_no(c->coredump_receive)); prefix, yes_no(c->coredump_receive));
if (c->delegate_subgroup) if (c->delegate_subgroup)
@ -574,9 +581,17 @@ void cgroup_context_dump(Unit *u, FILE* f, const char *prefix) {
fprintf(f, "%sBindNetworkInterface: %s\n", fprintf(f, "%sBindNetworkInterface: %s\n",
prefix, c->bind_network_interface); prefix, c->bind_network_interface);
if (c->memory_pressure_threshold_usec != USEC_INFINITY) if (c->pressure[PRESSURE_MEMORY].threshold_usec != USEC_INFINITY)
fprintf(f, "%sMemoryPressureThresholdSec: %s\n", fprintf(f, "%sMemoryPressureThresholdSec: %s\n",
prefix, FORMAT_TIMESPAN(c->memory_pressure_threshold_usec, 1)); prefix, FORMAT_TIMESPAN(c->pressure[PRESSURE_MEMORY].threshold_usec, 1));
if (c->pressure[PRESSURE_CPU].threshold_usec != USEC_INFINITY)
fprintf(f, "%sCPUPressureThresholdSec: %s\n",
prefix, FORMAT_TIMESPAN(c->pressure[PRESSURE_CPU].threshold_usec, 1));
if (c->pressure[PRESSURE_IO].threshold_usec != USEC_INFINITY)
fprintf(f, "%sIOPressureThresholdSec: %s\n",
prefix, FORMAT_TIMESPAN(c->pressure[PRESSURE_IO].threshold_usec, 1));
if (c->moom_mem_pressure_duration_usec != USEC_INFINITY) if (c->moom_mem_pressure_duration_usec != USEC_INFINITY)
fprintf(f, "%sManagedOOMMemoryPressureDurationSec: %s\n", fprintf(f, "%sManagedOOMMemoryPressureDurationSec: %s\n",
@ -2107,12 +2122,13 @@ static int unit_update_cgroup(
cgroup_context_apply(u, target_mask, state); cgroup_context_apply(u, target_mask, state);
cgroup_xattr_apply(u); cgroup_xattr_apply(u);
/* For most units we expect that memory monitoring is set up before the unit is started and we won't /* For most units we expect that pressure monitoring is set up before the unit is started and we
* touch it after. For PID 1 this is different though, because we couldn't possibly do that given * won't touch it after. For PID 1 this is different though, because we couldn't possibly do that
* that PID 1 runs before init.scope is even set up. Hence, whenever init.scope is realized, let's * given that PID 1 runs before init.scope is even set up. Hence, whenever init.scope is realized,
* try to open the memory pressure interface anew. */ * let's try to open the pressure interfaces anew. */
if (unit_has_name(u, SPECIAL_INIT_SCOPE)) if (unit_has_name(u, SPECIAL_INIT_SCOPE))
(void) manager_setup_memory_pressure_event_source(u->manager); for (PressureResource t = 0; t < _PRESSURE_RESOURCE_MAX; t++)
(void) manager_setup_pressure_event_source(u->manager, t);
return 0; return 0;
} }

@ -6,6 +6,7 @@
#include "cpu-set-util.h" #include "cpu-set-util.h"
#include "firewall-util.h" #include "firewall-util.h"
#include "list.h" #include "list.h"
#include "psi-util.h"
typedef struct CGroupTasksMax { typedef struct CGroupTasksMax {
/* If scale == 0, just use value; otherwise, value / scale. /* If scale == 0, just use value; otherwise, value / scale.
@ -95,14 +96,19 @@ typedef struct CGroupSocketBindItem {
} CGroupSocketBindItem; } CGroupSocketBindItem;
typedef enum CGroupPressureWatch { typedef enum CGroupPressureWatch {
CGROUP_PRESSURE_WATCH_NO, /* → tells the service payload explicitly not to watch for memory pressure */ CGROUP_PRESSURE_WATCH_NO, /* → tells the service payload explicitly not to watch for pressure */
CGROUP_PRESSURE_WATCH_YES, CGROUP_PRESSURE_WATCH_YES,
CGROUP_PRESSURE_WATCH_AUTO, /* → on if memory account is on anyway for the unit, otherwise off */ CGROUP_PRESSURE_WATCH_AUTO, /* → on if relevant accounting is on anyway for the unit, otherwise off */
CGROUP_PRESSURE_WATCH_SKIP, /* → doesn't set up memory pressure watch, but also doesn't explicitly tell payload to avoid it */ CGROUP_PRESSURE_WATCH_SKIP, /* → doesn't set up pressure watch, but also doesn't explicitly tell payload to avoid it */
_CGROUP_PRESSURE_WATCH_MAX, _CGROUP_PRESSURE_WATCH_MAX,
_CGROUP_PRESSURE_WATCH_INVALID = -EINVAL, _CGROUP_PRESSURE_WATCH_INVALID = -EINVAL,
} CGroupPressureWatch; } CGroupPressureWatch;
typedef struct CGroupPressure {
CGroupPressureWatch watch;
usec_t threshold_usec;
} CGroupPressure;
/* The user-supplied cgroup-related configuration options. This remains mostly immutable while the service /* The user-supplied cgroup-related configuration options. This remains mostly immutable while the service
* manager is running (except for an occasional SetProperties() configuration change), outside of reload * manager is running (except for an occasional SetProperties() configuration change), outside of reload
* cycles. */ * cycles. */
@ -189,11 +195,8 @@ typedef struct CGroupContext {
usec_t moom_mem_pressure_duration_usec; usec_t moom_mem_pressure_duration_usec;
ManagedOOMPreference moom_preference; ManagedOOMPreference moom_preference;
/* Memory pressure logic */ /* Pressure logic */
CGroupPressureWatch memory_pressure_watch; CGroupPressure pressure[_PRESSURE_RESOURCE_MAX];
usec_t memory_pressure_threshold_usec;
/* NB: For now we don't make the period configurable, not the type, nor do we allow multiple
* triggers, nor triggers for non-memory pressure. We might add that later. */
NFTSetContext nft_set_context; NFTSetContext nft_set_context;
@ -353,11 +356,37 @@ void cgroup_context_free_io_device_latency(CGroupContext *c, CGroupIODeviceLaten
void cgroup_context_remove_bpf_foreign_program(CGroupContext *c, CGroupBPFForeignProgram *p); void cgroup_context_remove_bpf_foreign_program(CGroupContext *c, CGroupBPFForeignProgram *p);
void cgroup_context_remove_socket_bind(CGroupSocketBindItem **head); void cgroup_context_remove_socket_bind(CGroupSocketBindItem **head);
static inline bool cgroup_context_want_memory_pressure(const CGroupContext *c) { static inline bool cgroup_context_want_pressure(const CGroupContext *c, PressureResource t) {
assert(c); assert(c);
assert(t >= 0 && t < _PRESSURE_RESOURCE_MAX);
return c->memory_pressure_watch == CGROUP_PRESSURE_WATCH_YES || if (c->pressure[t].watch == CGROUP_PRESSURE_WATCH_YES)
(c->memory_pressure_watch == CGROUP_PRESSURE_WATCH_AUTO && c->memory_accounting); return true;
if (c->pressure[t].watch != CGROUP_PRESSURE_WATCH_AUTO)
return false;
switch (t) {
case PRESSURE_MEMORY:
return c->memory_accounting;
case PRESSURE_CPU:
return c->cpu_weight != CGROUP_WEIGHT_INVALID ||
c->startup_cpu_weight != CGROUP_WEIGHT_INVALID ||
c->cpu_quota_per_sec_usec != USEC_INFINITY;
case PRESSURE_IO:
return c->io_accounting ||
c->io_weight != CGROUP_WEIGHT_INVALID ||
c->startup_io_weight != CGROUP_WEIGHT_INVALID ||
c->io_device_weights ||
c->io_device_latencies ||
c->io_device_limits;
default:
assert_not_reached();
}
} }
static inline bool cgroup_context_has_device_policy(const CGroupContext *c) { static inline bool cgroup_context_has_device_policy(const CGroupContext *c) {
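
The decision made by `cgroup_context_want_pressure()` in the hunk above can be read as a small truth table: "yes" always watches, "auto" watches only when the matching controller is configured for the unit anyway, and "no" or "skip" never set up a watch. The following is a minimal standalone sketch of that rule, not the systemd code itself. `Watch`, `Resource`, `Context`, and `controller_in_use` are hypothetical simplified stand-ins for the real `CGroupPressureWatch`, `PressureResource`, and `CGroupContext` fields.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for CGroupPressureWatch and PressureResource
 * (hypothetical names, chosen for this sketch only). */
typedef enum { WATCH_NO, WATCH_YES, WATCH_AUTO, WATCH_SKIP } Watch;
typedef enum { RES_MEMORY, RES_CPU, RES_IO, RES_MAX } Resource;

/* Hypothetical reduced context: one flag per resource that says whether
 * the matching controller is configured for the unit anyway. In the diff
 * this is derived from accounting, weight, quota and device settings. */
typedef struct {
    Watch watch[RES_MAX];
    bool controller_in_use[RES_MAX];
} Context;

static bool want_pressure(const Context *c, Resource r) {
    assert(c);
    assert(r < RES_MAX);

    if (c->watch[r] == WATCH_YES)
        return true;
    if (c->watch[r] != WATCH_AUTO)
        return false; /* "no" and "skip" never set up a watch */

    return c->controller_in_use[r];
}
```

Under these assumptions, a context with `watch[RES_CPU] = WATCH_AUTO` and `controller_in_use[RES_CPU] = true` yields `true`, while switching the watch mode to `WATCH_SKIP` yields `false` regardless of the controller flag.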

@ -427,8 +427,12 @@ const sd_bus_vtable bus_cgroup_vtable[] = {
SD_BUS_PROPERTY("SocketBindDeny", "a(iiqq)", property_get_socket_bind, offsetof(CGroupContext, socket_bind_deny), 0), SD_BUS_PROPERTY("SocketBindDeny", "a(iiqq)", property_get_socket_bind, offsetof(CGroupContext, socket_bind_deny), 0),
SD_BUS_PROPERTY("RestrictNetworkInterfaces", "(bas)", property_get_restrict_network_interfaces, 0, 0), SD_BUS_PROPERTY("RestrictNetworkInterfaces", "(bas)", property_get_restrict_network_interfaces, 0, 0),
SD_BUS_PROPERTY("BindNetworkInterface", "s", NULL, offsetof(CGroupContext, bind_network_interface), 0), SD_BUS_PROPERTY("BindNetworkInterface", "s", NULL, offsetof(CGroupContext, bind_network_interface), 0),
SD_BUS_PROPERTY("MemoryPressureWatch", "s", bus_property_get_cgroup_pressure_watch, offsetof(CGroupContext, memory_pressure_watch), 0), SD_BUS_PROPERTY("MemoryPressureWatch", "s", bus_property_get_cgroup_pressure_watch, offsetof(CGroupContext, pressure[PRESSURE_MEMORY].watch), 0),
SD_BUS_PROPERTY("MemoryPressureThresholdUSec", "t", bus_property_get_usec, offsetof(CGroupContext, memory_pressure_threshold_usec), 0), SD_BUS_PROPERTY("MemoryPressureThresholdUSec", "t", bus_property_get_usec, offsetof(CGroupContext, pressure[PRESSURE_MEMORY].threshold_usec), 0),
SD_BUS_PROPERTY("CPUPressureWatch", "s", bus_property_get_cgroup_pressure_watch, offsetof(CGroupContext, pressure[PRESSURE_CPU].watch), 0),
SD_BUS_PROPERTY("CPUPressureThresholdUSec", "t", bus_property_get_usec, offsetof(CGroupContext, pressure[PRESSURE_CPU].threshold_usec), 0),
SD_BUS_PROPERTY("IOPressureWatch", "s", bus_property_get_cgroup_pressure_watch, offsetof(CGroupContext, pressure[PRESSURE_IO].watch), 0),
SD_BUS_PROPERTY("IOPressureThresholdUSec", "t", bus_property_get_usec, offsetof(CGroupContext, pressure[PRESSURE_IO].threshold_usec), 0),
SD_BUS_PROPERTY("NFTSet", "a(iiss)", property_get_cgroup_nft_set, 0, 0), SD_BUS_PROPERTY("NFTSet", "a(iiss)", property_get_cgroup_nft_set, 0, 0),
SD_BUS_PROPERTY("CoredumpReceive", "b", bus_property_get_bool, offsetof(CGroupContext, coredump_receive), 0), SD_BUS_PROPERTY("CoredumpReceive", "b", bus_property_get_bool, offsetof(CGroupContext, coredump_receive), 0),
@ -712,10 +716,13 @@ static int bus_cgroup_set_transient_property(
return 1; return 1;
} else if (streq(name, "MemoryPressureWatch")) { } else if (STR_IN_SET(name, "MemoryPressureWatch", "CPUPressureWatch", "IOPressureWatch")) {
CGroupPressureWatch p; CGroupPressureWatch p;
const char *t; const char *t;
PressureResource pt = streq(name, "MemoryPressureWatch") ? PRESSURE_MEMORY :
streq(name, "CPUPressureWatch") ? PRESSURE_CPU : PRESSURE_IO;
r = sd_bus_message_read(message, "s", &t); r = sd_bus_message_read(message, "s", &t);
if (r < 0) if (r < 0)
return r; return r;
@ -729,26 +736,29 @@ static int bus_cgroup_set_transient_property(
} }
if (!UNIT_WRITE_FLAGS_NOOP(flags)) { if (!UNIT_WRITE_FLAGS_NOOP(flags)) {
c->memory_pressure_watch = p; c->pressure[pt].watch = p;
unit_write_settingf(u, flags, name, "MemoryPressureWatch=%s", strempty(cgroup_pressure_watch_to_string(p))); unit_write_settingf(u, flags, name, "%s=%s", name, strempty(cgroup_pressure_watch_to_string(p)));
} }
return 1; return 1;
} else if (streq(name, "MemoryPressureThresholdUSec")) { } else if (STR_IN_SET(name, "MemoryPressureThresholdUSec", "CPUPressureThresholdUSec", "IOPressureThresholdUSec")) {
uint64_t t; uint64_t t;
PressureResource pt = streq(name, "MemoryPressureThresholdUSec") ? PRESSURE_MEMORY :
streq(name, "CPUPressureThresholdUSec") ? PRESSURE_CPU : PRESSURE_IO;
r = sd_bus_message_read(message, "t", &t); r = sd_bus_message_read(message, "t", &t);
if (r < 0) if (r < 0)
return r; return r;
if (!UNIT_WRITE_FLAGS_NOOP(flags)) { if (!UNIT_WRITE_FLAGS_NOOP(flags)) {
c->memory_pressure_threshold_usec = t; c->pressure[pt].threshold_usec = t;
if (t == UINT64_MAX) if (t == UINT64_MAX)
unit_write_setting(u, flags, name, "MemoryPressureThresholdUSec="); unit_write_settingf(u, flags, name, "%s=", name);
else else
unit_write_settingf(u, flags, name, "MemoryPressureThresholdUSec=%" PRIu64, t); unit_write_settingf(u, flags, name, "%s=%" PRIu64, name, t);
} }
return 1; return 1;
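
The transient-property hunks above pick the resource with a chain of `streq()` comparisons on the property name. One alternative, shown here only as a hedged sketch and not as what the commit does, is a prefix table that maps the leading part of the name ("Memory", "CPU", "IO") to an index. `Resource`, `resource_prefix`, and `resource_from_property()` are hypothetical names invented for this example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for PressureResource. */
typedef enum { RES_MEMORY, RES_CPU, RES_IO, RES_MAX, RES_INVALID = -1 } Resource;

/* Property-name prefixes as they appear in the diff
 * ("MemoryPressureWatch", "CPUPressureWatch", "IOPressureWatch"). */
static const char *const resource_prefix[RES_MAX] = {
    [RES_MEMORY] = "Memory",
    [RES_CPU]    = "CPU",
    [RES_IO]     = "IO",
};

/* Returns the resource whose prefix, followed by exactly `suffix`,
 * equals `name`, or RES_INVALID if nothing matches. */
static Resource resource_from_property(const char *name, const char *suffix) {
    assert(name);
    assert(suffix);

    for (int r = 0; r < RES_MAX; r++) {
        size_t n = strlen(resource_prefix[r]);

        if (strncmp(name, resource_prefix[r], n) == 0 &&
            strcmp(name + n, suffix) == 0)
            return (Resource) r;
    }

    return RES_INVALID;
}
```

With this sketch, `resource_from_property("IOPressureThresholdUSec", "PressureThresholdUSec")` returns `RES_IO`, and an unrelated name such as `"CoredumpReceive"` returns `RES_INVALID`, so an unknown name cannot silently fall through to the last branch.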

@ -2980,8 +2980,12 @@ const sd_bus_vtable bus_manager_vtable[] = {
SD_BUS_PROPERTY("DefaultLimitRTTIME", "t", bus_property_get_rlimit, offsetof(Manager, defaults.rlimit[RLIMIT_RTTIME]), SD_BUS_VTABLE_PROPERTY_CONST), SD_BUS_PROPERTY("DefaultLimitRTTIME", "t", bus_property_get_rlimit, offsetof(Manager, defaults.rlimit[RLIMIT_RTTIME]), SD_BUS_VTABLE_PROPERTY_CONST),
SD_BUS_PROPERTY("DefaultLimitRTTIMESoft", "t", bus_property_get_rlimit, offsetof(Manager, defaults.rlimit[RLIMIT_RTTIME]), SD_BUS_VTABLE_PROPERTY_CONST), SD_BUS_PROPERTY("DefaultLimitRTTIMESoft", "t", bus_property_get_rlimit, offsetof(Manager, defaults.rlimit[RLIMIT_RTTIME]), SD_BUS_VTABLE_PROPERTY_CONST),
SD_BUS_PROPERTY("DefaultTasksMax", "t", bus_property_get_tasks_max, offsetof(Manager, defaults.tasks_max), 0), SD_BUS_PROPERTY("DefaultTasksMax", "t", bus_property_get_tasks_max, offsetof(Manager, defaults.tasks_max), 0),
SD_BUS_PROPERTY("DefaultMemoryPressureThresholdUSec", "t", bus_property_get_usec, offsetof(Manager, defaults.memory_pressure_threshold_usec), 0), SD_BUS_PROPERTY("DefaultMemoryPressureThresholdUSec", "t", bus_property_get_usec, offsetof(Manager, defaults.pressure[PRESSURE_MEMORY].threshold_usec), 0),
SD_BUS_PROPERTY("DefaultMemoryPressureWatch", "s", bus_property_get_cgroup_pressure_watch, offsetof(Manager, defaults.memory_pressure_watch), 0), SD_BUS_PROPERTY("DefaultMemoryPressureWatch", "s", bus_property_get_cgroup_pressure_watch, offsetof(Manager, defaults.pressure[PRESSURE_MEMORY].watch), 0),
SD_BUS_PROPERTY("DefaultCPUPressureThresholdUSec", "t", bus_property_get_usec, offsetof(Manager, defaults.pressure[PRESSURE_CPU].threshold_usec), 0),
SD_BUS_PROPERTY("DefaultCPUPressureWatch", "s", bus_property_get_cgroup_pressure_watch, offsetof(Manager, defaults.pressure[PRESSURE_CPU].watch), 0),
SD_BUS_PROPERTY("DefaultIOPressureThresholdUSec", "t", bus_property_get_usec, offsetof(Manager, defaults.pressure[PRESSURE_IO].threshold_usec), 0),
SD_BUS_PROPERTY("DefaultIOPressureWatch", "s", bus_property_get_cgroup_pressure_watch, offsetof(Manager, defaults.pressure[PRESSURE_IO].watch), 0),
SD_BUS_PROPERTY("TimerSlackNSec", "t", property_get_timer_slack_nsec, 0, SD_BUS_VTABLE_PROPERTY_CONST), SD_BUS_PROPERTY("TimerSlackNSec", "t", property_get_timer_slack_nsec, 0, SD_BUS_VTABLE_PROPERTY_CONST),
SD_BUS_PROPERTY("DefaultOOMPolicy", "s", bus_property_get_oom_policy, offsetof(Manager, defaults.oom_policy), SD_BUS_VTABLE_PROPERTY_CONST), SD_BUS_PROPERTY("DefaultOOMPolicy", "s", bus_property_get_oom_policy, offsetof(Manager, defaults.oom_policy), SD_BUS_VTABLE_PROPERTY_CONST),
SD_BUS_PROPERTY("DefaultOOMScoreAdjust", "i", property_get_oom_score_adjust, 0, SD_BUS_VTABLE_PROPERTY_CONST), SD_BUS_PROPERTY("DefaultOOMScoreAdjust", "i", property_get_oom_score_adjust, 0, SD_BUS_VTABLE_PROPERTY_CONST),

@ -2034,7 +2034,7 @@ static int build_environment(
const char *shell, const char *shell,
dev_t journal_stream_dev, dev_t journal_stream_dev,
ino_t journal_stream_ino, ino_t journal_stream_ino,
const char *memory_pressure_path, char *const *pressure_path,
bool needs_sandboxing, bool needs_sandboxing,
char ***ret) { char ***ret) {
@ -2215,25 +2215,38 @@ static int build_environment(
if (r < 0) if (r < 0)
return r; return r;
if (memory_pressure_path) { for (PressureResource t = 0; t < _PRESSURE_RESOURCE_MAX; t++) {
r = strv_extend_joined_with_size(&e, &n, "MEMORY_PRESSURE_WATCH=", memory_pressure_path); if (!pressure_path[t])
continue;
const PressureResourceInfo *info = pressure_resource_get_info(t);
_cleanup_free_ char *env_watch = strjoin(info->env_watch, "=");
if (!env_watch)
return -ENOMEM;
r = strv_extend_joined_with_size(&e, &n, env_watch, pressure_path[t]);
if (r < 0) if (r < 0)
return r; return r;
if (!path_equal(memory_pressure_path, "/dev/null")) { if (!path_equal(pressure_path[t], "/dev/null")) {
_cleanup_free_ char *b = NULL, *x = NULL; _cleanup_free_ char *b = NULL, *x = NULL;
if (asprintf(&b, "%s " USEC_FMT " " USEC_FMT, if (asprintf(&b, "%s " USEC_FMT " " USEC_FMT,
MEMORY_PRESSURE_DEFAULT_TYPE, PRESSURE_DEFAULT_TYPE,
cgroup_context->memory_pressure_threshold_usec == USEC_INFINITY ? MEMORY_PRESSURE_DEFAULT_THRESHOLD_USEC : cgroup_context->pressure[t].threshold_usec == USEC_INFINITY ? PRESSURE_DEFAULT_THRESHOLD_USEC :
CLAMP(cgroup_context->memory_pressure_threshold_usec, 1U, MEMORY_PRESSURE_DEFAULT_WINDOW_USEC), CLAMP(cgroup_context->pressure[t].threshold_usec, 1U, PRESSURE_DEFAULT_WINDOW_USEC),
MEMORY_PRESSURE_DEFAULT_WINDOW_USEC) < 0) PRESSURE_DEFAULT_WINDOW_USEC) < 0)
return -ENOMEM; return -ENOMEM;
if (base64mem(b, strlen(b) + 1, &x) < 0) if (base64mem(b, strlen(b) + 1, &x) < 0)
return -ENOMEM; return -ENOMEM;
r = strv_extend_joined_with_size(&e, &n, "MEMORY_PRESSURE_WRITE=", x); _cleanup_free_ char *env_write = strjoin(info->env_write, "=");
if (!env_write)
return -ENOMEM;
r = strv_extend_joined_with_size(&e, &n, env_write, x);
if (r < 0) if (r < 0)
return r; return r;
} }
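
The `asprintf()` call in the hunk above builds a "type threshold window" string, using a default threshold when none is configured and otherwise clamping the configured value to the range from 1 to the window length. The sketch below reproduces just that arithmetic in standalone C. The three `ASSUMED_*` constants are assumptions made for illustration (the diff only shows the macro names `PRESSURE_DEFAULT_TYPE`, `PRESSURE_DEFAULT_THRESHOLD_USEC`, and `PRESSURE_DEFAULT_WINDOW_USEC`, not their values), and the base64 encoding step that follows in the real code is left out.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed values, for illustration only; the diff does not show them. */
#define ASSUMED_DEFAULT_TYPE "some"
#define ASSUMED_DEFAULT_THRESHOLD_USEC UINT64_C(200000)
#define ASSUMED_DEFAULT_WINDOW_USEC UINT64_C(2000000)
#define UNSET_USEC UINT64_MAX /* stand-in for USEC_INFINITY */

/* Default when unset, otherwise clamp into [1, window]. */
static uint64_t effective_threshold(uint64_t configured) {
    if (configured == UNSET_USEC)
        return ASSUMED_DEFAULT_THRESHOLD_USEC;
    if (configured < 1)
        return 1;
    if (configured > ASSUMED_DEFAULT_WINDOW_USEC)
        return ASSUMED_DEFAULT_WINDOW_USEC;
    return configured;
}

/* Writes "<type> <threshold> <window>" into buf.
 * Returns 0 on success, -1 if the buffer is too small. */
static int format_pressure_write(char *buf, size_t size, uint64_t configured) {
    assert(buf);

    int n = snprintf(buf, size, "%s %" PRIu64 " %" PRIu64,
                     ASSUMED_DEFAULT_TYPE,
                     effective_threshold(configured),
                     ASSUMED_DEFAULT_WINDOW_USEC);

    return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

The clamp to the window length matters because a stall threshold longer than the observation window could never be reached; clamping keeps the trigger meaningful even for oversized configured values.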
@ -3855,7 +3868,7 @@ static int apply_mount_namespace(
const ExecParameters *params, const ExecParameters *params,
const ExecRuntime *runtime, const ExecRuntime *runtime,
const PinnedResource *rootfs, const PinnedResource *rootfs,
const char *memory_pressure_path, char *const *pressure_path,
bool needs_sandboxing, bool needs_sandboxing,
uid_t exec_directory_uid, uid_t exec_directory_uid,
gid_t exec_directory_gid, gid_t exec_directory_gid,
@ -3887,16 +3900,28 @@ static int apply_mount_namespace(
if (r < 0) if (r < 0)
return r; return r;
/* We need to make the pressure path writable even if /sys/fs/cgroups is made read-only, as the /* We need to make the pressure paths writable even if /sys/fs/cgroups is made read-only, as the
* service will need to write to it in order to start the notifications. */ * service will need to write to them in order to start the notifications. */
if (exec_is_cgroup_mount_read_only(context) && memory_pressure_path && !streq(memory_pressure_path, "/dev/null")) { bool need_pressure_rw = false;
for (PressureResource t = 0; t < _PRESSURE_RESOURCE_MAX; t++)
if (pressure_path[t] && !streq(pressure_path[t], "/dev/null")) {
need_pressure_rw = true;
break;
}
if (exec_is_cgroup_mount_read_only(context) && need_pressure_rw) {
read_write_paths_cleanup = strv_copy(context->read_write_paths); read_write_paths_cleanup = strv_copy(context->read_write_paths);
if (!read_write_paths_cleanup) if (!read_write_paths_cleanup)
return -ENOMEM; return -ENOMEM;
r = strv_extend(&read_write_paths_cleanup, memory_pressure_path); for (PressureResource t = 0; t < _PRESSURE_RESOURCE_MAX; t++) {
if (r < 0) if (!pressure_path[t] || streq(pressure_path[t], "/dev/null"))
return r; continue;
r = strv_extend(&read_write_paths_cleanup, pressure_path[t]);
if (r < 0)
return r;
}
read_write_paths = read_write_paths_cleanup; read_write_paths = read_write_paths_cleanup;
} else } else
@ -4689,7 +4714,7 @@ static int setup_delegated_namespaces(
const ExecRuntime *runtime, const ExecRuntime *runtime,
const PinnedResource *rootfs, const PinnedResource *rootfs,
bool delegate, bool delegate,
const char *memory_pressure_path, char *const *pressure_path,
uid_t uid, uid_t uid,
gid_t gid, gid_t gid,
const ExecCommand *command, const ExecCommand *command,
@ -4820,7 +4845,7 @@ static int setup_delegated_namespaces(
params, params,
runtime, runtime,
rootfs, rootfs,
memory_pressure_path, pressure_path,
needs_sandboxing, needs_sandboxing,
uid, uid,
gid, gid,
@ -5146,6 +5171,10 @@ static int setup_term_environment(const ExecContext *context, char ***env) {
return strv_env_replace_strdup(env, "TERM=" FALLBACK_TERM); return strv_env_replace_strdup(env, "TERM=" FALLBACK_TERM);
} }
static inline void free_pressure_paths(char *(*p)[_PRESSURE_RESOURCE_MAX]) {
free_many_charp(*p, _PRESSURE_RESOURCE_MAX);
}
int exec_invoke( int exec_invoke(
const ExecCommand *command, const ExecCommand *command,
const ExecContext *context, const ExecContext *context,
@ -5157,7 +5186,8 @@ int exec_invoke(
_cleanup_strv_free_ char **our_env = NULL, **pass_env = NULL, **joined_exec_search_path = NULL, **accum_env = NULL; _cleanup_strv_free_ char **our_env = NULL, **pass_env = NULL, **joined_exec_search_path = NULL, **accum_env = NULL;
int r; int r;
const char *username = NULL, *groupname = NULL; const char *username = NULL, *groupname = NULL;
_cleanup_free_ char *home_buffer = NULL, *memory_pressure_path = NULL, *own_user = NULL; _cleanup_free_ char *home_buffer = NULL, *own_user = NULL;
_cleanup_(free_pressure_paths) char *pressure_path[_PRESSURE_RESOURCE_MAX] = {};
const char *pwent_home = NULL, *shell = NULL; const char *pwent_home = NULL, *shell = NULL;
dev_t journal_stream_dev = 0; dev_t journal_stream_dev = 0;
ino_t journal_stream_ino = 0; ino_t journal_stream_ino = 0;
@ -5753,36 +5783,44 @@ int exec_invoke(
} }
if (is_pressure_supported() > 0) { if (is_pressure_supported() > 0) {
if (cgroup_context_want_memory_pressure(cgroup_context)) { for (PressureResource t = 0; t < _PRESSURE_RESOURCE_MAX; t++) {
r = cg_get_path(params->cgroup_path, "memory.pressure", &memory_pressure_path); if (cgroup_context_want_pressure(cgroup_context, t)) {
if (r < 0) { _cleanup_free_ char *pressure_file = strjoin(pressure_resource_to_string(t), ".pressure");
*exit_status = EXIT_MEMORY; if (!pressure_file) {
return log_oom(); *exit_status = EXIT_MEMORY;
} return log_oom();
}
r = chmod_and_chown(memory_pressure_path, 0644, uid, gid); r = cg_get_path(params->cgroup_path, pressure_file, &pressure_path[t]);
if (r < 0) {
log_full_errno(r == -ENOENT || ERRNO_IS_PRIVILEGE(r) ? LOG_DEBUG : LOG_WARNING, r,
"Failed to adjust ownership of '%s', ignoring: %m", memory_pressure_path);
memory_pressure_path = mfree(memory_pressure_path);
}
/* First we use the current cgroup path to chmod and chown the memory pressure path, then pass the path relative
* to the cgroup namespace to environment variables and mounts. If chown/chmod fails, we should not pass memory
* pressure path environment variable or read-write mount to the unit. This is why we check if
* memory_pressure_path != NULL in the conditional below. */
if (memory_pressure_path && needs_sandboxing && exec_needs_cgroup_namespace(context)) {
memory_pressure_path = mfree(memory_pressure_path);
r = cg_get_path("/", "memory.pressure", &memory_pressure_path);
if (r < 0) { if (r < 0) {
*exit_status = EXIT_MEMORY; *exit_status = EXIT_MEMORY;
return log_oom(); return log_oom();
} }
}
} else if (cgroup_context->memory_pressure_watch == CGROUP_PRESSURE_WATCH_NO) { r = chmod_and_chown(pressure_path[t], 0644, uid, gid);
memory_pressure_path = strdup("/dev/null"); /* /dev/null is explicit indicator for turning of memory pressure watch */ if (r < 0) {
if (!memory_pressure_path) { log_full_errno(r == -ENOENT || ERRNO_IS_PRIVILEGE(r) ? LOG_DEBUG : LOG_WARNING, r,
*exit_status = EXIT_MEMORY; "Failed to adjust ownership of '%s', ignoring: %m", pressure_path[t]);
return log_oom(); pressure_path[t] = mfree(pressure_path[t]);
}
/* First we use the current cgroup path to chmod and chown the pressure path, then pass the
* path relative to the cgroup namespace to environment variables and mounts. If chown/chmod
* fails, we should not pass pressure path environment variable or read-write mount to the
* unit. This is why we check if pressure_path[t] != NULL in the conditional below. */
if (pressure_path[t] && needs_sandboxing && exec_needs_cgroup_namespace(context)) {
pressure_path[t] = mfree(pressure_path[t]);
r = cg_get_path("/", pressure_file, &pressure_path[t]);
if (r < 0) {
*exit_status = EXIT_MEMORY;
return log_oom();
}
}
} else if (cgroup_context->pressure[t].watch == CGROUP_PRESSURE_WATCH_NO) {
pressure_path[t] = strdup("/dev/null"); /* /dev/null is explicit indicator for turning off pressure watch */
if (!pressure_path[t]) {
*exit_status = EXIT_MEMORY;
return log_oom();
}
} }
} }
} }
@ -5829,7 +5867,7 @@ int exec_invoke(
shell, shell,
journal_stream_dev, journal_stream_dev,
journal_stream_ino, journal_stream_ino,
memory_pressure_path, pressure_path,
needs_sandboxing, needs_sandboxing,
&our_env); &our_env);
if (r < 0) { if (r < 0) {
@ -6047,7 +6085,7 @@ int exec_invoke(
runtime, runtime,
&rootfs, &rootfs,
/* delegate= */ false, /* delegate= */ false,
memory_pressure_path, pressure_path,
uid, uid,
gid, gid,
command, command,
@ -6144,7 +6182,7 @@ int exec_invoke(
runtime, runtime,
&rootfs, &rootfs,
/* delegate= */ true, /* delegate= */ true,
memory_pressure_path, pressure_path,
uid, uid,
gid, gid,
command, command,

@ -279,7 +279,15 @@ static int exec_cgroup_context_serialize(const CGroupContext *c, FILE *f) {
if (r < 0) if (r < 0)
return r; return r;
r = serialize_item(f, "exec-cgroup-context-memory-pressure-watch", cgroup_pressure_watch_to_string(c->memory_pressure_watch)); r = serialize_item(f, "exec-cgroup-context-memory-pressure-watch", cgroup_pressure_watch_to_string(c->pressure[PRESSURE_MEMORY].watch));
if (r < 0)
return r;
r = serialize_item(f, "exec-cgroup-context-cpu-pressure-watch", cgroup_pressure_watch_to_string(c->pressure[PRESSURE_CPU].watch));
if (r < 0)
return r;
r = serialize_item(f, "exec-cgroup-context-io-pressure-watch", cgroup_pressure_watch_to_string(c->pressure[PRESSURE_IO].watch));
if (r < 0) if (r < 0)
return r; return r;
@ -287,8 +295,20 @@ static int exec_cgroup_context_serialize(const CGroupContext *c, FILE *f) {
if (r < 0) if (r < 0)
return r; return r;
if (c->memory_pressure_threshold_usec != USEC_INFINITY) { if (c->pressure[PRESSURE_MEMORY].threshold_usec != USEC_INFINITY) {
r = serialize_usec(f, "exec-cgroup-context-memory-pressure-threshold-usec", c->memory_pressure_threshold_usec); r = serialize_usec(f, "exec-cgroup-context-memory-pressure-threshold-usec", c->pressure[PRESSURE_MEMORY].threshold_usec);
if (r < 0)
return r;
}
if (c->pressure[PRESSURE_CPU].threshold_usec != USEC_INFINITY) {
r = serialize_usec(f, "exec-cgroup-context-cpu-pressure-threshold-usec", c->pressure[PRESSURE_CPU].threshold_usec);
if (r < 0)
return r;
}
if (c->pressure[PRESSURE_IO].threshold_usec != USEC_INFINITY) {
r = serialize_usec(f, "exec-cgroup-context-io-pressure-threshold-usec", c->pressure[PRESSURE_IO].threshold_usec);
if (r < 0) if (r < 0)
return r; return r;
} }
@ -621,15 +641,31 @@ static int exec_cgroup_context_deserialize(CGroupContext *c, FILE *f) {
if (r < 0) if (r < 0)
return r; return r;
} else if ((val = startswith(l, "exec-cgroup-context-memory-pressure-watch="))) { } else if ((val = startswith(l, "exec-cgroup-context-memory-pressure-watch="))) {
c->memory_pressure_watch = cgroup_pressure_watch_from_string(val); c->pressure[PRESSURE_MEMORY].watch = cgroup_pressure_watch_from_string(val);
if (c->memory_pressure_watch < 0) if (c->pressure[PRESSURE_MEMORY].watch < 0)
return -EINVAL;
} else if ((val = startswith(l, "exec-cgroup-context-cpu-pressure-watch="))) {
c->pressure[PRESSURE_CPU].watch = cgroup_pressure_watch_from_string(val);
if (c->pressure[PRESSURE_CPU].watch < 0)
return -EINVAL;
} else if ((val = startswith(l, "exec-cgroup-context-io-pressure-watch="))) {
c->pressure[PRESSURE_IO].watch = cgroup_pressure_watch_from_string(val);
if (c->pressure[PRESSURE_IO].watch < 0)
return -EINVAL; return -EINVAL;
} else if ((val = startswith(l, "exec-cgroup-context-delegate-subgroup="))) { } else if ((val = startswith(l, "exec-cgroup-context-delegate-subgroup="))) {
r = free_and_strdup(&c->delegate_subgroup, val); r = free_and_strdup(&c->delegate_subgroup, val);
if (r < 0) if (r < 0)
return r; return r;
} else if ((val = startswith(l, "exec-cgroup-context-memory-pressure-threshold-usec="))) { } else if ((val = startswith(l, "exec-cgroup-context-memory-pressure-threshold-usec="))) {
r = deserialize_usec(val, &c->memory_pressure_threshold_usec); r = deserialize_usec(val, &c->pressure[PRESSURE_MEMORY].threshold_usec);
if (r < 0)
return r;
} else if ((val = startswith(l, "exec-cgroup-context-cpu-pressure-threshold-usec="))) {
r = deserialize_usec(val, &c->pressure[PRESSURE_CPU].threshold_usec);
if (r < 0)
return r;
} else if ((val = startswith(l, "exec-cgroup-context-io-pressure-threshold-usec="))) {
r = deserialize_usec(val, &c->pressure[PRESSURE_IO].threshold_usec);
if (r < 0) if (r < 0)
return r; return r;
} else if ((val = startswith(l, "exec-cgroup-context-device-allow="))) { } else if ((val = startswith(l, "exec-cgroup-context-device-allow="))) {
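
The serialization keys in the hunks above follow one pattern, `exec-cgroup-context-<resource>-pressure-<field>=`, with the resource being `memory`, `cpu`, or `io`. As a hedged illustration of that pattern (not the approach the commit takes, which spells out each key), a matcher could build the key from its parts and return the value that follows it. `match_pressure_key()` is a hypothetical helper written for this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* If `line` starts with "exec-cgroup-context-<resource>-pressure-<field>=",
 * returns a pointer to the value after '='; otherwise returns NULL. */
static const char *match_pressure_key(const char *line,
                                      const char *resource,
                                      const char *field) {
    char key[128];

    assert(line);
    assert(resource);
    assert(field);

    int n = snprintf(key, sizeof key,
                     "exec-cgroup-context-%s-pressure-%s=", resource, field);
    if (n < 0 || (size_t) n >= sizeof key)
        return NULL; /* key would not fit, treat as no match */

    if (strncmp(line, key, (size_t) n) != 0)
        return NULL;

    return line + n;
}
```

A deserializer could then loop over the three resource names and the two fields (`watch`, `threshold-usec`) instead of listing six `startswith()` branches, at the cost of making the literal keys harder to grep for.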

@ -276,8 +276,12 @@
{{type}}.SocketBindAllow, config_parse_cgroup_socket_bind, 0, offsetof({{type}}, cgroup_context.socket_bind_allow) {{type}}.SocketBindAllow, config_parse_cgroup_socket_bind, 0, offsetof({{type}}, cgroup_context.socket_bind_allow)
{{type}}.SocketBindDeny, config_parse_cgroup_socket_bind, 0, offsetof({{type}}, cgroup_context.socket_bind_deny) {{type}}.SocketBindDeny, config_parse_cgroup_socket_bind, 0, offsetof({{type}}, cgroup_context.socket_bind_deny)
{{type}}.RestrictNetworkInterfaces, config_parse_restrict_network_interfaces, 0, offsetof({{type}}, cgroup_context) {{type}}.RestrictNetworkInterfaces, config_parse_restrict_network_interfaces, 0, offsetof({{type}}, cgroup_context)
{{type}}.MemoryPressureThresholdSec, config_parse_sec, 0, offsetof({{type}}, cgroup_context.memory_pressure_threshold_usec) {{type}}.MemoryPressureThresholdSec, config_parse_sec, 0, offsetof({{type}}, cgroup_context.pressure[PRESSURE_MEMORY].threshold_usec)
{{type}}.MemoryPressureWatch, config_parse_memory_pressure_watch, 0, offsetof({{type}}, cgroup_context.memory_pressure_watch) {{type}}.MemoryPressureWatch, config_parse_pressure_watch, 0, offsetof({{type}}, cgroup_context.pressure[PRESSURE_MEMORY].watch)
{{type}}.CPUPressureThresholdSec, config_parse_sec, 0, offsetof({{type}}, cgroup_context.pressure[PRESSURE_CPU].threshold_usec)
{{type}}.CPUPressureWatch, config_parse_pressure_watch, 0, offsetof({{type}}, cgroup_context.pressure[PRESSURE_CPU].watch)
{{type}}.IOPressureThresholdSec, config_parse_sec, 0, offsetof({{type}}, cgroup_context.pressure[PRESSURE_IO].threshold_usec)
{{type}}.IOPressureWatch, config_parse_pressure_watch, 0, offsetof({{type}}, cgroup_context.pressure[PRESSURE_IO].watch)
{{type}}.NFTSet, config_parse_cgroup_nft_set, NFT_SET_PARSE_CGROUP, offsetof({{type}}, cgroup_context) {{type}}.NFTSet, config_parse_cgroup_nft_set, NFT_SET_PARSE_CGROUP, offsetof({{type}}, cgroup_context)
{{type}}.CoredumpReceive, config_parse_bool, 0, offsetof({{type}}, cgroup_context.coredump_receive) {{type}}.CoredumpReceive, config_parse_bool, 0, offsetof({{type}}, cgroup_context.coredump_receive)
{{type}}.BindNetworkInterface, config_parse_bind_network_interface, 0, offsetof({{type}}, cgroup_context) {{type}}.BindNetworkInterface, config_parse_bind_network_interface, 0, offsetof({{type}}, cgroup_context)

@ -154,7 +154,7 @@ DEFINE_CONFIG_PARSE_ENUM(config_parse_service_timeout_failure_mode, service_time
DEFINE_CONFIG_PARSE_ENUM(config_parse_socket_bind, socket_address_bind_ipv6_only_or_bool, SocketAddressBindIPv6Only); DEFINE_CONFIG_PARSE_ENUM(config_parse_socket_bind, socket_address_bind_ipv6_only_or_bool, SocketAddressBindIPv6Only);
DEFINE_CONFIG_PARSE_ENUM(config_parse_oom_policy, oom_policy, OOMPolicy); DEFINE_CONFIG_PARSE_ENUM(config_parse_oom_policy, oom_policy, OOMPolicy);
DEFINE_CONFIG_PARSE_ENUM(config_parse_managed_oom_preference, managed_oom_preference, ManagedOOMPreference); DEFINE_CONFIG_PARSE_ENUM(config_parse_managed_oom_preference, managed_oom_preference, ManagedOOMPreference);
DEFINE_CONFIG_PARSE_ENUM(config_parse_memory_pressure_watch, cgroup_pressure_watch, CGroupPressureWatch); DEFINE_CONFIG_PARSE_ENUM(config_parse_pressure_watch, cgroup_pressure_watch, CGroupPressureWatch);
DEFINE_CONFIG_PARSE_ENUM_WITH_DEFAULT(config_parse_ip_tos, ip_tos, int, -1); DEFINE_CONFIG_PARSE_ENUM_WITH_DEFAULT(config_parse_ip_tos, ip_tos, int, -1);
DEFINE_CONFIG_PARSE_PTR(config_parse_cg_weight, cg_weight_parse, uint64_t); DEFINE_CONFIG_PARSE_PTR(config_parse_cg_weight, cg_weight_parse, uint64_t);
DEFINE_CONFIG_PARSE_PTR(config_parse_cg_cpu_weight, cg_cpu_weight_parse, uint64_t); DEFINE_CONFIG_PARSE_PTR(config_parse_cg_cpu_weight, cg_cpu_weight_parse, uint64_t);

@ -164,7 +164,7 @@ CONFIG_PARSER_PROTOTYPE(config_parse_watchdog_sec);
CONFIG_PARSER_PROTOTYPE(config_parse_tty_size); CONFIG_PARSER_PROTOTYPE(config_parse_tty_size);
CONFIG_PARSER_PROTOTYPE(config_parse_log_filter_patterns); CONFIG_PARSER_PROTOTYPE(config_parse_log_filter_patterns);
CONFIG_PARSER_PROTOTYPE(config_parse_open_file); CONFIG_PARSER_PROTOTYPE(config_parse_open_file);
CONFIG_PARSER_PROTOTYPE(config_parse_memory_pressure_watch); CONFIG_PARSER_PROTOTYPE(config_parse_pressure_watch);
CONFIG_PARSER_PROTOTYPE(config_parse_cgroup_nft_set); CONFIG_PARSER_PROTOTYPE(config_parse_cgroup_nft_set);
CONFIG_PARSER_PROTOTYPE(config_parse_mount_node); CONFIG_PARSER_PROTOTYPE(config_parse_mount_node);
CONFIG_PARSER_PROTOTYPE(config_parse_concurrency_max); CONFIG_PARSER_PROTOTYPE(config_parse_concurrency_max);


@ -747,89 +747,93 @@ static int config_parse_crash_reboot(
static int parse_config_file(void) { static int parse_config_file(void) {
const ConfigTableItem items[] = { const ConfigTableItem items[] = {
{ "Manager", "LogLevel", config_parse_level2, 0, NULL }, { "Manager", "LogLevel", config_parse_level2, 0, NULL },
{ "Manager", "LogTarget", config_parse_target, 0, NULL }, { "Manager", "LogTarget", config_parse_target, 0, NULL },
{ "Manager", "LogColor", config_parse_color, 0, NULL }, { "Manager", "LogColor", config_parse_color, 0, NULL },
{ "Manager", "LogLocation", config_parse_location, 0, NULL }, { "Manager", "LogLocation", config_parse_location, 0, NULL },
{ "Manager", "LogTime", config_parse_time, 0, NULL }, { "Manager", "LogTime", config_parse_time, 0, NULL },
{ "Manager", "DumpCore", config_parse_bool, 0, &arg_dump_core }, { "Manager", "DumpCore", config_parse_bool, 0, &arg_dump_core },
{ "Manager", "CrashChVT", /* legacy */ config_parse_crash_chvt, 0, &arg_crash_chvt }, { "Manager", "CrashChVT", /* legacy */ config_parse_crash_chvt, 0, &arg_crash_chvt },
{ "Manager", "CrashChangeVT", config_parse_crash_chvt, 0, &arg_crash_chvt }, { "Manager", "CrashChangeVT", config_parse_crash_chvt, 0, &arg_crash_chvt },
{ "Manager", "CrashShell", config_parse_bool, 0, &arg_crash_shell }, { "Manager", "CrashShell", config_parse_bool, 0, &arg_crash_shell },
{ "Manager", "CrashReboot", config_parse_crash_reboot, 0, &arg_crash_action }, { "Manager", "CrashReboot", config_parse_crash_reboot, 0, &arg_crash_action },
{ "Manager", "CrashAction", config_parse_crash_action, 0, &arg_crash_action }, { "Manager", "CrashAction", config_parse_crash_action, 0, &arg_crash_action },
{ "Manager", "ShowStatus", config_parse_show_status, 0, &arg_show_status }, { "Manager", "ShowStatus", config_parse_show_status, 0, &arg_show_status },
{ "Manager", "StatusUnitFormat", config_parse_status_unit_format, 0, &arg_status_unit_format }, { "Manager", "StatusUnitFormat", config_parse_status_unit_format, 0, &arg_status_unit_format },
{ "Manager", "CPUAffinity", config_parse_cpu_set, 0, &arg_cpu_affinity }, { "Manager", "CPUAffinity", config_parse_cpu_set, 0, &arg_cpu_affinity },
{ "Manager", "NUMAPolicy", config_parse_numa_policy, 0, &arg_numa_policy.type }, { "Manager", "NUMAPolicy", config_parse_numa_policy, 0, &arg_numa_policy.type },
{ "Manager", "NUMAMask", config_parse_numa_mask, 0, &arg_numa_policy.nodes }, { "Manager", "NUMAMask", config_parse_numa_mask, 0, &arg_numa_policy.nodes },
{ "Manager", "JoinControllers", config_parse_warn_compat, DISABLED_LEGACY, NULL }, { "Manager", "JoinControllers", config_parse_warn_compat, DISABLED_LEGACY, NULL },
{ "Manager", "RuntimeWatchdogSec", config_parse_watchdog_sec, 0, &arg_runtime_watchdog }, { "Manager", "RuntimeWatchdogSec", config_parse_watchdog_sec, 0, &arg_runtime_watchdog },
{ "Manager", "RuntimeWatchdogPreSec", config_parse_watchdog_sec, 0, &arg_pretimeout_watchdog }, { "Manager", "RuntimeWatchdogPreSec", config_parse_watchdog_sec, 0, &arg_pretimeout_watchdog },
{ "Manager", "RebootWatchdogSec", config_parse_watchdog_sec, 0, &arg_reboot_watchdog }, { "Manager", "RebootWatchdogSec", config_parse_watchdog_sec, 0, &arg_reboot_watchdog },
{ "Manager", "ShutdownWatchdogSec", config_parse_watchdog_sec, 0, &arg_reboot_watchdog }, /* obsolete alias */ { "Manager", "ShutdownWatchdogSec", config_parse_watchdog_sec, 0, &arg_reboot_watchdog }, /* obsolete alias */
{ "Manager", "KExecWatchdogSec", config_parse_watchdog_sec, 0, &arg_kexec_watchdog }, { "Manager", "KExecWatchdogSec", config_parse_watchdog_sec, 0, &arg_kexec_watchdog },
{ "Manager", "WatchdogDevice", config_parse_path, 0, &arg_watchdog_device }, { "Manager", "WatchdogDevice", config_parse_path, 0, &arg_watchdog_device },
{ "Manager", "RuntimeWatchdogPreGovernor", config_parse_string, CONFIG_PARSE_STRING_SAFE, &arg_watchdog_pretimeout_governor }, { "Manager", "RuntimeWatchdogPreGovernor", config_parse_string, CONFIG_PARSE_STRING_SAFE, &arg_watchdog_pretimeout_governor },
{ "Manager", "CapabilityBoundingSet", config_parse_capability_set, 0, &arg_capability_bounding_set }, { "Manager", "CapabilityBoundingSet", config_parse_capability_set, 0, &arg_capability_bounding_set },
{ "Manager", "NoNewPrivileges", config_parse_bool, 0, &arg_no_new_privs }, { "Manager", "NoNewPrivileges", config_parse_bool, 0, &arg_no_new_privs },
{ "Manager", "ProtectSystem", config_parse_protect_system_pid1, 0, &arg_protect_system }, { "Manager", "ProtectSystem", config_parse_protect_system_pid1, 0, &arg_protect_system },
#if HAVE_SECCOMP #if HAVE_SECCOMP
{ "Manager", "SystemCallArchitectures", config_parse_syscall_archs, 0, &arg_syscall_archs }, { "Manager", "SystemCallArchitectures", config_parse_syscall_archs, 0, &arg_syscall_archs },
#else #else
{ "Manager", "SystemCallArchitectures", config_parse_warn_compat, DISABLED_CONFIGURATION, NULL }, { "Manager", "SystemCallArchitectures", config_parse_warn_compat, DISABLED_CONFIGURATION, NULL },
#endif #endif
{ "Manager", "TimerSlackNSec", config_parse_nsec, 0, &arg_timer_slack_nsec }, { "Manager", "TimerSlackNSec", config_parse_nsec, 0, &arg_timer_slack_nsec },
{ "Manager", "DefaultTimerAccuracySec", config_parse_sec, 0, &arg_defaults.timer_accuracy_usec }, { "Manager", "DefaultTimerAccuracySec", config_parse_sec, 0, &arg_defaults.timer_accuracy_usec },
{ "Manager", "DefaultStandardOutput", config_parse_output_restricted, 0, &arg_defaults.std_output }, { "Manager", "DefaultStandardOutput", config_parse_output_restricted, 0, &arg_defaults.std_output },
{ "Manager", "DefaultStandardError", config_parse_output_restricted, 0, &arg_defaults.std_error }, { "Manager", "DefaultStandardError", config_parse_output_restricted, 0, &arg_defaults.std_error },
{ "Manager", "DefaultTimeoutStartSec", config_parse_sec, 0, &arg_defaults.timeout_start_usec }, { "Manager", "DefaultTimeoutStartSec", config_parse_sec, 0, &arg_defaults.timeout_start_usec },
{ "Manager", "DefaultTimeoutStopSec", config_parse_sec, 0, &arg_defaults.timeout_stop_usec }, { "Manager", "DefaultTimeoutStopSec", config_parse_sec, 0, &arg_defaults.timeout_stop_usec },
{ "Manager", "DefaultTimeoutAbortSec", config_parse_default_timeout_abort, 0, NULL }, { "Manager", "DefaultTimeoutAbortSec", config_parse_default_timeout_abort, 0, NULL },
{ "Manager", "DefaultDeviceTimeoutSec", config_parse_sec, 0, &arg_defaults.device_timeout_usec }, { "Manager", "DefaultDeviceTimeoutSec", config_parse_sec, 0, &arg_defaults.device_timeout_usec },
{ "Manager", "DefaultRestartSec", config_parse_sec, 0, &arg_defaults.restart_usec }, { "Manager", "DefaultRestartSec", config_parse_sec, 0, &arg_defaults.restart_usec },
{ "Manager", "DefaultStartLimitInterval", config_parse_sec, 0, &arg_defaults.start_limit.interval }, /* obsolete alias */ { "Manager", "DefaultStartLimitInterval", config_parse_sec, 0, &arg_defaults.start_limit.interval }, /* obsolete alias */
{ "Manager", "DefaultStartLimitIntervalSec", config_parse_sec, 0, &arg_defaults.start_limit.interval }, { "Manager", "DefaultStartLimitIntervalSec", config_parse_sec, 0, &arg_defaults.start_limit.interval },
{ "Manager", "DefaultStartLimitBurst", config_parse_unsigned, 0, &arg_defaults.start_limit.burst }, { "Manager", "DefaultStartLimitBurst", config_parse_unsigned, 0, &arg_defaults.start_limit.burst },
{ "Manager", "DefaultRestrictSUIDSGID", config_parse_bool, 0, &arg_defaults.restrict_suid_sgid }, { "Manager", "DefaultRestrictSUIDSGID", config_parse_bool, 0, &arg_defaults.restrict_suid_sgid },
{ "Manager", "DefaultEnvironment", config_parse_environ, arg_runtime_scope, &arg_default_environment }, { "Manager", "DefaultEnvironment", config_parse_environ, arg_runtime_scope, &arg_default_environment },
{ "Manager", "ManagerEnvironment", config_parse_environ, arg_runtime_scope, &arg_manager_environment }, { "Manager", "ManagerEnvironment", config_parse_environ, arg_runtime_scope, &arg_manager_environment },
{ "Manager", "DefaultLimitCPU", config_parse_rlimit, RLIMIT_CPU, arg_defaults.rlimit }, { "Manager", "DefaultLimitCPU", config_parse_rlimit, RLIMIT_CPU, arg_defaults.rlimit },
{ "Manager", "DefaultLimitFSIZE", config_parse_rlimit, RLIMIT_FSIZE, arg_defaults.rlimit }, { "Manager", "DefaultLimitFSIZE", config_parse_rlimit, RLIMIT_FSIZE, arg_defaults.rlimit },
{ "Manager", "DefaultLimitDATA", config_parse_rlimit, RLIMIT_DATA, arg_defaults.rlimit }, { "Manager", "DefaultLimitDATA", config_parse_rlimit, RLIMIT_DATA, arg_defaults.rlimit },
{ "Manager", "DefaultLimitSTACK", config_parse_rlimit, RLIMIT_STACK, arg_defaults.rlimit }, { "Manager", "DefaultLimitSTACK", config_parse_rlimit, RLIMIT_STACK, arg_defaults.rlimit },
{ "Manager", "DefaultLimitCORE", config_parse_rlimit, RLIMIT_CORE, arg_defaults.rlimit }, { "Manager", "DefaultLimitCORE", config_parse_rlimit, RLIMIT_CORE, arg_defaults.rlimit },
{ "Manager", "DefaultLimitRSS", config_parse_rlimit, RLIMIT_RSS, arg_defaults.rlimit }, { "Manager", "DefaultLimitRSS", config_parse_rlimit, RLIMIT_RSS, arg_defaults.rlimit },
{ "Manager", "DefaultLimitNOFILE", config_parse_rlimit, RLIMIT_NOFILE, arg_defaults.rlimit }, { "Manager", "DefaultLimitNOFILE", config_parse_rlimit, RLIMIT_NOFILE, arg_defaults.rlimit },
{ "Manager", "DefaultLimitAS", config_parse_rlimit, RLIMIT_AS, arg_defaults.rlimit }, { "Manager", "DefaultLimitAS", config_parse_rlimit, RLIMIT_AS, arg_defaults.rlimit },
{ "Manager", "DefaultLimitNPROC", config_parse_rlimit, RLIMIT_NPROC, arg_defaults.rlimit }, { "Manager", "DefaultLimitNPROC", config_parse_rlimit, RLIMIT_NPROC, arg_defaults.rlimit },
{ "Manager", "DefaultLimitMEMLOCK", config_parse_rlimit, RLIMIT_MEMLOCK, arg_defaults.rlimit }, { "Manager", "DefaultLimitMEMLOCK", config_parse_rlimit, RLIMIT_MEMLOCK, arg_defaults.rlimit },
{ "Manager", "DefaultLimitLOCKS", config_parse_rlimit, RLIMIT_LOCKS, arg_defaults.rlimit }, { "Manager", "DefaultLimitLOCKS", config_parse_rlimit, RLIMIT_LOCKS, arg_defaults.rlimit },
{ "Manager", "DefaultLimitSIGPENDING", config_parse_rlimit, RLIMIT_SIGPENDING, arg_defaults.rlimit }, { "Manager", "DefaultLimitSIGPENDING", config_parse_rlimit, RLIMIT_SIGPENDING, arg_defaults.rlimit },
{ "Manager", "DefaultLimitMSGQUEUE", config_parse_rlimit, RLIMIT_MSGQUEUE, arg_defaults.rlimit }, { "Manager", "DefaultLimitMSGQUEUE", config_parse_rlimit, RLIMIT_MSGQUEUE, arg_defaults.rlimit },
{ "Manager", "DefaultLimitNICE", config_parse_rlimit, RLIMIT_NICE, arg_defaults.rlimit }, { "Manager", "DefaultLimitNICE", config_parse_rlimit, RLIMIT_NICE, arg_defaults.rlimit },
{ "Manager", "DefaultLimitRTPRIO", config_parse_rlimit, RLIMIT_RTPRIO, arg_defaults.rlimit }, { "Manager", "DefaultLimitRTPRIO", config_parse_rlimit, RLIMIT_RTPRIO, arg_defaults.rlimit },
{ "Manager", "DefaultLimitRTTIME", config_parse_rlimit, RLIMIT_RTTIME, arg_defaults.rlimit }, { "Manager", "DefaultLimitRTTIME", config_parse_rlimit, RLIMIT_RTTIME, arg_defaults.rlimit },
{ "Manager", "DefaultCPUAccounting", config_parse_warn_compat, DISABLED_LEGACY, NULL }, { "Manager", "DefaultCPUAccounting", config_parse_warn_compat, DISABLED_LEGACY, NULL },
{ "Manager", "DefaultIOAccounting", config_parse_bool, 0, &arg_defaults.io_accounting }, { "Manager", "DefaultIOAccounting", config_parse_bool, 0, &arg_defaults.io_accounting },
{ "Manager", "DefaultIPAccounting", config_parse_bool, 0, &arg_defaults.ip_accounting }, { "Manager", "DefaultIPAccounting", config_parse_bool, 0, &arg_defaults.ip_accounting },
{ "Manager", "DefaultBlockIOAccounting", config_parse_warn_compat, DISABLED_LEGACY, NULL }, { "Manager", "DefaultBlockIOAccounting", config_parse_warn_compat, DISABLED_LEGACY, NULL },
{ "Manager", "DefaultMemoryAccounting", config_parse_bool, 0, &arg_defaults.memory_accounting }, { "Manager", "DefaultMemoryAccounting", config_parse_bool, 0, &arg_defaults.memory_accounting },
{ "Manager", "DefaultTasksAccounting", config_parse_bool, 0, &arg_defaults.tasks_accounting }, { "Manager", "DefaultTasksAccounting", config_parse_bool, 0, &arg_defaults.tasks_accounting },
{ "Manager", "DefaultTasksMax", config_parse_tasks_max, 0, &arg_defaults.tasks_max }, { "Manager", "DefaultTasksMax", config_parse_tasks_max, 0, &arg_defaults.tasks_max },
{ "Manager", "DefaultMemoryPressureThresholdSec", config_parse_sec, 0, &arg_defaults.memory_pressure_threshold_usec }, { "Manager", "DefaultMemoryPressureThresholdSec", config_parse_sec, 0, &arg_defaults.pressure[PRESSURE_MEMORY].threshold_usec },
{ "Manager", "DefaultMemoryPressureWatch", config_parse_memory_pressure_watch, 0, &arg_defaults.memory_pressure_watch }, { "Manager", "DefaultMemoryPressureWatch", config_parse_pressure_watch, 0, &arg_defaults.pressure[PRESSURE_MEMORY].watch },
{ "Manager", "CtrlAltDelBurstAction", config_parse_emergency_action, arg_runtime_scope, &arg_cad_burst_action }, { "Manager", "DefaultCPUPressureThresholdSec", config_parse_sec, 0, &arg_defaults.pressure[PRESSURE_CPU].threshold_usec },
{ "Manager", "DefaultOOMPolicy", config_parse_oom_policy, 0, &arg_defaults.oom_policy }, { "Manager", "DefaultCPUPressureWatch", config_parse_pressure_watch, 0, &arg_defaults.pressure[PRESSURE_CPU].watch },
{ "Manager", "DefaultOOMScoreAdjust", config_parse_oom_score_adjust, 0, NULL }, { "Manager", "DefaultIOPressureThresholdSec", config_parse_sec, 0, &arg_defaults.pressure[PRESSURE_IO].threshold_usec },
{ "Manager", "ReloadLimitIntervalSec", config_parse_sec, 0, &arg_reload_limit_interval_sec }, { "Manager", "DefaultIOPressureWatch", config_parse_pressure_watch, 0, &arg_defaults.pressure[PRESSURE_IO].watch },
{ "Manager", "ReloadLimitBurst", config_parse_unsigned, 0, &arg_reload_limit_burst }, { "Manager", "CtrlAltDelBurstAction", config_parse_emergency_action, arg_runtime_scope, &arg_cad_burst_action },
{ "Manager", "DefaultMemoryZSwapWriteback", config_parse_bool, 0, &arg_defaults.memory_zswap_writeback }, { "Manager", "DefaultOOMPolicy", config_parse_oom_policy, 0, &arg_defaults.oom_policy },
{ "Manager", "MinimumUptimeSec", config_parse_sec, 0, &arg_minimum_uptime_usec }, { "Manager", "DefaultOOMScoreAdjust", config_parse_oom_score_adjust, 0, NULL },
{ "Manager", "ReloadLimitIntervalSec", config_parse_sec, 0, &arg_reload_limit_interval_sec },
{ "Manager", "ReloadLimitBurst", config_parse_unsigned, 0, &arg_reload_limit_burst },
{ "Manager", "DefaultMemoryZSwapWriteback", config_parse_bool, 0, &arg_defaults.memory_zswap_writeback },
{ "Manager", "MinimumUptimeSec", config_parse_sec, 0, &arg_minimum_uptime_usec },
#if ENABLE_SMACK #if ENABLE_SMACK
{ "Manager", "DefaultSmackProcessLabel", config_parse_string, 0, &arg_defaults.smack_process_label }, { "Manager", "DefaultSmackProcessLabel", config_parse_string, 0, &arg_defaults.smack_process_label },
#else #else
{ "Manager", "DefaultSmackProcessLabel", config_parse_warn_compat, DISABLED_CONFIGURATION, NULL }, { "Manager", "DefaultSmackProcessLabel", config_parse_warn_compat, DISABLED_CONFIGURATION, NULL },
#endif #endif
{} {}
}; };


@ -616,9 +616,13 @@ static char** sanitize_environment(char **l) {
l, l,
"CACHE_DIRECTORY", "CACHE_DIRECTORY",
"CONFIGURATION_DIRECTORY", "CONFIGURATION_DIRECTORY",
"CPU_PRESSURE_WATCH",
"CPU_PRESSURE_WRITE",
"CREDENTIALS_DIRECTORY", "CREDENTIALS_DIRECTORY",
"EXIT_CODE", "EXIT_CODE",
"EXIT_STATUS", "EXIT_STATUS",
"IO_PRESSURE_WATCH",
"IO_PRESSURE_WRITE",
"INVOCATION_ID", "INVOCATION_ID",
"JOURNAL_STREAM", "JOURNAL_STREAM",
"LISTEN_FDNAMES", "LISTEN_FDNAMES",
@ -796,26 +800,38 @@ static int manager_setup_sigchld_event_source(Manager *m) {
return 0; return 0;
} }
int manager_setup_memory_pressure_event_source(Manager *m) { typedef int (*pressure_add_t)(sd_event *, sd_event_source **, sd_event_handler_t, void *);
typedef int (*pressure_set_period_t)(sd_event_source *, usec_t, usec_t);
static const struct {
pressure_add_t add;
pressure_set_period_t set_period;
} pressure_dispatch_table[_PRESSURE_RESOURCE_MAX] = {
[PRESSURE_MEMORY] = { sd_event_add_memory_pressure, sd_event_source_set_memory_pressure_period },
[PRESSURE_CPU] = { sd_event_add_cpu_pressure, sd_event_source_set_cpu_pressure_period },
[PRESSURE_IO] = { sd_event_add_io_pressure, sd_event_source_set_io_pressure_period },
};
int manager_setup_pressure_event_source(Manager *m, PressureResource t) {
int r; int r;
assert(m); assert(m);
assert(t >= 0 && t < _PRESSURE_RESOURCE_MAX);
m->memory_pressure_event_source = sd_event_source_disable_unref(m->memory_pressure_event_source); m->pressure_event_source[t] = sd_event_source_disable_unref(m->pressure_event_source[t]);
r = sd_event_add_memory_pressure(m->event, &m->memory_pressure_event_source, NULL, NULL); r = pressure_dispatch_table[t].add(m->event, &m->pressure_event_source[t], NULL, NULL);
if (r < 0) if (r < 0)
log_full_errno(ERRNO_IS_NOT_SUPPORTED(r) || ERRNO_IS_PRIVILEGE(r) || (r == -EHOSTDOWN) ? LOG_DEBUG : LOG_NOTICE, r, log_full_errno(ERRNO_IS_NOT_SUPPORTED(r) || ERRNO_IS_PRIVILEGE(r) || (r == -EHOSTDOWN) ? LOG_DEBUG : LOG_NOTICE, r,
"Failed to establish memory pressure event source, ignoring: %m"); "Failed to establish %s pressure event source, ignoring: %m", pressure_resource_to_string(t));
else if (m->defaults.memory_pressure_threshold_usec != USEC_INFINITY) { else if (m->defaults.pressure[t].threshold_usec != USEC_INFINITY) {
/* If there's a default memory pressure threshold set, also apply it to the service manager itself */ r = pressure_dispatch_table[t].set_period(
r = sd_event_source_set_memory_pressure_period( m->pressure_event_source[t],
m->memory_pressure_event_source, m->defaults.pressure[t].threshold_usec,
m->defaults.memory_pressure_threshold_usec, PRESSURE_DEFAULT_WINDOW_USEC);
MEMORY_PRESSURE_DEFAULT_WINDOW_USEC);
if (r < 0) if (r < 0)
log_warning_errno(r, "Failed to adjust memory pressure threshold, ignoring: %m"); log_warning_errno(r, "Failed to adjust %s pressure threshold, ignoring: %m", pressure_resource_to_string(t));
} }
return 0; return 0;
@ -1001,9 +1017,11 @@ int manager_new(RuntimeScope runtime_scope, ManagerTestRunFlags test_run_flags,
if (r < 0) if (r < 0)
return r; return r;
r = manager_setup_memory_pressure_event_source(m); for (PressureResource t = 0; t < _PRESSURE_RESOURCE_MAX; t++) {
if (r < 0) r = manager_setup_pressure_event_source(m, t);
return r; if (r < 0)
return r;
}
#if HAVE_LIBBPF #if HAVE_LIBBPF
if (MANAGER_IS_SYSTEM(m) && bpf_restrict_fs_supported(/* initialize= */ true)) { if (MANAGER_IS_SYSTEM(m) && bpf_restrict_fs_supported(/* initialize= */ true)) {
@ -1711,7 +1729,8 @@ Manager* manager_free(Manager *m) {
sd_event_source_unref(m->user_lookup_event_source); sd_event_source_unref(m->user_lookup_event_source);
sd_event_source_unref(m->handoff_timestamp_event_source); sd_event_source_unref(m->handoff_timestamp_event_source);
sd_event_source_unref(m->pidref_event_source); sd_event_source_unref(m->pidref_event_source);
sd_event_source_unref(m->memory_pressure_event_source); FOREACH_ARRAY(pressure_event_source, m->pressure_event_source, _PRESSURE_RESOURCE_MAX)
sd_event_source_unref(*pressure_event_source);
safe_close(m->signal_fd); safe_close(m->signal_fd);
safe_close(m->notify_fd); safe_close(m->notify_fd);
@ -4300,8 +4319,7 @@ int manager_set_unit_defaults(Manager *m, const UnitDefaults *defaults) {
m->defaults.oom_score_adjust = defaults->oom_score_adjust; m->defaults.oom_score_adjust = defaults->oom_score_adjust;
m->defaults.oom_score_adjust_set = defaults->oom_score_adjust_set; m->defaults.oom_score_adjust_set = defaults->oom_score_adjust_set;
m->defaults.memory_pressure_watch = defaults->memory_pressure_watch; memcpy(m->defaults.pressure, defaults->pressure, sizeof(m->defaults.pressure));
m->defaults.memory_pressure_threshold_usec = defaults->memory_pressure_threshold_usec;
m->defaults.memory_zswap_writeback = defaults->memory_zswap_writeback; m->defaults.memory_zswap_writeback = defaults->memory_zswap_writeback;
@ -5195,8 +5213,11 @@ void unit_defaults_init(UnitDefaults *defaults, RuntimeScope scope) {
.tasks_max = DEFAULT_TASKS_MAX, .tasks_max = DEFAULT_TASKS_MAX,
.timer_accuracy_usec = 1 * USEC_PER_MINUTE, .timer_accuracy_usec = 1 * USEC_PER_MINUTE,
.memory_pressure_watch = CGROUP_PRESSURE_WATCH_AUTO, .pressure = {
.memory_pressure_threshold_usec = MEMORY_PRESSURE_DEFAULT_THRESHOLD_USEC, [PRESSURE_MEMORY] = { .watch = CGROUP_PRESSURE_WATCH_AUTO, .threshold_usec = PRESSURE_DEFAULT_THRESHOLD_USEC },
[PRESSURE_CPU] = { .watch = CGROUP_PRESSURE_WATCH_AUTO, .threshold_usec = PRESSURE_DEFAULT_THRESHOLD_USEC },
[PRESSURE_IO] = { .watch = CGROUP_PRESSURE_WATCH_AUTO, .threshold_usec = PRESSURE_DEFAULT_THRESHOLD_USEC },
},
.oom_policy = OOM_STOP, .oom_policy = OOM_STOP,
.oom_score_adjust_set = false, .oom_score_adjust_set = false,
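
The hunks above replace the memory-only setup function with one indexed by resource type, using a table of function pointers. A minimal, self-contained sketch of that pattern follows. All names in it (`Resource`, `Source`, `dispatch`, `setup_source`) are hypothetical stand-ins, not the sd-event API, and the callbacks only record what they were asked to do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the PressureResource enum in the diff. */
typedef enum {
    RES_MEMORY,
    RES_CPU,
    RES_IO,
    _RES_MAX,
} Resource;

/* Hypothetical stand-in for an event source; it only records calls. */
typedef struct {
    const char *registered;   /* name recorded by the add callback */
    unsigned long threshold;  /* value recorded by the set_period callback */
} Source;

typedef int (*add_fn)(Source *s);
typedef int (*set_period_fn)(Source *s, unsigned long threshold);

static int add_memory(Source *s) { s->registered = "memory"; return 0; }
static int add_cpu(Source *s)    { s->registered = "cpu";    return 0; }
static int add_io(Source *s)     { s->registered = "io";     return 0; }

static int set_period_generic(Source *s, unsigned long threshold) {
    s->threshold = threshold;
    return 0;
}

/* One row per resource; designated initializers keep rows tied to enum values. */
static const struct {
    add_fn add;
    set_period_fn set_period;
} dispatch[_RES_MAX] = {
    [RES_MEMORY] = { add_memory, set_period_generic },
    [RES_CPU]    = { add_cpu,    set_period_generic },
    [RES_IO]     = { add_io,     set_period_generic },
};

/* Sets up the source for resource t; returns 0 on success, negative on error. */
static int setup_source(Source sources[_RES_MAX], Resource t, unsigned long threshold) {
    int r;

    if ((int) t < 0 || (int) t >= _RES_MAX)
        return -1;

    r = dispatch[t].add(&sources[t]);
    if (r < 0)
        return r;

    return dispatch[t].set_period(&sources[t], threshold);
}
```

A caller would loop over every resource, much as the `manager_new()` hunk does: `for (Resource t = 0; t < _RES_MAX; t++) setup_source(sources, t, 200000UL);`.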


@ -149,8 +149,7 @@ typedef struct UnitDefaults {
bool memory_zswap_writeback; bool memory_zswap_writeback;
CGroupPressureWatch memory_pressure_watch; CGroupPressure pressure[_PRESSURE_RESOURCE_MAX];
usec_t memory_pressure_threshold_usec;
char *smack_process_label; char *smack_process_label;
@ -481,7 +480,7 @@ typedef struct Manager {
/* Dump*() are slow, so always rate limit them to 10 per 10 minutes */ /* Dump*() are slow, so always rate limit them to 10 per 10 minutes */
RateLimit dump_ratelimit; RateLimit dump_ratelimit;
sd_event_source *memory_pressure_event_source; sd_event_source *pressure_event_source[_PRESSURE_RESOURCE_MAX];
/* For NFTSet= */ /* For NFTSet= */
sd_netlink *nfnl; sd_netlink *nfnl;
@ -562,7 +561,7 @@ void manager_unwatch_pidref(Manager *m, const PidRef *pid);
unsigned manager_dispatch_load_queue(Manager *m); unsigned manager_dispatch_load_queue(Manager *m);
int manager_setup_memory_pressure_event_source(Manager *m); int manager_setup_pressure_event_source(Manager *m, PressureResource t);
int manager_default_environment(Manager *m); int manager_default_environment(Manager *m);
int manager_transient_environment_add(Manager *m, char **plus); int manager_transient_environment_add(Manager *m, char **plus);


@ -78,6 +78,10 @@
#DefaultLimitRTTIME= #DefaultLimitRTTIME=
#DefaultMemoryPressureThresholdSec=200ms #DefaultMemoryPressureThresholdSec=200ms
#DefaultMemoryPressureWatch=auto #DefaultMemoryPressureWatch=auto
#DefaultCPUPressureThresholdSec=200ms
#DefaultCPUPressureWatch=auto
#DefaultIOPressureThresholdSec=200ms
#DefaultIOPressureWatch=auto
#DefaultOOMPolicy=stop #DefaultOOMPolicy=stop
#DefaultSmackProcessLabel= #DefaultSmackProcessLabel=
#DefaultRestrictSUIDSGID= #DefaultRestrictSUIDSGID=


@ -178,10 +178,9 @@ static void unit_init(Unit *u) {
if (u->type != UNIT_SLICE) if (u->type != UNIT_SLICE)
cc->tasks_max = u->manager->defaults.tasks_max; cc->tasks_max = u->manager->defaults.tasks_max;
cc->memory_pressure_watch = u->manager->defaults.memory_pressure_watch;
cc->memory_pressure_threshold_usec = u->manager->defaults.memory_pressure_threshold_usec;
cc->memory_zswap_writeback = u->manager->defaults.memory_zswap_writeback; cc->memory_zswap_writeback = u->manager->defaults.memory_zswap_writeback;
memcpy(cc->pressure, u->manager->defaults.pressure, sizeof(cc->pressure));
} }
ec = unit_get_exec_context(u); ec = unit_get_exec_context(u);
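
The hunk above swaps two per-field assignments for a single `memcpy()` over the per-resource settings array. The sketch below shows why that works when both sides hold an array of the same element type and length. The types and the 200000 µs value (matching the 200ms default shown in the config hunks) are illustrative assumptions, not the systemd definitions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for PressureResource and CGroupPressureWatch. */
typedef enum { P_MEMORY, P_CPU, P_IO, _P_MAX } PressureKind;
typedef enum { WATCH_OFF, WATCH_AUTO, WATCH_ON } Watch;

/* One watch mode and one threshold per resource, as in the diff's array. */
typedef struct {
    Watch watch;
    unsigned long long threshold_usec;
} PressureSetting;

typedef struct { PressureSetting pressure[_P_MAX]; } Defaults;
typedef struct { PressureSetting pressure[_P_MAX]; } UnitContext;

/* Fills in the same default for every resource. */
static void defaults_init(Defaults *d) {
    *d = (Defaults) {
        .pressure = {
            [P_MEMORY] = { .watch = WATCH_AUTO, .threshold_usec = 200000ULL },
            [P_CPU]    = { .watch = WATCH_AUTO, .threshold_usec = 200000ULL },
            [P_IO]     = { .watch = WATCH_AUTO, .threshold_usec = 200000ULL },
        },
    };
}

/* Copies all per-resource settings at once instead of field by field. */
static void unit_context_inherit(UnitContext *c, const Defaults *d) {
    /* Guard against the two arrays drifting apart in size. */
    _Static_assert(sizeof(c->pressure) == sizeof(d->pressure),
                   "pressure arrays must match");
    memcpy(c->pressure, d->pressure, sizeof(c->pressure));
}
```

Sizing the copy by the destination array means a new resource added to the enum is picked up without touching the copy site.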


@ -54,6 +54,10 @@
#DefaultLimitRTTIME= #DefaultLimitRTTIME=
#DefaultMemoryPressureThresholdSec=200ms #DefaultMemoryPressureThresholdSec=200ms
#DefaultMemoryPressureWatch=auto #DefaultMemoryPressureWatch=auto
#DefaultCPUPressureThresholdSec=200ms
#DefaultCPUPressureWatch=auto
#DefaultIOPressureThresholdSec=200ms
#DefaultIOPressureWatch=auto
#DefaultSmackProcessLabel= #DefaultSmackProcessLabel=
#DefaultRestrictSUIDSGID= #DefaultRestrictSUIDSGID=
#ReloadLimitIntervalSec= #ReloadLimitIntervalSec=


@ -323,8 +323,12 @@ int unit_cgroup_context_build_json(sd_json_variant **ret, const char *name, void
JSON_BUILD_PAIR_UNSIGNED_NON_ZERO("ManagedOOMMemoryPressureLimit", c->moom_mem_pressure_limit), JSON_BUILD_PAIR_UNSIGNED_NON_ZERO("ManagedOOMMemoryPressureLimit", c->moom_mem_pressure_limit),
JSON_BUILD_PAIR_FINITE_USEC("ManagedOOMMemoryPressureDurationUSec", c->moom_mem_pressure_duration_usec), JSON_BUILD_PAIR_FINITE_USEC("ManagedOOMMemoryPressureDurationUSec", c->moom_mem_pressure_duration_usec),
SD_JSON_BUILD_PAIR_STRING("ManagedOOMPreference", managed_oom_preference_to_string(c->moom_preference)), SD_JSON_BUILD_PAIR_STRING("ManagedOOMPreference", managed_oom_preference_to_string(c->moom_preference)),
SD_JSON_BUILD_PAIR_STRING("MemoryPressureWatch", cgroup_pressure_watch_to_string(c->memory_pressure_watch)), SD_JSON_BUILD_PAIR_STRING("MemoryPressureWatch", cgroup_pressure_watch_to_string(c->pressure[PRESSURE_MEMORY].watch)),
JSON_BUILD_PAIR_FINITE_USEC("MemoryPressureThresholdUSec", c->memory_pressure_threshold_usec), JSON_BUILD_PAIR_FINITE_USEC("MemoryPressureThresholdUSec", c->pressure[PRESSURE_MEMORY].threshold_usec),
SD_JSON_BUILD_PAIR_STRING("CPUPressureWatch", cgroup_pressure_watch_to_string(c->pressure[PRESSURE_CPU].watch)),
JSON_BUILD_PAIR_FINITE_USEC("CPUPressureThresholdUSec", c->pressure[PRESSURE_CPU].threshold_usec),
SD_JSON_BUILD_PAIR_STRING("IOPressureWatch", cgroup_pressure_watch_to_string(c->pressure[PRESSURE_IO].watch)),
JSON_BUILD_PAIR_FINITE_USEC("IOPressureThresholdUSec", c->pressure[PRESSURE_IO].threshold_usec),
/* Others */ /* Others */
SD_JSON_BUILD_PAIR_BOOLEAN("CoredumpReceive", c->coredump_receive)); SD_JSON_BUILD_PAIR_BOOLEAN("CoredumpReceive", c->coredump_receive));


@ -106,8 +106,12 @@ static int manager_context_build_json(sd_json_variant **ret, const char *name, v
SD_JSON_BUILD_PAIR_BOOLEAN("DefaultTasksAccounting", m->defaults.tasks_accounting), SD_JSON_BUILD_PAIR_BOOLEAN("DefaultTasksAccounting", m->defaults.tasks_accounting),
SD_JSON_BUILD_PAIR_CALLBACK("DefaultLimits", rlimit_table_build_json, m->defaults.rlimit), SD_JSON_BUILD_PAIR_CALLBACK("DefaultLimits", rlimit_table_build_json, m->defaults.rlimit),
SD_JSON_BUILD_PAIR_UNSIGNED("DefaultTasksMax", cgroup_tasks_max_resolve(&m->defaults.tasks_max)), SD_JSON_BUILD_PAIR_UNSIGNED("DefaultTasksMax", cgroup_tasks_max_resolve(&m->defaults.tasks_max)),
JSON_BUILD_PAIR_FINITE_USEC("DefaultMemoryPressureThresholdUSec", m->defaults.memory_pressure_threshold_usec), JSON_BUILD_PAIR_FINITE_USEC("DefaultMemoryPressureThresholdUSec", m->defaults.pressure[PRESSURE_MEMORY].threshold_usec),
SD_JSON_BUILD_PAIR_STRING("DefaultMemoryPressureWatch", cgroup_pressure_watch_to_string(m->defaults.memory_pressure_watch)), SD_JSON_BUILD_PAIR_STRING("DefaultMemoryPressureWatch", cgroup_pressure_watch_to_string(m->defaults.pressure[PRESSURE_MEMORY].watch)),
JSON_BUILD_PAIR_FINITE_USEC("DefaultCPUPressureThresholdUSec", m->defaults.pressure[PRESSURE_CPU].threshold_usec),
SD_JSON_BUILD_PAIR_STRING("DefaultCPUPressureWatch", cgroup_pressure_watch_to_string(m->defaults.pressure[PRESSURE_CPU].watch)),
JSON_BUILD_PAIR_FINITE_USEC("DefaultIOPressureThresholdUSec", m->defaults.pressure[PRESSURE_IO].threshold_usec),
SD_JSON_BUILD_PAIR_STRING("DefaultIOPressureWatch", cgroup_pressure_watch_to_string(m->defaults.pressure[PRESSURE_IO].watch)),
JSON_BUILD_PAIR_FINITE_USEC("RuntimeWatchdogUSec", manager_get_watchdog(m, WATCHDOG_RUNTIME)), JSON_BUILD_PAIR_FINITE_USEC("RuntimeWatchdogUSec", manager_get_watchdog(m, WATCHDOG_RUNTIME)),
JSON_BUILD_PAIR_FINITE_USEC("RebootWatchdogUSec", manager_get_watchdog(m, WATCHDOG_REBOOT)), JSON_BUILD_PAIR_FINITE_USEC("RebootWatchdogUSec", manager_get_watchdog(m, WATCHDOG_REBOOT)),
JSON_BUILD_PAIR_FINITE_USEC("KExecWatchdogUSec", manager_get_watchdog(m, WATCHDOG_KEXEC)), JSON_BUILD_PAIR_FINITE_USEC("KExecWatchdogUSec", manager_get_watchdog(m, WATCHDOG_KEXEC)),


@ -373,7 +373,7 @@ static int save_external_coredump(
if (fd_compressed < 0) if (fd_compressed < 0)
return log_error_errno(fd_compressed, "Failed to create temporary file for coredump %s: %m", fn_compressed); return log_error_errno(fd_compressed, "Failed to create temporary file for coredump %s: %m", fn_compressed);
r = compress_stream(fd, fd_compressed, max_size, &uncompressed_size); r = compress_stream(DEFAULT_COMPRESSION, fd, fd_compressed, max_size, &uncompressed_size);
if (r < 0) if (r < 0)
return log_error_errno(r, "Failed to compress %s: %m", coredump_tmpfile_name(tmp_compressed)); return log_error_errno(r, "Failed to compress %s: %m", coredump_tmpfile_name(tmp_compressed));
@ -386,7 +386,7 @@ static int save_external_coredump(
tmp = unlink_and_free(tmp); tmp = unlink_and_free(tmp);
fd = safe_close(fd); fd = safe_close(fd);
r = compress_stream(context->input_fd, fd_compressed, max_size, &partial_uncompressed_size); r = compress_stream(DEFAULT_COMPRESSION, context->input_fd, fd_compressed, max_size, &partial_uncompressed_size);
if (r < 0) if (r < 0)
return log_error_errno(r, "Failed to compress %s: %m", coredump_tmpfile_name(tmp_compressed)); return log_error_errno(r, "Failed to compress %s: %m", coredump_tmpfile_name(tmp_compressed));
uncompressed_size += partial_uncompressed_size; uncompressed_size += partial_uncompressed_size;


@ -1103,7 +1103,7 @@ static int save_core(sd_journal *j, FILE *file, char **path, bool *unlink_temp)
goto error; goto error;
} }
r = decompress_stream(filename, fdf, fd, -1); r = decompress_stream_by_filename(filename, fdf, fd, -1);
if (r < 0) { if (r < 0) {
log_error_errno(r, "Failed to decompress %s: %m", filename); log_error_errno(r, "Failed to decompress %s: %m", filename);
goto error; goto error;


@ -227,6 +227,12 @@ assert_cc(sizeof(long long) == sizeof(intmax_t));
MAX(_d, a); \ MAX(_d, a); \
}) })
#define MAX5(x, y, z, a, b) \
({ \
const typeof(x) _e = MAX4(x, y, z, a); \
MAX(_e, b); \
})
#undef MIN #undef MIN
#define MIN(a, b) __MIN(UNIQ, (a), UNIQ, (b)) #define MIN(a, b) __MIN(UNIQ, (a), UNIQ, (b))
#define __MIN(aq, a, bq, b) \ #define __MIN(aq, a, bq, b) \
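
The new `MAX5()` above follows the same chaining as `MAX4()`: reduce the first four arguments, then compare the result with the fifth. The sketch below reproduces that chaining with simplified macros. The `SKETCH_` names are hypothetical, and `SKETCH_MAX` is not systemd's `MAX()`, which uses `UNIQ`-suffixed temporaries to avoid name clashes. It assumes GNU C (statement expressions and `__typeof__`), as supported by GCC and Clang.

```c
#include <assert.h>

/* Simplified two-argument maximum; each argument is evaluated exactly once.
 * Unlike the real macro, this breaks if a caller passes a variable that is
 * itself named _a or _b. */
#define SKETCH_MAX(a, b)                          \
    ({                                            \
        const __typeof__(a) _a = (a);             \
        const __typeof__(b) _b = (b);             \
        _a > _b ? _a : _b;                        \
    })

#define SKETCH_MAX3(x, y, z)                      \
    ({                                            \
        const __typeof__(x) _c = SKETCH_MAX(x, y); \
        SKETCH_MAX(_c, z);                        \
    })

#define SKETCH_MAX4(x, y, z, a)                   \
    ({                                            \
        const __typeof__(x) _d = SKETCH_MAX3(x, y, z); \
        SKETCH_MAX(_d, a);                        \
    })

/* Same shape as the MAX5() added in the diff: reduce four, then compare. */
#define SKETCH_MAX5(x, y, z, a, b)                \
    ({                                            \
        const __typeof__(x) _e = SKETCH_MAX4(x, y, z, a); \
        SKETCH_MAX(_e, b);                        \
    })

/* Small wrapper so the macro can be exercised like a function. */
static int max5_int(int x, int y, int z, int a, int b) {
    return SKETCH_MAX5(x, y, z, a, b);
}
```

Each level stores the intermediate result in a temporary typed after the first argument, so mixing argument types can truncate; callers of the sketch should pass values of one type.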


@ -11,7 +11,6 @@
#include "fd-util.h" #include "fd-util.h"
#include "format-util.h" #include "format-util.h"
#include "fs-util.h" #include "fs-util.h"
#include "import-common.h"
#include "log.h" #include "log.h"
#include "pretty-print.h" #include "pretty-print.h"
#include "ratelimit.h" #include "ratelimit.h"
@ -32,7 +31,7 @@ typedef struct RawExport {
int input_fd; int input_fd;
int output_fd; int output_fd;
ImportCompress compress; Compressor *compress;
sd_event_source *output_event_source; sd_event_source *output_event_source;
@ -59,7 +58,7 @@ RawExport *raw_export_unref(RawExport *e) {
sd_event_source_unref(e->output_event_source); sd_event_source_unref(e->output_event_source);
import_compress_free(&e->compress); e->compress = compressor_free(e->compress);
sd_event_unref(e->event); sd_event_unref(e->event);
@ -143,7 +142,7 @@ static int raw_export_process(RawExport *e) {
assert(e); assert(e);
if (!e->tried_reflink && e->compress.type == IMPORT_COMPRESS_UNCOMPRESSED) { if (!e->tried_reflink && compressor_type(e->compress) == COMPRESSION_NONE) {
/* If we shall take an uncompressed snapshot we can /* If we shall take an uncompressed snapshot we can
* reflink source to destination directly. Let's see * reflink source to destination directly. Let's see
@ -158,9 +157,9 @@ static int raw_export_process(RawExport *e) {
e->tried_reflink = true; e->tried_reflink = true;
} }
if (!e->tried_sendfile && e->compress.type == IMPORT_COMPRESS_UNCOMPRESSED) { if (!e->tried_sendfile && compressor_type(e->compress) == COMPRESSION_NONE) {
l = sendfile(e->output_fd, e->input_fd, NULL, IMPORT_BUFFER_SIZE); l = sendfile(e->output_fd, e->input_fd, NULL, COMPRESS_PIPE_BUFFER_SIZE);
if (l < 0) { if (l < 0) {
if (errno == EAGAIN) if (errno == EAGAIN)
return 0; return 0;
@ -180,7 +179,7 @@ static int raw_export_process(RawExport *e) {
} }
while (e->buffer_size <= 0) { while (e->buffer_size <= 0) {
uint8_t input[IMPORT_BUFFER_SIZE]; uint8_t input[COMPRESS_PIPE_BUFFER_SIZE];
if (e->eof) { if (e->eof) {
r = 0; r = 0;
@ -195,10 +194,10 @@ static int raw_export_process(RawExport *e) {
if (l == 0) { if (l == 0) {
e->eof = true; e->eof = true;
r = import_compress_finish(&e->compress, &e->buffer, &e->buffer_size, &e->buffer_allocated); r = compressor_finish(e->compress, &e->buffer, &e->buffer_size, &e->buffer_allocated);
} else { } else {
e->written_uncompressed += l; e->written_uncompressed += l;
r = import_compress(&e->compress, input, l, &e->buffer, &e->buffer_size, &e->buffer_allocated); r = compressor_start(e->compress, input, l, &e->buffer, &e->buffer_size, &e->buffer_allocated);
} }
if (r < 0) { if (r < 0) {
r = log_error_errno(r, "Failed to encode: %m"); r = log_error_errno(r, "Failed to encode: %m");
@ -280,15 +279,15 @@ static int reflink_snapshot(int fd, const char *path) {
return new_fd; return new_fd;
} }
int raw_export_start(RawExport *e, const char *path, int fd, ImportCompressType compress) { int raw_export_start(RawExport *e, const char *path, int fd, Compression compress) {
_cleanup_close_ int sfd = -EBADF, tfd = -EBADF; _cleanup_close_ int sfd = -EBADF, tfd = -EBADF;
int r; int r;
assert(e); assert(e);
assert(path); assert(path);
assert(fd >= 0); assert(fd >= 0);
assert(compress < _IMPORT_COMPRESS_TYPE_MAX); assert(compress >= 0);
assert(compress != IMPORT_COMPRESS_UNKNOWN); assert(compress < _COMPRESSION_MAX);
if (e->output_fd >= 0) if (e->output_fd >= 0)
return -EBUSY; return -EBUSY;
@ -318,7 +317,7 @@ int raw_export_start(RawExport *e, const char *path, int fd, ImportCompressType
else else
e->input_fd = TAKE_FD(sfd); e->input_fd = TAKE_FD(sfd);
r = import_compress_init(&e->compress, compress); r = compressor_new(&e->compress, compress);
if (r < 0) if (r < 0)
return r; return r;
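
Note on the ownership change above: `e->compress = compressor_free(e->compress);` suggests that the new free function returns NULL, so releasing the object and resetting the pointer happen in one statement. The following is a minimal sketch of that idiom only. `SketchCodec`, `sketch_codec_new()` and `sketch_codec_free()` are hypothetical names invented for illustration and are not part of systemd's compress.h.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical heap-allocated codec object, standing in for an opaque handle. */
typedef struct SketchCodec {
        int type;      /* which algorithm this object was created for */
        void *state;   /* algorithm-specific state, may be NULL */
} SketchCodec;

/* Allocate a zero-initialized codec; returns NULL on allocation failure. */
SketchCodec* sketch_codec_new(int type) {
        SketchCodec *c = calloc(1, sizeof(SketchCodec));
        if (!c)
                return NULL;

        c->type = type;
        return c;
}

/* Release the codec and always return NULL, so that callers can write
 * "p = sketch_codec_free(p);" and never keep a dangling pointer.
 * Passing NULL is allowed and does nothing. */
SketchCodec* sketch_codec_free(SketchCodec *c) {
        if (!c)
                return NULL;

        free(c->state);
        free(c);
        return NULL;
}
```

With a pointer member, an unset handle can be tested with a plain `if (!ptr)`, which matches how the import paths in this diff check `!i->compress` instead of comparing against an "unknown" enum value.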

@ -1,8 +1,8 @@
/* SPDX-License-Identifier: LGPL-2.1-or-later */ /* SPDX-License-Identifier: LGPL-2.1-or-later */
#pragma once #pragma once
#include "compress.h"
#include "shared-forward.h" #include "shared-forward.h"
#include "import-compress.h"
typedef struct RawExport RawExport; typedef struct RawExport RawExport;
@ -13,4 +13,4 @@ RawExport* raw_export_unref(RawExport *e);
DEFINE_TRIVIAL_CLEANUP_FUNC(RawExport*, raw_export_unref); DEFINE_TRIVIAL_CLEANUP_FUNC(RawExport*, raw_export_unref);
int raw_export_start(RawExport *e, const char *path, int fd, ImportCompressType compress); int raw_export_start(RawExport *e, const char *path, int fd, Compression compress);

@ -1,6 +1,7 @@
/* SPDX-License-Identifier: LGPL-2.1-or-later */ /* SPDX-License-Identifier: LGPL-2.1-or-later */
#include <sys/stat.h> #include <sys/stat.h>
#include <unistd.h>
#include "sd-daemon.h" #include "sd-daemon.h"
#include "sd-event.h" #include "sd-event.h"
@ -38,7 +39,7 @@ typedef struct TarExport {
int tree_fd; /* directory fd of the tree to set up */ int tree_fd; /* directory fd of the tree to set up */
int userns_fd; int userns_fd;
ImportCompress compress; Compressor *compress;
sd_event_source *output_event_source; sd_event_source *output_event_source;
@ -74,7 +75,7 @@ TarExport *tar_export_unref(TarExport *e) {
free(e->temp_path); free(e->temp_path);
} }
import_compress_free(&e->compress); e->compress = compressor_free(e->compress);
sd_event_unref(e->event); sd_event_unref(e->event);
@ -188,9 +189,9 @@ static int tar_export_process(TarExport *e) {
assert(e); assert(e);
if (!e->tried_splice && e->compress.type == IMPORT_COMPRESS_UNCOMPRESSED) { if (!e->tried_splice && compressor_type(e->compress) == COMPRESSION_NONE) {
l = splice(e->tar_fd, NULL, e->output_fd, NULL, IMPORT_BUFFER_SIZE, 0); l = splice(e->tar_fd, NULL, e->output_fd, NULL, COMPRESS_PIPE_BUFFER_SIZE, 0);
if (l < 0) { if (l < 0) {
if (errno == EAGAIN) if (errno == EAGAIN)
return 0; return 0;
@ -210,7 +211,7 @@ static int tar_export_process(TarExport *e) {
} }
while (e->buffer_size <= 0) { while (e->buffer_size <= 0) {
uint8_t input[IMPORT_BUFFER_SIZE]; uint8_t input[COMPRESS_PIPE_BUFFER_SIZE];
if (e->eof) { if (e->eof) {
r = tar_export_finish(e); r = tar_export_finish(e);
@ -225,10 +226,10 @@ static int tar_export_process(TarExport *e) {
if (l == 0) { if (l == 0) {
e->eof = true; e->eof = true;
r = import_compress_finish(&e->compress, &e->buffer, &e->buffer_size, &e->buffer_allocated); r = compressor_finish(e->compress, &e->buffer, &e->buffer_size, &e->buffer_allocated);
} else { } else {
e->written_uncompressed += l; e->written_uncompressed += l;
r = import_compress(&e->compress, input, l, &e->buffer, &e->buffer_size, &e->buffer_allocated); r = compressor_start(e->compress, input, l, &e->buffer, &e->buffer_size, &e->buffer_allocated);
} }
if (r < 0) { if (r < 0) {
r = log_error_errno(r, "Failed to encode: %m"); r = log_error_errno(r, "Failed to encode: %m");
@ -282,7 +283,7 @@ int tar_export_start(
TarExport *e, TarExport *e,
const char *path, const char *path,
int fd, int fd,
ImportCompressType compress, Compression compress,
ImportFlags flags) { ImportFlags flags) {
_cleanup_close_ int sfd = -EBADF; _cleanup_close_ int sfd = -EBADF;
@ -291,8 +292,8 @@ int tar_export_start(
assert(e); assert(e);
assert(path); assert(path);
assert(fd >= 0); assert(fd >= 0);
assert(compress < _IMPORT_COMPRESS_TYPE_MAX); assert(compress >= 0);
assert(compress != IMPORT_COMPRESS_UNKNOWN); assert(compress < _COMPRESSION_MAX);
if (e->output_fd >= 0) if (e->output_fd >= 0)
return -EBUSY; return -EBUSY;
@ -336,7 +337,7 @@ int tar_export_start(
} }
} }
r = import_compress_init(&e->compress, compress); r = compressor_new(&e->compress, compress);
if (r < 0) if (r < 0)
return r; return r;

@ -1,8 +1,8 @@
/* SPDX-License-Identifier: LGPL-2.1-or-later */ /* SPDX-License-Identifier: LGPL-2.1-or-later */
#pragma once #pragma once
#include "compress.h"
#include "import-common.h" #include "import-common.h"
#include "import-compress.h"
#include "shared-forward.h" #include "shared-forward.h"
typedef struct TarExport TarExport; typedef struct TarExport TarExport;
@ -14,4 +14,4 @@ TarExport* tar_export_unref(TarExport *e);
DEFINE_TRIVIAL_CLEANUP_FUNC(TarExport*, tar_export_unref); DEFINE_TRIVIAL_CLEANUP_FUNC(TarExport*, tar_export_unref);
int tar_export_start(TarExport *e, const char *path, int fd, ImportCompressType compress, ImportFlags flags); int tar_export_start(TarExport *e, const char *path, int fd, Compression compress, ImportFlags flags);

@ -2,6 +2,7 @@
#include <getopt.h> #include <getopt.h>
#include <locale.h> #include <locale.h>
#include <unistd.h>
#include "sd-event.h" #include "sd-event.h"
@ -22,30 +23,15 @@
#include "verbs.h" #include "verbs.h"
static ImportFlags arg_import_flags = 0; static ImportFlags arg_import_flags = 0;
static ImportCompressType arg_compress = IMPORT_COMPRESS_UNKNOWN; static Compression arg_compress = _COMPRESSION_INVALID;
static ImageClass arg_class = IMAGE_MACHINE; static ImageClass arg_class = IMAGE_MACHINE;
static RuntimeScope arg_runtime_scope = _RUNTIME_SCOPE_INVALID; static RuntimeScope arg_runtime_scope = _RUNTIME_SCOPE_INVALID;
static void determine_compression_from_filename(const char *p) { static void determine_compression_from_filename(const char *p) {
if (arg_compress >= 0)
if (arg_compress != IMPORT_COMPRESS_UNKNOWN)
return; return;
if (!p) { arg_compress = p ? compression_from_filename(p) : COMPRESSION_NONE;
arg_compress = IMPORT_COMPRESS_UNCOMPRESSED;
return;
}
if (endswith(p, ".xz"))
arg_compress = IMPORT_COMPRESS_XZ;
else if (endswith(p, ".gz"))
arg_compress = IMPORT_COMPRESS_GZIP;
else if (endswith(p, ".bz2"))
arg_compress = IMPORT_COMPRESS_BZIP2;
else if (endswith(p, ".zst"))
arg_compress = IMPORT_COMPRESS_ZSTD;
else
arg_compress = IMPORT_COMPRESS_UNCOMPRESSED;
} }
static void on_tar_finished(TarExport *export, int error, void *userdata) { static void on_tar_finished(TarExport *export, int error, void *userdata) {
@ -91,7 +77,7 @@ static int verb_export_tar(int argc, char *argv[], uintptr_t _data, void *userda
fd = open_fd; fd = open_fd;
log_info("Exporting '%s', saving to '%s' with compression '%s'.", local, path, import_compress_type_to_string(arg_compress)); log_info("Exporting '%s', saving to '%s' with compression '%s'.", local, path, compression_to_string(arg_compress));
} else { } else {
_cleanup_free_ char *pretty = NULL; _cleanup_free_ char *pretty = NULL;
@ -101,7 +87,7 @@ static int verb_export_tar(int argc, char *argv[], uintptr_t _data, void *userda
fd = STDOUT_FILENO; fd = STDOUT_FILENO;
(void) fd_get_path(fd, &pretty); (void) fd_get_path(fd, &pretty);
log_info("Exporting '%s', saving to '%s' with compression '%s'.", local, strna(pretty), import_compress_type_to_string(arg_compress)); log_info("Exporting '%s', saving to '%s' with compression '%s'.", local, strna(pretty), compression_to_string(arg_compress));
} }
r = import_allocate_event_with_signals(&event); r = import_allocate_event_with_signals(&event);
@ -172,14 +158,14 @@ static int verb_export_raw(int argc, char *argv[], uintptr_t _data, void *userda
fd = open_fd; fd = open_fd;
log_info("Exporting '%s', saving to '%s' with compression '%s'.", local, path, import_compress_type_to_string(arg_compress)); log_info("Exporting '%s', saving to '%s' with compression '%s'.", local, path, compression_to_string(arg_compress));
} else { } else {
_cleanup_free_ char *pretty = NULL; _cleanup_free_ char *pretty = NULL;
fd = STDOUT_FILENO; fd = STDOUT_FILENO;
(void) fd_get_path(fd, &pretty); (void) fd_get_path(fd, &pretty);
log_info("Exporting '%s', saving to '%s' with compression '%s'.", local, strna(pretty), import_compress_type_to_string(arg_compress)); log_info("Exporting '%s', saving to '%s' with compression '%s'.", local, strna(pretty), compression_to_string(arg_compress));
} }
r = import_allocate_event_with_signals(&event); r = import_allocate_event_with_signals(&event);
@ -265,8 +251,8 @@ static int parse_argv(int argc, char *argv[]) {
return version(); return version();
case ARG_FORMAT: case ARG_FORMAT:
arg_compress = import_compress_type_from_string(optarg); arg_compress = compression_from_string_harder(optarg);
if (arg_compress < 0 || arg_compress == IMPORT_COMPRESS_UNKNOWN) if (arg_compress < 0)
return log_error_errno(SYNTHETIC_ERRNO(EINVAL), return log_error_errno(SYNTHETIC_ERRNO(EINVAL),
"Unknown format: %s", optarg); "Unknown format: %s", optarg);
break; break;
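
Note on the removed suffix chain above: the old `determine_compression_from_filename()` mapped `.xz`, `.gz`, `.bz2` and `.zst` to a compression type and fell back to uncompressed. The sketch below shows the same mapping as a lookup table. It is an illustration under assumptions, not the implementation of `compression_from_filename()`, whose body is not part of this diff. All `sketch_*` and `SKETCH_*` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical enum for this sketch; not systemd's Compression type. */
typedef enum SketchCompression {
        SKETCH_NONE,
        SKETCH_XZ,
        SKETCH_GZIP,
        SKETCH_BZIP2,
        SKETCH_ZSTD,
} SketchCompression;

/* Returns nonzero if s ends with suffix. */
static int sketch_endswith(const char *s, const char *suffix) {
        size_t sl = strlen(s), xl = strlen(suffix);

        return sl >= xl && memcmp(s + sl - xl, suffix, xl) == 0;
}

/* Map a file name to a compression type by its suffix. A NULL name
 * (for example when writing to stdout) or an unknown suffix yields
 * SKETCH_NONE, matching the fallback of the removed code. */
SketchCompression sketch_compression_from_filename(const char *p) {
        static const struct {
                const char *suffix;
                SketchCompression compression;
        } table[] = {
                { ".xz",  SKETCH_XZ    },
                { ".gz",  SKETCH_GZIP  },
                { ".bz2", SKETCH_BZIP2 },
                { ".zst", SKETCH_ZSTD  },
        };

        if (!p)
                return SKETCH_NONE;

        for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++)
                if (sketch_endswith(p, table[i].suffix))
                        return table[i].compression;

        return SKETCH_NONE;
}
```

A table keeps the suffix list in one place, so adding a format is a one-line change rather than another `else if` branch.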

@ -7,6 +7,7 @@
#include "sd-event.h" #include "sd-event.h"
#include "capability-util.h" #include "capability-util.h"
#include "compress.h"
#include "dirent-util.h" #include "dirent-util.h"
#include "dissect-image.h" #include "dissect-image.h"
#include "fd-util.h" #include "fd-util.h"
@ -43,7 +44,7 @@ int import_fork_tar_x(int tree_fd, int userns_fd, PidRef *ret_pid) {
if (pipe2(pipefd, O_CLOEXEC) < 0) if (pipe2(pipefd, O_CLOEXEC) < 0)
return log_error_errno(errno, "Failed to create pipe for tar: %m"); return log_error_errno(errno, "Failed to create pipe for tar: %m");
(void) fcntl(pipefd[0], F_SETPIPE_SZ, IMPORT_BUFFER_SIZE); (void) fcntl(pipefd[0], F_SETPIPE_SZ, COMPRESS_PIPE_BUFFER_SIZE);
r = pidref_safe_fork_full( r = pidref_safe_fork_full(
"tar-x", "tar-x",
@ -110,7 +111,7 @@ int import_fork_tar_c(int tree_fd, int userns_fd, PidRef *ret_pid) {
if (pipe2(pipefd, O_CLOEXEC) < 0) if (pipe2(pipefd, O_CLOEXEC) < 0)
return log_error_errno(errno, "Failed to create pipe for tar: %m"); return log_error_errno(errno, "Failed to create pipe for tar: %m");
(void) fcntl(pipefd[0], F_SETPIPE_SZ, IMPORT_BUFFER_SIZE); (void) fcntl(pipefd[0], F_SETPIPE_SZ, COMPRESS_PIPE_BUFFER_SIZE);
r = pidref_safe_fork_full( r = pidref_safe_fork_full(
"tar-c", "tar-c",

@ -49,5 +49,3 @@ int import_allocate_event_with_signals(sd_event **ret);
int import_make_foreign_userns(int *userns_fd); int import_make_foreign_userns(int *userns_fd);
int import_remove_tree(const char *path, int *userns_fd, ImportFlags flags); int import_remove_tree(const char *path, int *userns_fd, ImportFlags flags);
#define IMPORT_BUFFER_SIZE (128U*1024U)

@ -1,611 +0,0 @@
/* SPDX-License-Identifier: LGPL-2.1-or-later */
#include <stdlib.h>
#include <string.h>
#include "import-common.h"
#include "import-compress.h"
#include "log.h"
#include "string-table.h"
void import_compress_free(ImportCompress *c) {
assert(c);
if (c->type == IMPORT_COMPRESS_XZ)
lzma_end(&c->xz);
else if (c->type == IMPORT_COMPRESS_GZIP) {
if (c->encoding)
deflateEnd(&c->gzip);
else
inflateEnd(&c->gzip);
#if HAVE_BZIP2
} else if (c->type == IMPORT_COMPRESS_BZIP2) {
if (c->encoding)
BZ2_bzCompressEnd(&c->bzip2);
else
BZ2_bzDecompressEnd(&c->bzip2);
#endif
#if HAVE_ZSTD
} else if (c->type == IMPORT_COMPRESS_ZSTD) {
if (c->encoding) {
ZSTD_freeCCtx(c->c_zstd);
c->c_zstd = NULL;
} else {
ZSTD_freeDCtx(c->d_zstd);
c->d_zstd = NULL;
}
#endif
}
c->type = IMPORT_COMPRESS_UNKNOWN;
}
int import_uncompress_detect(ImportCompress *c, const void *data, size_t size) {
static const uint8_t xz_signature[] = {
0xfd, '7', 'z', 'X', 'Z', 0x00
};
static const uint8_t gzip_signature[] = {
0x1f, 0x8b
};
static const uint8_t bzip2_signature[] = {
'B', 'Z', 'h'
};
static const uint8_t zstd_signature[] = {
0x28, 0xb5, 0x2f, 0xfd
};
int r;
assert(c);
if (c->type != IMPORT_COMPRESS_UNKNOWN)
return 1;
if (size < MAX4(sizeof(xz_signature),
sizeof(gzip_signature),
sizeof(zstd_signature),
sizeof(bzip2_signature)))
return 0;
assert(data);
if (memcmp(data, xz_signature, sizeof(xz_signature)) == 0) {
lzma_ret xzr;
xzr = lzma_stream_decoder(&c->xz, UINT64_MAX, LZMA_TELL_UNSUPPORTED_CHECK | LZMA_CONCATENATED);
if (xzr != LZMA_OK)
return -EIO;
c->type = IMPORT_COMPRESS_XZ;
} else if (memcmp(data, gzip_signature, sizeof(gzip_signature)) == 0) {
r = inflateInit2(&c->gzip, 15+16);
if (r != Z_OK)
return -EIO;
c->type = IMPORT_COMPRESS_GZIP;
#if HAVE_BZIP2
} else if (memcmp(data, bzip2_signature, sizeof(bzip2_signature)) == 0) {
r = BZ2_bzDecompressInit(&c->bzip2, 0, 0);
if (r != BZ_OK)
return -EIO;
c->type = IMPORT_COMPRESS_BZIP2;
#endif
#if HAVE_ZSTD
} else if (memcmp(data, zstd_signature, sizeof(zstd_signature)) == 0) {
c->d_zstd = ZSTD_createDCtx();
if (!c->d_zstd)
return -ENOMEM;
c->type = IMPORT_COMPRESS_ZSTD;
#endif
} else
c->type = IMPORT_COMPRESS_UNCOMPRESSED;
c->encoding = false;
log_debug("Detected compression type: %s", import_compress_type_to_string(c->type));
return 1;
}
void import_uncompress_force_off(ImportCompress *c) {
assert(c);
c->type = IMPORT_COMPRESS_UNCOMPRESSED;
c->encoding = false;
}
int import_uncompress(ImportCompress *c, const void *data, size_t size, ImportCompressCallback callback, void *userdata) {
int r;
assert(c);
assert(callback);
r = import_uncompress_detect(c, data, size);
if (r <= 0)
return r;
if (c->encoding)
return -EINVAL;
if (size <= 0)
return 1;
assert(data);
switch (c->type) {
case IMPORT_COMPRESS_UNCOMPRESSED:
r = callback(data, size, userdata);
if (r < 0)
return r;
break;
case IMPORT_COMPRESS_XZ:
c->xz.next_in = data;
c->xz.avail_in = size;
while (c->xz.avail_in > 0) {
uint8_t buffer[IMPORT_BUFFER_SIZE];
lzma_ret lzr;
c->xz.next_out = buffer;
c->xz.avail_out = sizeof(buffer);
lzr = lzma_code(&c->xz, LZMA_RUN);
if (!IN_SET(lzr, LZMA_OK, LZMA_STREAM_END))
return -EIO;
if (c->xz.avail_out < sizeof(buffer)) {
r = callback(buffer, sizeof(buffer) - c->xz.avail_out, userdata);
if (r < 0)
return r;
}
}
break;
case IMPORT_COMPRESS_GZIP:
c->gzip.next_in = (void*) data;
c->gzip.avail_in = size;
while (c->gzip.avail_in > 0) {
uint8_t buffer[IMPORT_BUFFER_SIZE];
c->gzip.next_out = buffer;
c->gzip.avail_out = sizeof(buffer);
r = inflate(&c->gzip, Z_NO_FLUSH);
if (!IN_SET(r, Z_OK, Z_STREAM_END))
return -EIO;
if (c->gzip.avail_out < sizeof(buffer)) {
r = callback(buffer, sizeof(buffer) - c->gzip.avail_out, userdata);
if (r < 0)
return r;
}
}
break;
#if HAVE_BZIP2
case IMPORT_COMPRESS_BZIP2:
c->bzip2.next_in = (void*) data;
c->bzip2.avail_in = size;
while (c->bzip2.avail_in > 0) {
uint8_t buffer[IMPORT_BUFFER_SIZE];
c->bzip2.next_out = (char*) buffer;
c->bzip2.avail_out = sizeof(buffer);
r = BZ2_bzDecompress(&c->bzip2);
if (!IN_SET(r, BZ_OK, BZ_STREAM_END))
return -EIO;
if (c->bzip2.avail_out < sizeof(buffer)) {
r = callback(buffer, sizeof(buffer) - c->bzip2.avail_out, userdata);
if (r < 0)
return r;
}
}
break;
#endif
#if HAVE_ZSTD
case IMPORT_COMPRESS_ZSTD: {
ZSTD_inBuffer input = {
.src = (void*) data,
.size = size,
};
while (input.pos < input.size) {
uint8_t buffer[IMPORT_BUFFER_SIZE];
ZSTD_outBuffer output = {
.dst = buffer,
.size = sizeof(buffer),
};
size_t res;
res = ZSTD_decompressStream(c->d_zstd, &output, &input);
if (ZSTD_isError(res))
return -EIO;
if (output.pos > 0) {
r = callback(output.dst, output.pos, userdata);
if (r < 0)
return r;
}
}
break;
}
#endif
default:
assert_not_reached();
}
return 1;
}
int import_compress_init(ImportCompress *c, ImportCompressType t) {
int r;
assert(c);
switch (t) {
case IMPORT_COMPRESS_XZ: {
lzma_ret xzr;
xzr = lzma_easy_encoder(&c->xz, LZMA_PRESET_DEFAULT, LZMA_CHECK_CRC64);
if (xzr != LZMA_OK)
return -EIO;
c->type = IMPORT_COMPRESS_XZ;
break;
}
case IMPORT_COMPRESS_GZIP:
r = deflateInit2(&c->gzip, Z_DEFAULT_COMPRESSION, Z_DEFLATED, 15 + 16, 8, Z_DEFAULT_STRATEGY);
if (r != Z_OK)
return -EIO;
c->type = IMPORT_COMPRESS_GZIP;
break;
#if HAVE_BZIP2
case IMPORT_COMPRESS_BZIP2:
r = BZ2_bzCompressInit(&c->bzip2, 9, 0, 0);
if (r != BZ_OK)
return -EIO;
c->type = IMPORT_COMPRESS_BZIP2;
break;
#endif
#if HAVE_ZSTD
case IMPORT_COMPRESS_ZSTD:
c->c_zstd = ZSTD_createCCtx();
if (!c->c_zstd)
return -ENOMEM;
r = ZSTD_CCtx_setParameter(c->c_zstd, ZSTD_c_compressionLevel, ZSTD_CLEVEL_DEFAULT);
if (ZSTD_isError(r))
return -EIO;
c->type = IMPORT_COMPRESS_ZSTD;
break;
#endif
case IMPORT_COMPRESS_UNCOMPRESSED:
c->type = IMPORT_COMPRESS_UNCOMPRESSED;
break;
default:
return -EOPNOTSUPP;
}
c->encoding = true;
return 0;
}
static int enlarge_buffer(void **buffer, size_t *buffer_size, size_t *buffer_allocated) {
size_t l;
void *p;
assert(buffer);
assert(buffer_size);
assert(buffer_allocated);
if (*buffer_allocated > *buffer_size)
return 0;
l = MAX(IMPORT_BUFFER_SIZE, (*buffer_size * 2));
p = realloc(*buffer, l);
if (!p)
return -ENOMEM;
*buffer = p;
*buffer_allocated = l;
return 1;
}
int import_compress(ImportCompress *c, const void *data, size_t size, void **buffer, size_t *buffer_size, size_t *buffer_allocated) {
int r;
assert(c);
assert(buffer);
assert(buffer_size);
assert(buffer_allocated);
if (!c->encoding)
return -EINVAL;
if (size <= 0)
return 0;
assert(data);
*buffer_size = 0;
switch (c->type) {
case IMPORT_COMPRESS_XZ:
c->xz.next_in = data;
c->xz.avail_in = size;
while (c->xz.avail_in > 0) {
lzma_ret lzr;
r = enlarge_buffer(buffer, buffer_size, buffer_allocated);
if (r < 0)
return r;
c->xz.next_out = (uint8_t*) *buffer + *buffer_size;
c->xz.avail_out = *buffer_allocated - *buffer_size;
lzr = lzma_code(&c->xz, LZMA_RUN);
if (lzr != LZMA_OK)
return -EIO;
*buffer_size += (*buffer_allocated - *buffer_size) - c->xz.avail_out;
}
break;
case IMPORT_COMPRESS_GZIP:
c->gzip.next_in = (void*) data;
c->gzip.avail_in = size;
while (c->gzip.avail_in > 0) {
r = enlarge_buffer(buffer, buffer_size, buffer_allocated);
if (r < 0)
return r;
c->gzip.next_out = (uint8_t*) *buffer + *buffer_size;
c->gzip.avail_out = *buffer_allocated - *buffer_size;
r = deflate(&c->gzip, Z_NO_FLUSH);
if (r != Z_OK)
return -EIO;
*buffer_size += (*buffer_allocated - *buffer_size) - c->gzip.avail_out;
}
break;
#if HAVE_BZIP2
case IMPORT_COMPRESS_BZIP2:
c->bzip2.next_in = (void*) data;
c->bzip2.avail_in = size;
while (c->bzip2.avail_in > 0) {
r = enlarge_buffer(buffer, buffer_size, buffer_allocated);
if (r < 0)
return r;
c->bzip2.next_out = (void*) ((uint8_t*) *buffer + *buffer_size);
c->bzip2.avail_out = *buffer_allocated - *buffer_size;
r = BZ2_bzCompress(&c->bzip2, BZ_RUN);
if (r != BZ_RUN_OK)
return -EIO;
*buffer_size += (*buffer_allocated - *buffer_size) - c->bzip2.avail_out;
}
break;
#endif
#if HAVE_ZSTD
case IMPORT_COMPRESS_ZSTD: {
ZSTD_inBuffer input = {
.src = data,
.size = size,
};
while (input.pos < input.size) {
r = enlarge_buffer(buffer, buffer_size, buffer_allocated);
if (r < 0)
return r;
ZSTD_outBuffer output = {
.dst = ((uint8_t *) *buffer + *buffer_size),
.size = *buffer_allocated - *buffer_size,
};
size_t res;
res = ZSTD_compressStream2(c->c_zstd, &output, &input, ZSTD_e_continue);
if (ZSTD_isError(res))
return -EIO;
*buffer_size += output.pos;
}
break;
}
#endif
case IMPORT_COMPRESS_UNCOMPRESSED:
if (*buffer_allocated < size) {
void *p;
p = realloc(*buffer, size);
if (!p)
return -ENOMEM;
*buffer = p;
*buffer_allocated = size;
}
memcpy(*buffer, data, size);
*buffer_size = size;
break;
default:
return -EOPNOTSUPP;
}
return 0;
}
int import_compress_finish(ImportCompress *c, void **buffer, size_t *buffer_size, size_t *buffer_allocated) {
int r;
assert(c);
assert(buffer);
assert(buffer_size);
assert(buffer_allocated);
if (!c->encoding)
return -EINVAL;
*buffer_size = 0;
switch (c->type) {
case IMPORT_COMPRESS_XZ: {
lzma_ret lzr;
c->xz.avail_in = 0;
do {
r = enlarge_buffer(buffer, buffer_size, buffer_allocated);
if (r < 0)
return r;
c->xz.next_out = (uint8_t*) *buffer + *buffer_size;
c->xz.avail_out = *buffer_allocated - *buffer_size;
lzr = lzma_code(&c->xz, LZMA_FINISH);
if (!IN_SET(lzr, LZMA_OK, LZMA_STREAM_END))
return -EIO;
*buffer_size += (*buffer_allocated - *buffer_size) - c->xz.avail_out;
} while (lzr != LZMA_STREAM_END);
break;
}
case IMPORT_COMPRESS_GZIP:
c->gzip.avail_in = 0;
do {
r = enlarge_buffer(buffer, buffer_size, buffer_allocated);
if (r < 0)
return r;
c->gzip.next_out = (uint8_t*) *buffer + *buffer_size;
c->gzip.avail_out = *buffer_allocated - *buffer_size;
r = deflate(&c->gzip, Z_FINISH);
if (!IN_SET(r, Z_OK, Z_STREAM_END))
return -EIO;
*buffer_size += (*buffer_allocated - *buffer_size) - c->gzip.avail_out;
} while (r != Z_STREAM_END);
break;
#if HAVE_BZIP2
case IMPORT_COMPRESS_BZIP2:
c->bzip2.avail_in = 0;
do {
r = enlarge_buffer(buffer, buffer_size, buffer_allocated);
if (r < 0)
return r;
c->bzip2.next_out = (void*) ((uint8_t*) *buffer + *buffer_size);
c->bzip2.avail_out = *buffer_allocated - *buffer_size;
r = BZ2_bzCompress(&c->bzip2, BZ_FINISH);
if (!IN_SET(r, BZ_FINISH_OK, BZ_STREAM_END))
return -EIO;
*buffer_size += (*buffer_allocated - *buffer_size) - c->bzip2.avail_out;
} while (r != BZ_STREAM_END);
break;
#endif
#if HAVE_ZSTD
case IMPORT_COMPRESS_ZSTD: {
ZSTD_inBuffer input = {};
size_t res;
do {
r = enlarge_buffer(buffer, buffer_size, buffer_allocated);
if (r < 0)
return r;
ZSTD_outBuffer output = {
.dst = ((uint8_t *) *buffer + *buffer_size),
.size = *buffer_allocated - *buffer_size,
};
res = ZSTD_compressStream2(c->c_zstd, &output, &input, ZSTD_e_end);
if (ZSTD_isError(res))
return -EIO;
*buffer_size += output.pos;
} while (res != 0);
break;
}
#endif
case IMPORT_COMPRESS_UNCOMPRESSED:
break;
default:
return -EOPNOTSUPP;
}
return 0;
}
static const char* const import_compress_type_table[_IMPORT_COMPRESS_TYPE_MAX] = {
[IMPORT_COMPRESS_UNKNOWN] = "unknown",
[IMPORT_COMPRESS_UNCOMPRESSED] = "uncompressed",
[IMPORT_COMPRESS_XZ] = "xz",
[IMPORT_COMPRESS_GZIP] = "gzip",
#if HAVE_BZIP2
[IMPORT_COMPRESS_BZIP2] = "bzip2",
#endif
#if HAVE_ZSTD
[IMPORT_COMPRESS_ZSTD] = "zstd",
#endif
};
DEFINE_STRING_TABLE_LOOKUP(import_compress_type, ImportCompressType);
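
Note on the removed `enlarge_buffer()` helper: every encoder loop in the deleted file calls it before handing the output area to the library, so the buffer grows only when it is completely full, and then to at least 128 KiB or twice the bytes already used. The sketch below restates that growth policy in isolation. `sketch_enlarge_buffer()` and `SKETCH_BUFFER_SIZE` are hypothetical names; the constant mirrors the removed `IMPORT_BUFFER_SIZE`.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define SKETCH_BUFFER_SIZE (128U*1024U)

/* Ensure there is free space behind the *used bytes already in *buffer.
 * Returns 0 if space was still available, 1 if the buffer was grown,
 * and -ENOMEM if realloc() failed (the old buffer stays valid then). */
int sketch_enlarge_buffer(void **buffer, const size_t *used, size_t *allocated) {
        size_t l;
        void *p;

        assert(buffer);
        assert(used);
        assert(allocated);

        if (*allocated > *used)
                return 0;

        /* Grow to twice the used size, but never below the minimum. */
        l = *used * 2 > SKETCH_BUFFER_SIZE ? *used * 2 : SKETCH_BUFFER_SIZE;

        p = realloc(*buffer, l);
        if (!p)
                return -ENOMEM;

        *buffer = p;
        *allocated = l;
        return 1;
}
```

Doubling keeps the number of reallocations logarithmic in the final output size, which matters in the finish loops that keep calling the encoder until it reports end of stream.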

@ -1,54 +0,0 @@
/* SPDX-License-Identifier: LGPL-2.1-or-later */
#pragma once
#if HAVE_BZIP2
#include <bzlib.h>
#endif
#include <lzma.h>
#include <zlib.h>
#if HAVE_ZSTD
#include <zstd.h>
#endif
#include "shared-forward.h"
typedef enum ImportCompressType {
IMPORT_COMPRESS_UNKNOWN,
IMPORT_COMPRESS_UNCOMPRESSED,
IMPORT_COMPRESS_XZ,
IMPORT_COMPRESS_GZIP,
IMPORT_COMPRESS_BZIP2,
IMPORT_COMPRESS_ZSTD,
_IMPORT_COMPRESS_TYPE_MAX,
_IMPORT_COMPRESS_TYPE_INVALID = -EINVAL,
} ImportCompressType;
typedef struct ImportCompress {
ImportCompressType type;
bool encoding;
union {
lzma_stream xz;
z_stream gzip;
#if HAVE_BZIP2
bz_stream bzip2;
#endif
#if HAVE_ZSTD
ZSTD_CCtx *c_zstd;
ZSTD_DCtx *d_zstd;
#endif
};
} ImportCompress;
typedef int (*ImportCompressCallback)(const void *data, size_t size, void *userdata);
void import_compress_free(ImportCompress *c);
int import_uncompress_detect(ImportCompress *c, const void *data, size_t size);
void import_uncompress_force_off(ImportCompress *c);
int import_uncompress(ImportCompress *c, const void *data, size_t size, ImportCompressCallback callback, void *userdata);
int import_compress_init(ImportCompress *c, ImportCompressType t);
int import_compress(ImportCompress *c, const void *data, size_t size, void **buffer, size_t *buffer_size, size_t *buffer_allocated);
int import_compress_finish(ImportCompress *c, void **buffer, size_t *buffer_size, size_t *buffer_allocated);
DECLARE_STRING_TABLE_LOOKUP(import_compress_type, ImportCompressType);
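
Note on `import_uncompress_detect()`, declared above and removed by this change: it compares the start of the stream against fixed magic bytes and returns 0 until enough input has arrived for the longest signature. The sketch below reproduces only that detection step, using the signature bytes from the removed import-compress.c. It does not set up any decoder, and `SketchMagic` and `sketch_detect_magic()` are hypothetical names, not the new `decompressor_detect()` API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
typedef enum SketchMagic {
        SKETCH_MAGIC_NONE,   /* no known signature: treat as uncompressed */
        SKETCH_MAGIC_XZ,
        SKETCH_MAGIC_GZIP,
        SKETCH_MAGIC_BZIP2,
        SKETCH_MAGIC_ZSTD,
} SketchMagic;

/* Returns 0 if more input is needed before a decision can be made,
 * and 1 once *ret has been set. */
int sketch_detect_magic(const uint8_t *data, size_t size, SketchMagic *ret) {
        static const uint8_t xz[]    = { 0xfd, '7', 'z', 'X', 'Z', 0x00 };
        static const uint8_t gzip[]  = { 0x1f, 0x8b };
        static const uint8_t bzip2[] = { 'B', 'Z', 'h' };
        static const uint8_t zstd[]  = { 0x28, 0xb5, 0x2f, 0xfd };

        assert(ret);

        /* The xz signature is the longest one, so wait for that many bytes. */
        if (size < sizeof(xz))
                return 0;

        assert(data);

        if (memcmp(data, xz, sizeof(xz)) == 0)
                *ret = SKETCH_MAGIC_XZ;
        else if (memcmp(data, gzip, sizeof(gzip)) == 0)
                *ret = SKETCH_MAGIC_GZIP;
        else if (memcmp(data, bzip2, sizeof(bzip2)) == 0)
                *ret = SKETCH_MAGIC_BZIP2;
        else if (memcmp(data, zstd, sizeof(zstd)) == 0)
                *ret = SKETCH_MAGIC_ZSTD;
        else
                *ret = SKETCH_MAGIC_NONE;

        return 1;
}
```

Because a return value of 0 means "undecided", a caller that hits end of file first has to force the uncompressed path itself, which is what the import code does through the "force off" call when it reads zero bytes.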

@ -1,17 +1,18 @@
/* SPDX-License-Identifier: LGPL-2.1-or-later */ /* SPDX-License-Identifier: LGPL-2.1-or-later */
#include <sys/stat.h> #include <sys/stat.h>
#include <unistd.h>
#include "sd-daemon.h" #include "sd-daemon.h"
#include "sd-event.h" #include "sd-event.h"
#include "alloc-util.h" #include "alloc-util.h"
#include "compress.h"
#include "copy.h" #include "copy.h"
#include "fd-util.h" #include "fd-util.h"
#include "format-util.h" #include "format-util.h"
#include "fs-util.h" #include "fs-util.h"
#include "import-common.h" #include "import-common.h"
#include "import-compress.h"
#include "import-raw.h" #include "import-raw.h"
#include "import-util.h" #include "import-util.h"
#include "install-file.h" #include "install-file.h"
@ -43,11 +44,11 @@ typedef struct RawImport {
int input_fd; int input_fd;
int output_fd; int output_fd;
ImportCompress compress; Compressor *compress;
sd_event_source *input_event_source; sd_event_source *input_event_source;
uint8_t buffer[IMPORT_BUFFER_SIZE]; uint8_t buffer[COMPRESS_PIPE_BUFFER_SIZE];
size_t buffer_size; size_t buffer_size;
uint64_t written_compressed; uint64_t written_compressed;
@ -71,7 +72,7 @@ RawImport* raw_import_unref(RawImport *i) {
unlink_and_free(i->temp_path); unlink_and_free(i->temp_path);
import_compress_free(&i->compress); i->compress = compressor_free(i->compress);
sd_event_unref(i->event); sd_event_unref(i->event);
@ -328,7 +329,7 @@ static int raw_import_try_reflink(RawImport *i) {
assert(i->input_fd >= 0); assert(i->input_fd >= 0);
assert(i->output_fd >= 0); assert(i->output_fd >= 0);
if (i->compress.type != IMPORT_COMPRESS_UNCOMPRESSED) if (compressor_type(i->compress) != COMPRESSION_NONE)
return 0; return 0;
if (i->offset != UINT64_MAX || i->size_max != UINT64_MAX) if (i->offset != UINT64_MAX || i->size_max != UINT64_MAX)
@ -425,13 +426,13 @@ static int raw_import_process(RawImport *i) {
i->buffer_size += l; i->buffer_size += l;
if (i->compress.type == IMPORT_COMPRESS_UNKNOWN) { if (!i->compress) {
if (l == 0) { /* EOF */ if (l == 0) { /* EOF */
log_debug("File too short to be compressed, as no compression signature fits in, thus assuming uncompressed."); log_debug("File too short to be compressed, as no compression signature fits in, thus assuming uncompressed.");
import_uncompress_force_off(&i->compress); decompressor_force_off(&i->compress);
} else { } else {
r = import_uncompress_detect(&i->compress, i->buffer, i->buffer_size); r = decompressor_detect(&i->compress, i->buffer, i->buffer_size);
if (r < 0) { if (r < 0) {
log_error_errno(r, "Failed to detect file compression: %m"); log_error_errno(r, "Failed to detect file compression: %m");
goto finish; goto finish;
@ -451,7 +452,7 @@ static int raw_import_process(RawImport *i) {
goto complete; goto complete;
} }
r = import_uncompress(&i->compress, i->buffer, i->buffer_size, raw_import_write, i); r = decompressor_push(i->compress, i->buffer, i->buffer_size, raw_import_write, i);
if (r < 0) { if (r < 0) {
log_error_errno(r, "Failed to decode and write: %m"); log_error_errno(r, "Failed to decode and write: %m");
goto finish; goto finish;

@ -1,6 +1,7 @@
/* SPDX-License-Identifier: LGPL-2.1-or-later */ /* SPDX-License-Identifier: LGPL-2.1-or-later */
#include <sys/stat.h> #include <sys/stat.h>
#include <unistd.h>
#include "sd-daemon.h" #include "sd-daemon.h"
#include "sd-event.h" #include "sd-event.h"
@ -8,12 +9,12 @@
#include "alloc-util.h" #include "alloc-util.h"
#include "btrfs-util.h" #include "btrfs-util.h"
#include "compress.h"
#include "dissect-image.h" #include "dissect-image.h"
#include "errno-util.h" #include "errno-util.h"
#include "fd-util.h" #include "fd-util.h"
#include "format-util.h" #include "format-util.h"
#include "import-common.h" #include "import-common.h"
#include "import-compress.h"
#include "import-tar.h" #include "import-tar.h"
#include "import-util.h" #include "import-util.h"
#include "install-file.h" #include "install-file.h"
@ -50,11 +51,11 @@ typedef struct TarImport {
int tree_fd; int tree_fd;
int userns_fd; int userns_fd;
ImportCompress compress; Compressor *compress;
sd_event_source *input_event_source; sd_event_source *input_event_source;
uint8_t buffer[IMPORT_BUFFER_SIZE]; uint8_t buffer[COMPRESS_PIPE_BUFFER_SIZE];
size_t buffer_size; size_t buffer_size;
uint64_t written_compressed; uint64_t written_compressed;
@ -81,7 +82,7 @@ TarImport* tar_import_unref(TarImport *i) {
free(i->temp_path); free(i->temp_path);
} }
import_compress_free(&i->compress); i->compress = compressor_free(i->compress);
sd_event_unref(i->event); sd_event_unref(i->event);
@ -344,13 +345,13 @@ static int tar_import_process(TarImport *i) {
i->buffer_size += l; i->buffer_size += l;
if (i->compress.type == IMPORT_COMPRESS_UNKNOWN) { if (!i->compress) {
if (l == 0) { /* EOF */ if (l == 0) { /* EOF */
log_debug("File too short to be compressed, as no compression signature fits in, thus assuming uncompressed."); log_debug("File too short to be compressed, as no compression signature fits in, thus assuming uncompressed.");
import_uncompress_force_off(&i->compress); decompressor_force_off(&i->compress);
} else { } else {
r = import_uncompress_detect(&i->compress, i->buffer, i->buffer_size); r = decompressor_detect(&i->compress, i->buffer, i->buffer_size);
if (r < 0) { if (r < 0) {
log_error_errno(r, "Failed to detect file compression: %m"); log_error_errno(r, "Failed to detect file compression: %m");
goto finish; goto finish;
@ -364,7 +365,7 @@ static int tar_import_process(TarImport *i) {
goto finish; goto finish;
} }
r = import_uncompress(&i->compress, i->buffer, i->buffer_size, tar_import_write, i); r = decompressor_push(i->compress, i->buffer, i->buffer_size, tar_import_write, i);
if (r < 0) { if (r < 0) {
log_error_errno(r, "Failed to decode and write: %m"); log_error_errno(r, "Failed to decode and write: %m");
goto finish; goto finish;

@ -7,11 +7,7 @@ if conf.get('ENABLE_IMPORTD') != 1
endif endif
common_deps = [ common_deps = [
libbzip2,
libcurl, libcurl,
libxz,
libz,
libzstd,
] ]
executables += [ executables += [
@ -25,7 +21,6 @@ executables += [
'extract' : files( 'extract' : files(
'oci-util.c', 'oci-util.c',
'import-common.c', 'import-common.c',
'import-compress.c',
'qcow2-util.c', 'qcow2-util.c',
), ),
'dependencies' : [common_deps, threads], 'dependencies' : [common_deps, threads],

@ -1,5 +1,7 @@
/* SPDX-License-Identifier: LGPL-2.1-or-later */ /* SPDX-License-Identifier: LGPL-2.1-or-later */
#include <unistd.h>
#include "sd-id128.h" #include "sd-id128.h"
#include "alloc-util.h" #include "alloc-util.h"

@ -3,6 +3,7 @@
#include <fcntl.h> #include <fcntl.h>
#include <sys/stat.h> #include <sys/stat.h>
#include <sys/xattr.h> #include <sys/xattr.h>
#include <unistd.h>
#include "alloc-util.h" #include "alloc-util.h"
#include "curl-util.h" #include "curl-util.h"
@ -53,7 +54,7 @@ PullJob* pull_job_unref(PullJob *j) {
curl_glue_remove_and_free(j->glue, j->curl); curl_glue_remove_and_free(j->glue, j->curl);
curl_slist_free_all(j->request_header); curl_slist_free_all(j->request_header);
import_compress_free(&j->compress); j->compress = compressor_free(j->compress);
if (j->checksum_ctx) if (j->checksum_ctx)
EVP_MD_CTX_free(j->checksum_ctx); EVP_MD_CTX_free(j->checksum_ctx);
@ -134,7 +135,7 @@ int pull_job_restart(PullJob *j, const char *new_url) {
curl_glue_remove_and_free(j->glue, j->curl); curl_glue_remove_and_free(j->glue, j->curl);
j->curl = NULL; j->curl = NULL;
import_compress_free(&j->compress); j->compress = compressor_free(j->compress);
if (j->checksum_ctx) { if (j->checksum_ctx) {
EVP_MD_CTX_free(j->checksum_ctx); EVP_MD_CTX_free(j->checksum_ctx);
@ -453,7 +454,7 @@ static int pull_job_write_compressed(PullJob *j, const struct iovec *data) {
"Could not hash chunk."); "Could not hash chunk.");
} }
r = import_uncompress(&j->compress, data->iov_base, data->iov_len, pull_job_write_uncompressed, j); r = decompressor_push(j->compress, data->iov_base, data->iov_len, pull_job_write_uncompressed, j);
if (r < 0) if (r < 0)
return r; return r;
@ -502,13 +503,13 @@ static int pull_job_detect_compression(PullJob *j) {
assert(j); assert(j);
r = import_uncompress_detect(&j->compress, j->payload.iov_base, j->payload.iov_len); r = decompressor_detect(&j->compress, j->payload.iov_base, j->payload.iov_len);
if (r < 0) if (r < 0)
return log_error_errno(r, "Failed to initialize compressor: %m"); return log_error_errno(r, "Failed to initialize compressor: %m");
if (r == 0) if (r == 0)
return 0; return 0;
log_debug("Stream is compressed: %s", import_compress_type_to_string(j->compress.type)); log_debug("Stream is compressed: %s", compression_to_string(compressor_type(j->compress)));
r = pull_job_open_disk(j); r = pull_job_open_disk(j);
if (r < 0) if (r < 0)


@ -4,9 +4,9 @@
#include <curl/curl.h> #include <curl/curl.h>
#include <sys/stat.h> #include <sys/stat.h>
#include "shared-forward.h" #include "compress.h"
#include "import-compress.h"
#include "openssl-util.h" #include "openssl-util.h"
#include "shared-forward.h"
typedef struct CurlGlue CurlGlue; typedef struct CurlGlue CurlGlue;
typedef struct PullJob PullJob; typedef struct PullJob PullJob;
@ -73,7 +73,7 @@ typedef struct PullJob {
usec_t mtime; usec_t mtime;
char *content_type; char *content_type;
ImportCompress compress; Compressor *compress;
unsigned progress_percent; unsigned progress_percent;
usec_t start_usec; usec_t start_usec;


@ -1,5 +1,7 @@
/* SPDX-License-Identifier: LGPL-2.1-or-later */ /* SPDX-License-Identifier: LGPL-2.1-or-later */
#include <unistd.h>
#include "sd-event.h" #include "sd-event.h"
#include "sd-json.h" #include "sd-json.h"
#include "sd-varlink.h" #include "sd-varlink.h"


@ -1,5 +1,7 @@
/* SPDX-License-Identifier: LGPL-2.1-or-later */ /* SPDX-License-Identifier: LGPL-2.1-or-later */
#include <unistd.h>
#include "sd-daemon.h" #include "sd-daemon.h"
#include "sd-event.h" #include "sd-event.h"


@ -1,5 +1,7 @@
/* SPDX-License-Identifier: LGPL-2.1-or-later */ /* SPDX-License-Identifier: LGPL-2.1-or-later */
#include <unistd.h>
#include "sd-daemon.h" #include "sd-daemon.h"
#include "sd-event.h" #include "sd-event.h"
#include "sd-varlink.h" #include "sd-varlink.h"


@ -1,8 +1,9 @@
/* SPDX-License-Identifier: LGPL-2.1-or-later */ /* SPDX-License-Identifier: LGPL-2.1-or-later */
#include <zlib.h> #include <unistd.h>
#include "alloc-util.h" #include "alloc-util.h"
#include "compress.h"
#include "copy.h" #include "copy.h"
#include "qcow2-util.h" #include "qcow2-util.h"
#include "sparse-endian.h" #include "sparse-endian.h"
@ -97,8 +98,6 @@ static int decompress_cluster(
void *buffer2) { void *buffer2) {
_cleanup_free_ void *large_buffer = NULL; _cleanup_free_ void *large_buffer = NULL;
z_stream s = {};
uint64_t sz;
ssize_t l; ssize_t l;
int r; int r;
@ -119,20 +118,9 @@ static int decompress_cluster(
if ((uint64_t) l != compressed_size) if ((uint64_t) l != compressed_size)
return -EIO; return -EIO;
s.next_in = buffer1; r = decompress_zlib_raw(buffer1, compressed_size, buffer2, cluster_size, /* wbits= */ -12);
s.avail_in = compressed_size; if (r < 0)
s.next_out = buffer2; return r;
s.avail_out = cluster_size;
r = inflateInit2(&s, -12);
if (r != Z_OK)
return -EIO;
r = inflate(&s, Z_FINISH);
sz = (uint8_t*) s.next_out - (uint8_t*) buffer2;
inflateEnd(&s);
if (r != Z_STREAM_END || sz != cluster_size)
return -EIO;
l = pwrite(dfd, buffer2, cluster_size, doffset); l = pwrite(dfd, buffer2, cluster_size, doffset);
if (l < 0) if (l < 0)


@ -130,7 +130,7 @@ int config_parse_compression(
} }
} }
Compression c = compression_lowercase_from_string(word); Compression c = compression_from_string_harder(word);
if (c <= 0 || !compression_supported(c)) { if (c <= 0 || !compression_supported(c)) {
log_syntax(unit, LOG_WARNING, filename, line, c, log_syntax(unit, LOG_WARNING, filename, line, c,
"Compression algorithm '%s' is not supported on the system, ignoring.", word); "Compression algorithm '%s' is not supported on the system, ignoring.", word);


@ -209,7 +209,7 @@ static int build_accept_encoding(char **ret) {
const CompressionConfig *cc; const CompressionConfig *cc;
ORDERED_HASHMAP_FOREACH(cc, arg_compression) { ORDERED_HASHMAP_FOREACH(cc, arg_compression) {
const char *c = compression_lowercase_to_string(cc->algorithm); const char *c = compression_to_string(cc->algorithm);
if (strextendf_with_separator(&buf, ",", "%s;q=%.1f", c, q) < 0) if (strextendf_with_separator(&buf, ",", "%s;q=%.1f", c, q) < 0)
return -ENOMEM; return -ENOMEM;
q -= step; q -= step;
@ -361,7 +361,7 @@ static mhd_result request_handler(
RemoteSource *source = *connection_cls; RemoteSource *source = *connection_cls;
header = MHD_lookup_connection_value(connection, MHD_HEADER_KIND, "Content-Encoding"); header = MHD_lookup_connection_value(connection, MHD_HEADER_KIND, "Content-Encoding");
if (header) { if (header) {
Compression c = compression_lowercase_from_string(header); Compression c = compression_from_string_harder(header);
if (c <= 0 || !compression_supported(c)) if (c <= 0 || !compression_supported(c))
return mhd_respondf(connection, 0, MHD_HTTP_UNSUPPORTED_MEDIA_TYPE, return mhd_respondf(connection, 0, MHD_HTTP_UNSUPPORTED_MEDIA_TYPE,
"Unsupported Content-Encoding type: %s", header); "Unsupported Content-Encoding type: %s", header);


@ -354,7 +354,7 @@ static size_t journal_input_callback(void *buf, size_t size, size_t nmemb, void
r = compress_blob(u->compression->algorithm, compression_buffer, filled, buf, size * nmemb, &compressed_size, u->compression->level); r = compress_blob(u->compression->algorithm, compression_buffer, filled, buf, size * nmemb, &compressed_size, u->compression->level);
if (r < 0) { if (r < 0) {
log_error_errno(r, "Failed to compress %zu bytes by %s with level %i: %m", log_error_errno(r, "Failed to compress %zu bytes by %s with level %i: %m",
filled, compression_lowercase_to_string(u->compression->algorithm), u->compression->level); filled, compression_to_string(u->compression->algorithm), u->compression->level);
return CURL_READFUNC_ABORT; return CURL_READFUNC_ABORT;
} }


@ -218,7 +218,7 @@ int start_upload(Uploader *u,
h = l; h = l;
if (u->compression) { if (u->compression) {
_cleanup_free_ char *header = strjoin("Content-Encoding: ", compression_lowercase_to_string(u->compression->algorithm)); _cleanup_free_ char *header = strjoin("Content-Encoding: ", compression_to_string(u->compression->algorithm));
if (!header) if (!header)
return log_oom(); return log_oom();
@ -369,7 +369,7 @@ static size_t fd_input_callback(void *buf, size_t size, size_t nmemb, void *user
r = compress_blob(u->compression->algorithm, compression_buffer, n, buf, size * nmemb, &compressed_size, u->compression->level); r = compress_blob(u->compression->algorithm, compression_buffer, n, buf, size * nmemb, &compressed_size, u->compression->level);
if (r < 0) { if (r < 0) {
log_error_errno(r, "Failed to compress %zd bytes by %s with level %i: %m", log_error_errno(r, "Failed to compress %zd bytes by %s with level %i: %m",
n, compression_lowercase_to_string(u->compression->algorithm), u->compression->level); n, compression_to_string(u->compression->algorithm), u->compression->level);
return CURL_READFUNC_ABORT; return CURL_READFUNC_ABORT;
} }
assert(compressed_size <= size * nmemb); assert(compressed_size <= size * nmemb);
@ -528,7 +528,7 @@ static int update_content_encoding_header(Uploader *u, const CompressionConfig *
return 0; /* Already picked the algorithm. Let's shortcut. */ return 0; /* Already picked the algorithm. Let's shortcut. */
if (cc) { if (cc) {
_cleanup_free_ char *header = strjoin("Content-Encoding: ", compression_lowercase_to_string(cc->algorithm)); _cleanup_free_ char *header = strjoin("Content-Encoding: ", compression_to_string(cc->algorithm));
if (!header) if (!header)
return log_oom(); return log_oom();
@ -572,7 +572,7 @@ static int update_content_encoding_header(Uploader *u, const CompressionConfig *
u->compression = cc; u->compression = cc;
if (cc) if (cc)
log_debug("Using compression algorithm %s with compression level %i.", compression_lowercase_to_string(cc->algorithm), cc->level); log_debug("Using compression algorithm %s with compression level %i.", compression_to_string(cc->algorithm), cc->level);
else else
log_debug("Disabled compression algorithm."); log_debug("Disabled compression algorithm.");
return 0; return 0;
@ -610,7 +610,7 @@ static int parse_accept_encoding_header(Uploader *u) {
if (streq(word, "*")) if (streq(word, "*"))
return update_content_encoding_header(u, ordered_hashmap_first(arg_compression)); return update_content_encoding_header(u, ordered_hashmap_first(arg_compression));
Compression c = compression_lowercase_from_string(word); Compression c = compression_from_string_harder(word);
if (c <= 0 || !compression_supported(c)) if (c <= 0 || !compression_supported(c))
continue; /* unsupported or invalid algorithm. */ continue; /* unsupported or invalid algorithm. */
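
The hunks above switch the HTTP compression negotiation to `compression_to_string()` and `compression_from_string_harder()`. As a rough illustration of the header that `build_accept_encoding()` assembles with its `"%s;q=%.1f"` format and `","` separator, here is a minimal standalone sketch. The helper name, the starting weight and the step are assumptions made for this example; the diff does not show how `q` and `step` are initialized.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of systemd: writes "name;q=X.Y" entries into
 * buf, separated by commas, lowering q by `step` for each successive name.
 * Returns 0 on success and -1 if the result does not fit into buf. */
static int build_accept_encoding_sketch(
                const char *const *names, size_t n_names,
                double q, double step,
                char *buf, size_t buf_size) {

        size_t used = 0;

        assert(names);
        assert(buf);

        if (buf_size == 0)
                return -1;
        buf[0] = '\0';

        for (size_t i = 0; i < n_names; i++) {
                int k = snprintf(buf + used, buf_size - used, "%s%s;q=%.1f",
                                 i > 0 ? "," : "", names[i], q);
                if (k < 0 || (size_t) k >= buf_size - used)
                        return -1; /* output would have been truncated */

                used += (size_t) k;
                q -= step;
        }

        return 0;
}
```

Called with the names `zstd`, `xz`, `lz4`, a starting weight of 1.0 and a step of 0.1 (all assumed values), this should produce a list in which earlier algorithms carry a higher preference, which is the ordering the receiving side can then walk when it picks a `Content-Encoding`.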


@ -1096,4 +1096,10 @@ global:
sd_varlink_call_and_upgrade; sd_varlink_call_and_upgrade;
sd_varlink_reply_and_upgrade; sd_varlink_reply_and_upgrade;
sd_varlink_set_sentinel; sd_varlink_set_sentinel;
sd_event_add_cpu_pressure;
sd_event_source_set_cpu_pressure_type;
sd_event_source_set_cpu_pressure_period;
sd_event_add_io_pressure;
sd_event_source_set_io_pressure_type;
sd_event_source_set_io_pressure_period;
} LIBSYSTEMD_260; } LIBSYSTEMD_260;


@ -26,6 +26,8 @@ typedef enum EventSourceType {
SOURCE_WATCHDOG, SOURCE_WATCHDOG,
SOURCE_INOTIFY, SOURCE_INOTIFY,
SOURCE_MEMORY_PRESSURE, SOURCE_MEMORY_PRESSURE,
SOURCE_CPU_PRESSURE,
SOURCE_IO_PRESSURE,
_SOURCE_EVENT_SOURCE_TYPE_MAX, _SOURCE_EVENT_SOURCE_TYPE_MAX,
_SOURCE_EVENT_SOURCE_TYPE_INVALID = -EINVAL, _SOURCE_EVENT_SOURCE_TYPE_INVALID = -EINVAL,
} EventSourceType; } EventSourceType;
@ -144,7 +146,7 @@ struct sd_event_source {
size_t write_buffer_size; size_t write_buffer_size;
uint32_t events, revents; uint32_t events, revents;
LIST_FIELDS(sd_event_source, write_list); LIST_FIELDS(sd_event_source, write_list);
} memory_pressure; } pressure;
}; };
}; };


@ -76,6 +76,8 @@ static const char* const event_source_type_table[_SOURCE_EVENT_SOURCE_TYPE_MAX]
[SOURCE_WATCHDOG] = "watchdog", [SOURCE_WATCHDOG] = "watchdog",
[SOURCE_INOTIFY] = "inotify", [SOURCE_INOTIFY] = "inotify",
[SOURCE_MEMORY_PRESSURE] = "memory-pressure", [SOURCE_MEMORY_PRESSURE] = "memory-pressure",
[SOURCE_CPU_PRESSURE] = "cpu-pressure",
[SOURCE_IO_PRESSURE] = "io-pressure",
}; };
DEFINE_PRIVATE_STRING_TABLE_LOOKUP_TO_STRING(event_source_type, int); DEFINE_PRIVATE_STRING_TABLE_LOOKUP_TO_STRING(event_source_type, int);
@ -99,7 +101,9 @@ DEFINE_PRIVATE_STRING_TABLE_LOOKUP_TO_STRING(event_source_type, int);
SOURCE_SIGNAL, \ SOURCE_SIGNAL, \
SOURCE_DEFER, \ SOURCE_DEFER, \
SOURCE_INOTIFY, \ SOURCE_INOTIFY, \
SOURCE_MEMORY_PRESSURE) SOURCE_MEMORY_PRESSURE, \
SOURCE_CPU_PRESSURE, \
SOURCE_IO_PRESSURE)
/* This is used to assert that we didn't pass an unexpected source type to event_source_time_prioq_put(). /* This is used to assert that we didn't pass an unexpected source type to event_source_time_prioq_put().
* Time sources and ratelimited sources can be passed, so effectively this is the same as the * Time sources and ratelimited sources can be passed, so effectively this is the same as the
@ -144,8 +148,8 @@ struct sd_event {
/* A list of inotify objects that already have events buffered which aren't processed yet */ /* A list of inotify objects that already have events buffered which aren't processed yet */
LIST_HEAD(InotifyData, buffered_inotify_data_list); LIST_HEAD(InotifyData, buffered_inotify_data_list);
/* A list of memory pressure event sources that still need their subscription string written */ /* A list of pressure event sources that still need their subscription string written */
LIST_HEAD(sd_event_source, memory_pressure_write_list); LIST_HEAD(sd_event_source, pressure_write_list);
uint64_t origin_id; uint64_t origin_id;
@ -564,63 +568,65 @@ static int source_child_pidfd_register(sd_event_source *s, int enabled) {
return 0; return 0;
} }
static void source_memory_pressure_unregister(sd_event_source *s) { #define EVENT_SOURCE_IS_PRESSURE(s) IN_SET((s)->type, SOURCE_MEMORY_PRESSURE, SOURCE_CPU_PRESSURE, SOURCE_IO_PRESSURE)
static void source_pressure_unregister(sd_event_source *s) {
assert(s); assert(s);
assert(s->type == SOURCE_MEMORY_PRESSURE); assert(EVENT_SOURCE_IS_PRESSURE(s));
if (event_origin_changed(s->event)) if (event_origin_changed(s->event))
return; return;
if (!s->memory_pressure.registered) if (!s->pressure.registered)
return; return;
if (epoll_ctl(s->event->epoll_fd, EPOLL_CTL_DEL, s->memory_pressure.fd, NULL) < 0) if (epoll_ctl(s->event->epoll_fd, EPOLL_CTL_DEL, s->pressure.fd, NULL) < 0)
log_debug_errno(errno, "Failed to remove source %s (type %s) from epoll, ignoring: %m", log_debug_errno(errno, "Failed to remove source %s (type %s) from epoll, ignoring: %m",
strna(s->description), event_source_type_to_string(s->type)); strna(s->description), event_source_type_to_string(s->type));
s->memory_pressure.registered = false; s->pressure.registered = false;
} }
static int source_memory_pressure_register(sd_event_source *s, int enabled) { static int source_pressure_register(sd_event_source *s, int enabled) {
assert(s); assert(s);
assert(s->type == SOURCE_MEMORY_PRESSURE); assert(EVENT_SOURCE_IS_PRESSURE(s));
assert(enabled != SD_EVENT_OFF); assert(enabled != SD_EVENT_OFF);
struct epoll_event ev = { struct epoll_event ev = {
.events = s->memory_pressure.write_buffer_size > 0 ? EPOLLOUT : .events = s->pressure.write_buffer_size > 0 ? EPOLLOUT :
(s->memory_pressure.events | (enabled == SD_EVENT_ONESHOT ? EPOLLONESHOT : 0)), (s->pressure.events | (enabled == SD_EVENT_ONESHOT ? EPOLLONESHOT : 0)),
.data.ptr = s, .data.ptr = s,
}; };
if (epoll_ctl(s->event->epoll_fd, if (epoll_ctl(s->event->epoll_fd,
s->memory_pressure.registered ? EPOLL_CTL_MOD : EPOLL_CTL_ADD, s->pressure.registered ? EPOLL_CTL_MOD : EPOLL_CTL_ADD,
s->memory_pressure.fd, &ev) < 0) s->pressure.fd, &ev) < 0)
return -errno; return -errno;
s->memory_pressure.registered = true; s->pressure.registered = true;
return 0; return 0;
} }
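
The `.events` expression in `source_pressure_register()` above encodes a small rule: while trigger bytes are still waiting to be written, only writability matters; afterwards the source listens for its own event mask, optionally as a one-shot registration. A minimal sketch of that selection, pulled out into a hypothetical helper (the function name and parameter list are assumptions for illustration, not code from the patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/epoll.h>

/* Hypothetical helper mirroring the mask selection in the hunk above:
 * - pending_write_size > 0: wait for EPOLLOUT only, so the trigger string
 *   can be written first;
 * - otherwise: wait for the caller's events, adding EPOLLONESHOT when the
 *   source is enabled in one-shot mode. */
static uint32_t pressure_epoll_mask(size_t pending_write_size, uint32_t events, bool oneshot) {
        if (pending_write_size > 0)
                return EPOLLOUT;

        return events | (oneshot ? EPOLLONESHOT : 0);
}
```

Keeping the two phases in one expression is what lets the same function serve both `EPOLL_CTL_ADD` and the later `EPOLL_CTL_MOD` once the write buffer has been drained.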
static void source_memory_pressure_add_to_write_list(sd_event_source *s) { static void source_pressure_add_to_write_list(sd_event_source *s) {
assert(s); assert(s);
assert(s->type == SOURCE_MEMORY_PRESSURE); assert(EVENT_SOURCE_IS_PRESSURE(s));
if (s->memory_pressure.in_write_list) if (s->pressure.in_write_list)
return; return;
LIST_PREPEND(memory_pressure.write_list, s->event->memory_pressure_write_list, s); LIST_PREPEND(pressure.write_list, s->event->pressure_write_list, s);
s->memory_pressure.in_write_list = true; s->pressure.in_write_list = true;
} }
static void source_memory_pressure_remove_from_write_list(sd_event_source *s) { static void source_pressure_remove_from_write_list(sd_event_source *s) {
assert(s); assert(s);
assert(s->type == SOURCE_MEMORY_PRESSURE); assert(EVENT_SOURCE_IS_PRESSURE(s));
if (!s->memory_pressure.in_write_list) if (!s->pressure.in_write_list)
return; return;
LIST_REMOVE(memory_pressure.write_list, s->event->memory_pressure_write_list, s); LIST_REMOVE(pressure.write_list, s->event->pressure_write_list, s);
s->memory_pressure.in_write_list = false; s->pressure.in_write_list = false;
} }
static clockid_t event_source_type_to_clock(EventSourceType t) { static clockid_t event_source_type_to_clock(EventSourceType t) {
@ -1047,8 +1053,10 @@ static void source_disconnect(sd_event_source *s) {
} }
case SOURCE_MEMORY_PRESSURE: case SOURCE_MEMORY_PRESSURE:
source_memory_pressure_remove_from_write_list(s); case SOURCE_CPU_PRESSURE:
source_memory_pressure_unregister(s); case SOURCE_IO_PRESSURE:
source_pressure_remove_from_write_list(s);
source_pressure_unregister(s);
break; break;
default: default:
@ -1111,9 +1119,9 @@ static sd_event_source* source_free(sd_event_source *s) {
s->child.pidfd = safe_close(s->child.pidfd); s->child.pidfd = safe_close(s->child.pidfd);
} }
if (s->type == SOURCE_MEMORY_PRESSURE) { if (EVENT_SOURCE_IS_PRESSURE(s)) {
s->memory_pressure.fd = safe_close(s->memory_pressure.fd); s->pressure.fd = safe_close(s->pressure.fd);
s->memory_pressure.write_buffer = mfree(s->memory_pressure.write_buffer); s->pressure.write_buffer = mfree(s->pressure.write_buffer);
} }
if (s->destroy_callback) if (s->destroy_callback)
@ -1191,7 +1199,9 @@ static sd_event_source* source_new(sd_event *e, bool floating, EventSourceType t
[SOURCE_POST] = endoffsetof_field(sd_event_source, post), [SOURCE_POST] = endoffsetof_field(sd_event_source, post),
[SOURCE_EXIT] = endoffsetof_field(sd_event_source, exit), [SOURCE_EXIT] = endoffsetof_field(sd_event_source, exit),
[SOURCE_INOTIFY] = endoffsetof_field(sd_event_source, inotify), [SOURCE_INOTIFY] = endoffsetof_field(sd_event_source, inotify),
[SOURCE_MEMORY_PRESSURE] = endoffsetof_field(sd_event_source, memory_pressure), [SOURCE_MEMORY_PRESSURE] = endoffsetof_field(sd_event_source, pressure),
[SOURCE_CPU_PRESSURE] = endoffsetof_field(sd_event_source, pressure),
[SOURCE_IO_PRESSURE] = endoffsetof_field(sd_event_source, pressure),
}; };
sd_event_source *s; sd_event_source *s;
@ -1917,17 +1927,21 @@ static int memory_pressure_callback(sd_event_source *s, void *userdata) {
return 0; return 0;
} }
_public_ int sd_event_add_memory_pressure( static int event_add_pressure(
sd_event *e, sd_event *e,
sd_event_source **ret, sd_event_source **ret,
sd_event_handler_t callback, sd_event_handler_t callback,
void *userdata) { void *userdata,
EventSourceType type,
sd_event_handler_t default_callback,
PressureResource resource) {
_cleanup_free_ char *w = NULL; _cleanup_free_ char *w = NULL;
_cleanup_(source_freep) sd_event_source *s = NULL; _cleanup_(source_freep) sd_event_source *s = NULL;
_cleanup_close_ int path_fd = -EBADF, fd = -EBADF; _cleanup_close_ int path_fd = -EBADF, fd = -EBADF;
_cleanup_free_ void *write_buffer = NULL; _cleanup_free_ void *write_buffer = NULL;
const char *watch, *watch_fallback = NULL, *env; _cleanup_free_ char *watch_fallback = NULL;
const char *watch, *env;
size_t write_buffer_size = 0; size_t write_buffer_size = 0;
struct stat st; struct stat st;
uint32_t events; uint32_t events;
@ -1939,32 +1953,34 @@ _public_ int sd_event_add_memory_pressure(
assert_return(e->state != SD_EVENT_FINISHED, -ESTALE); assert_return(e->state != SD_EVENT_FINISHED, -ESTALE);
assert_return(!event_origin_changed(e), -ECHILD); assert_return(!event_origin_changed(e), -ECHILD);
if (!callback) const PressureResourceInfo *info = pressure_resource_get_info(resource);
callback = memory_pressure_callback;
s = source_new(e, !ret, SOURCE_MEMORY_PRESSURE); if (!callback)
callback = default_callback;
s = source_new(e, !ret, type);
if (!s) if (!s)
return -ENOMEM; return -ENOMEM;
s->wakeup = WAKEUP_EVENT_SOURCE; s->wakeup = WAKEUP_EVENT_SOURCE;
s->memory_pressure.callback = callback; s->pressure.callback = callback;
s->userdata = userdata; s->userdata = userdata;
s->enabled = SD_EVENT_ON; s->enabled = SD_EVENT_ON;
s->memory_pressure.fd = -EBADF; s->pressure.fd = -EBADF;
env = secure_getenv("MEMORY_PRESSURE_WATCH"); env = secure_getenv(info->env_watch);
if (env) { if (env) {
if (isempty(env) || path_equal(env, "/dev/null")) if (isempty(env) || path_equal(env, "/dev/null"))
return log_debug_errno(SYNTHETIC_ERRNO(EHOSTDOWN), return log_debug_errno(SYNTHETIC_ERRNO(EHOSTDOWN),
"Memory pressure logic is explicitly disabled via $MEMORY_PRESSURE_WATCH."); "Pressure logic is explicitly disabled via $%s.", info->env_watch);
if (!path_is_absolute(env) || !path_is_normalized(env)) if (!path_is_absolute(env) || !path_is_normalized(env))
return log_debug_errno(SYNTHETIC_ERRNO(EBADMSG), return log_debug_errno(SYNTHETIC_ERRNO(EBADMSG),
"$MEMORY_PRESSURE_WATCH set to invalid path: %s", env); "$%s set to invalid path: %s", info->env_watch, env);
watch = env; watch = env;
env = secure_getenv("MEMORY_PRESSURE_WRITE"); env = secure_getenv(info->env_write);
if (env) { if (env) {
r = unbase64mem(env, &write_buffer, &write_buffer_size); r = unbase64mem(env, &write_buffer, &write_buffer_size);
if (r < 0) if (r < 0)
@ -1980,8 +1996,8 @@ _public_ int sd_event_add_memory_pressure(
if (r == 0) if (r == 0)
return -EOPNOTSUPP; return -EOPNOTSUPP;
/* By default we want to watch memory pressure on the local cgroup, but we'll fall back on /* By default we want to watch pressure on the local cgroup, but we'll fall back on
* the system wide pressure if for some reason we cannot (which could be: memory controller * the system wide pressure if for some reason we cannot (which could be: controller
* not delegated to us, or PSI simply not available in the kernel). */ * not delegated to us, or PSI simply not available in the kernel). */
_cleanup_free_ char *cg = NULL; _cleanup_free_ char *cg = NULL;
@ -1989,12 +2005,19 @@ _public_ int sd_event_add_memory_pressure(
if (r < 0) if (r < 0)
return r; return r;
w = path_join("/sys/fs/cgroup", cg, "memory.pressure"); _cleanup_free_ char *cgroup_file = strjoin(info->name, ".pressure");
if (!cgroup_file)
return -ENOMEM;
w = path_join("/sys/fs/cgroup", cg, cgroup_file);
if (!w) if (!w)
return -ENOMEM; return -ENOMEM;
watch = w; watch = w;
watch_fallback = "/proc/pressure/memory";
watch_fallback = strjoin("/proc/pressure/", info->name);
if (!watch_fallback)
return -ENOMEM;
/* Android uses three levels in its userspace low memory killer logic: /* Android uses three levels in its userspace low memory killer logic:
* some 70000 1000000 * some 70000 1000000
@ -2011,9 +2034,9 @@ _public_ int sd_event_add_memory_pressure(
* kernel will allow us to do unprivileged, also in the future. */ * kernel will allow us to do unprivileged, also in the future. */
if (asprintf((char**) &write_buffer, if (asprintf((char**) &write_buffer,
"%s " USEC_FMT " " USEC_FMT, "%s " USEC_FMT " " USEC_FMT,
MEMORY_PRESSURE_DEFAULT_TYPE, PRESSURE_DEFAULT_TYPE,
MEMORY_PRESSURE_DEFAULT_THRESHOLD_USEC, PRESSURE_DEFAULT_THRESHOLD_USEC,
MEMORY_PRESSURE_DEFAULT_WINDOW_USEC) < 0) PRESSURE_DEFAULT_WINDOW_USEC) < 0)
return -ENOMEM; return -ENOMEM;
write_buffer_size = strlen(write_buffer) + 1; write_buffer_size = strlen(write_buffer) + 1;
@ -2080,24 +2103,24 @@ _public_ int sd_event_add_memory_pressure(
else else
return -EBADF; return -EBADF;
s->memory_pressure.fd = TAKE_FD(fd); s->pressure.fd = TAKE_FD(fd);
s->memory_pressure.write_buffer = TAKE_PTR(write_buffer); s->pressure.write_buffer = TAKE_PTR(write_buffer);
s->memory_pressure.write_buffer_size = write_buffer_size; s->pressure.write_buffer_size = write_buffer_size;
s->memory_pressure.events = events; s->pressure.events = events;
s->memory_pressure.locked = locked; s->pressure.locked = locked;
/* So here's the thing: if we are talking to PSI we need to write the watch string before adding the /* So here's the thing: if we are talking to PSI we need to write the watch string before adding the
* fd to epoll (if we ignore this, then the watch won't work). Hence we'll not actually register the * fd to epoll (if we ignore this, then the watch won't work). Hence we'll not actually register the
* fd with the epoll right-away. Instead, we just add the event source to a list of memory pressure * fd with the epoll right-away. Instead, we just add the event source to a list of pressure event
* event sources on which writes must be executed before the first event loop iteration is * sources on which writes must be executed before the first event loop iteration is executed. (We
* executed. (We could also write the data here, right away, but we want to give the caller the * could also write the data here, right away, but we want to give the caller the freedom to call
* freedom to call sd_event_source_set_memory_pressure_type() and * sd_event_source_set_{memory,cpu,io}_pressure_type() and
* sd_event_source_set_memory_pressure_rate() before we write it. */ * sd_event_source_set_{memory,cpu,io}_pressure_period() before we write it. */
if (s->memory_pressure.write_buffer_size > 0) if (s->pressure.write_buffer_size > 0)
source_memory_pressure_add_to_write_list(s); source_pressure_add_to_write_list(s);
else { else {
r = source_memory_pressure_register(s, s->enabled); r = source_pressure_register(s, s->enabled);
if (r < 0) if (r < 0)
return r; return r;
} }
@ -2109,6 +2132,57 @@ _public_ int sd_event_add_memory_pressure(
return 0; return 0;
} }
_public_ int sd_event_add_memory_pressure(
sd_event *e,
sd_event_source **ret,
sd_event_handler_t callback,
void *userdata) {
return event_add_pressure(
e, ret, callback, userdata,
SOURCE_MEMORY_PRESSURE,
memory_pressure_callback,
PRESSURE_MEMORY);
}
static int cpu_pressure_callback(sd_event_source *s, void *userdata) {
assert(s);
return 0;
}
_public_ int sd_event_add_cpu_pressure(
sd_event *e,
sd_event_source **ret,
sd_event_handler_t callback,
void *userdata) {
return event_add_pressure(
e, ret, callback, userdata,
SOURCE_CPU_PRESSURE,
cpu_pressure_callback,
PRESSURE_CPU);
}
static int io_pressure_callback(sd_event_source *s, void *userdata) {
assert(s);
return 0;
}
_public_ int sd_event_add_io_pressure(
sd_event *e,
sd_event_source **ret,
sd_event_handler_t callback,
void *userdata) {
return event_add_pressure(
e, ret, callback, userdata,
SOURCE_IO_PRESSURE,
io_pressure_callback,
PRESSURE_IO);
}
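
With the three public wrappers in place, the only per-resource differences inside `event_add_pressure()` are the environment variable names, the cgroup file `<name>.pressure`, the system-wide fallback `/proc/pressure/<name>`, and the trigger string written to the fd. The following sketch shows how those strings are composed, using plain `snprintf()` instead of systemd's `path_join()`, `strjoin()` and `asprintf()`. All helper names are hypothetical, and the sketch assumes `cgroup` is an absolute, non-root cgroup path.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helpers, not systemd API. `name` is "memory", "cpu" or "io".
 * Each returns 0 on success and -1 if the result does not fit into buf. */

/* Per-cgroup PSI file, e.g. /sys/fs/cgroup/system.slice/foo.service/io.pressure.
 * Assumes cgroup starts with "/" and is not the root cgroup. */
static int psi_cgroup_path(const char *cgroup, const char *name, char *buf, size_t size) {
        assert(cgroup && name && buf);

        int k = snprintf(buf, size, "/sys/fs/cgroup%s/%s.pressure", cgroup, name);
        return k < 0 || (size_t) k >= size ? -1 : 0;
}

/* System-wide fallback, e.g. /proc/pressure/io */
static int psi_system_path(const char *name, char *buf, size_t size) {
        assert(name && buf);

        int k = snprintf(buf, size, "/proc/pressure/%s", name);
        return k < 0 || (size_t) k >= size ? -1 : 0;
}

/* PSI trigger string: "<some|full> <threshold in usec> <window in usec>" */
static int psi_trigger(const char *type,
                       unsigned long long threshold_usec,
                       unsigned long long window_usec,
                       char *buf, size_t size) {
        assert(type && buf);

        int k = snprintf(buf, size, "%s %llu %llu", type, threshold_usec, window_usec);
        return k < 0 || (size_t) k >= size ? -1 : 0;
}
```

For instance, `psi_trigger("some", 70000, 1000000, ...)` yields the first of the Android levels quoted in the code comment above. Note that the patched code sets `write_buffer_size` to `strlen(write_buffer) + 1`, so the terminating NUL byte is written along with the trigger.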
static void event_free_inotify_data(sd_event *e, InotifyData *d) { static void event_free_inotify_data(sd_event *e, InotifyData *d) {
assert(e); assert(e);
@ -2910,7 +2984,9 @@ static int event_source_offline(
break; break;
case SOURCE_MEMORY_PRESSURE: case SOURCE_MEMORY_PRESSURE:
source_memory_pressure_unregister(s); case SOURCE_CPU_PRESSURE:
case SOURCE_IO_PRESSURE:
source_pressure_unregister(s);
break; break;
case SOURCE_TIME_REALTIME: case SOURCE_TIME_REALTIME:
@ -3001,10 +3077,12 @@ static int event_source_online(
break; break;
case SOURCE_MEMORY_PRESSURE: case SOURCE_MEMORY_PRESSURE:
/* As documented in sd_event_add_memory_pressure(), we can only register the PSI fd with case SOURCE_CPU_PRESSURE:
* epoll after writing the watch string. */ case SOURCE_IO_PRESSURE:
if (s->memory_pressure.write_buffer_size == 0) { /* As documented in sd_event_add_{memory,cpu,io}_pressure(), we can only register the PSI fd
r = source_memory_pressure_register(s, enabled); * with epoll after writing the watch string. */
if (s->pressure.write_buffer_size == 0) {
r = source_pressure_register(s, enabled);
if (r < 0) if (r < 0)
return r; return r;
} }
@ -3986,30 +4064,30 @@ static int process_inotify(sd_event *e) {
return done; return done;
} }
static int process_memory_pressure(sd_event_source *s, uint32_t revents) { static int process_pressure(sd_event_source *s, uint32_t revents) {
assert(s); assert(s);
assert(s->type == SOURCE_MEMORY_PRESSURE); assert(EVENT_SOURCE_IS_PRESSURE(s));
if (s->pending) if (s->pending)
s->memory_pressure.revents |= revents; s->pressure.revents |= revents;
else else
s->memory_pressure.revents = revents; s->pressure.revents = revents;
return source_set_pending(s, true); return source_set_pending(s, true);
} }
static int source_memory_pressure_write(sd_event_source *s) { static int source_pressure_write(sd_event_source *s) {
ssize_t n; ssize_t n;
int r; int r;
assert(s); assert(s);
assert(s->type == SOURCE_MEMORY_PRESSURE); assert(EVENT_SOURCE_IS_PRESSURE(s));
/* once we start writing, the buffer is locked, we allow no further changes. */ /* once we start writing, the buffer is locked, we allow no further changes. */
s->memory_pressure.locked = true; s->pressure.locked = true;
if (s->memory_pressure.write_buffer_size > 0) { if (s->pressure.write_buffer_size > 0) {
n = write(s->memory_pressure.fd, s->memory_pressure.write_buffer, s->memory_pressure.write_buffer_size); n = write(s->pressure.fd, s->pressure.write_buffer, s->pressure.write_buffer_size);
if (n < 0) { if (n < 0) {
if (!ERRNO_IS_TRANSIENT(errno)) { if (!ERRNO_IS_TRANSIENT(errno)) {
/* If kernel is built with CONFIG_PSI_DEFAULT_DISABLED it will expose PSI /* If kernel is built with CONFIG_PSI_DEFAULT_DISABLED it will expose PSI
@ -4018,7 +4096,7 @@ static int source_memory_pressure_write(sd_event_source *s) {
* so late. Let's make the best of it, and turn off the event source like we * so late. Let's make the best of it, and turn off the event source like we
* do for failed event source handlers. */ * do for failed event source handlers. */
log_debug_errno(errno, "Writing memory pressure settings to kernel failed, disabling memory pressure event source: %m"); log_debug_errno(errno, "Writing pressure settings to kernel failed, disabling pressure event source: %m");
assert_se(sd_event_source_set_enabled(s, SD_EVENT_OFF) >= 0); assert_se(sd_event_source_set_enabled(s, SD_EVENT_OFF) >= 0);
return 0; return 0;
} }
@ -4030,41 +4108,41 @@ static int source_memory_pressure_write(sd_event_source *s) {
assert(n >= 0); assert(n >= 0);
if ((size_t) n == s->memory_pressure.write_buffer_size) { if ((size_t) n == s->pressure.write_buffer_size) {
s->memory_pressure.write_buffer = mfree(s->memory_pressure.write_buffer); s->pressure.write_buffer = mfree(s->pressure.write_buffer);
if (n > 0) { if (n > 0) {
s->memory_pressure.write_buffer_size = 0; s->pressure.write_buffer_size = 0;
/* Update epoll events mask, since we have now written everything and don't care for EPOLLOUT anymore */ /* Update epoll events mask, since we have now written everything and don't care for EPOLLOUT anymore */
r = source_memory_pressure_register(s, s->enabled); r = source_pressure_register(s, s->enabled);
if (r < 0) if (r < 0)
return r; return r;
} }
} else if (n > 0) { } else if (n > 0) {
_cleanup_free_ void *c = NULL; _cleanup_free_ void *c = NULL;
assert((size_t) n < s->memory_pressure.write_buffer_size); assert((size_t) n < s->pressure.write_buffer_size);
c = memdup((uint8_t*) s->memory_pressure.write_buffer + n, s->memory_pressure.write_buffer_size - n); c = memdup((uint8_t*) s->pressure.write_buffer + n, s->pressure.write_buffer_size - n);
if (!c) if (!c)
return -ENOMEM; return -ENOMEM;
free_and_replace(s->memory_pressure.write_buffer, c); free_and_replace(s->pressure.write_buffer, c);
s->memory_pressure.write_buffer_size -= n; s->pressure.write_buffer_size -= n;
return 1; return 1;
} }
return 0; return 0;
} }
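
The short-write branch of `source_pressure_write()` above keeps only the unwritten tail of the buffer and returns 1 so that the caller waits for the next `EPOLLOUT`. A minimal sketch of that tail-retention step, using `malloc()` and `memcpy()` in place of systemd's `memdup()` and `free_and_replace()`; the helper name and its return convention are assumptions for this example:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: after a short write of n bytes, replace *buf with a
 * copy of the bytes that were not written yet and shrink *size accordingly.
 * Returns 0 on success, -1 on allocation failure (buffer left untouched). */
static int keep_unwritten_tail(void **buf, size_t *size, size_t n) {
        assert(buf && *buf && size);
        assert(n > 0 && n < *size);

        size_t rest = *size - n;

        void *c = malloc(rest);
        if (!c)
                return -1;
        memcpy(c, (uint8_t*) *buf + n, rest);

        free(*buf);
        *buf = c;
        *size = rest;
        return 0;
}
```

Copying the tail into a fresh allocation, rather than tracking an offset, keeps the invariant that the buffer pointer and its size always describe exactly the bytes still owed to the kernel.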
static int source_memory_pressure_initiate_dispatch(sd_event_source *s) { static int source_pressure_initiate_dispatch(sd_event_source *s) {
int r; int r;
assert(s); assert(s);
assert(s->type == SOURCE_MEMORY_PRESSURE); assert(EVENT_SOURCE_IS_PRESSURE(s));
r = source_memory_pressure_write(s); r = source_pressure_write(s);
if (r < 0) if (r < 0)
return r; return r;
if (r > 0) if (r > 0)
@ -4072,22 +4150,22 @@ static int source_memory_pressure_initiate_dispatch(sd_event_source *s) {
* function. Instead, shortcut it so that we wait for next EPOLLOUT immediately. */ * function. Instead, shortcut it so that we wait for next EPOLLOUT immediately. */
/* No pending incoming IO? Then let's not continue further */ /* No pending incoming IO? Then let's not continue further */
if ((s->memory_pressure.revents & (EPOLLIN|EPOLLPRI)) == 0) { if ((s->pressure.revents & (EPOLLIN|EPOLLPRI)) == 0) {
/* Treat IO errors on the notifier the same ways errors returned from a callback */ /* Treat IO errors on the notifier the same ways errors returned from a callback */
if ((s->memory_pressure.revents & (EPOLLHUP|EPOLLERR|EPOLLRDHUP)) != 0) if ((s->pressure.revents & (EPOLLHUP|EPOLLERR|EPOLLRDHUP)) != 0)
return -EIO; return -EIO;
return 1; /* leave dispatch, we already processed everything */ return 1; /* leave dispatch, we already processed everything */
} }
if (s->memory_pressure.revents & EPOLLIN) { if (s->pressure.revents & EPOLLIN) {
uint8_t pipe_buf[PIPE_BUF]; uint8_t pipe_buf[PIPE_BUF];
ssize_t n; ssize_t n;
/* If the fd is readable, then flush out anything that might be queued */ /* If the fd is readable, then flush out anything that might be queued */
n = read(s->memory_pressure.fd, pipe_buf, sizeof(pipe_buf)); n = read(s->pressure.fd, pipe_buf, sizeof(pipe_buf));
if (n < 0 && !ERRNO_IS_TRANSIENT(errno)) if (n < 0 && !ERRNO_IS_TRANSIENT(errno))
return -errno; return -errno;
} }
@@ -4158,8 +4236,8 @@ static int source_dispatch(sd_event_source *s) {
if (r < 0) if (r < 0)
return r; return r;
if (s->type == SOURCE_MEMORY_PRESSURE) { if (EVENT_SOURCE_IS_PRESSURE(s)) {
r = source_memory_pressure_initiate_dispatch(s); r = source_pressure_initiate_dispatch(s);
if (r == -EIO) /* handle EIO errors similar to callback errors */ if (r == -EIO) /* handle EIO errors similar to callback errors */
goto finish; goto finish;
if (r < 0) if (r < 0)
@@ -4254,7 +4332,9 @@ static int source_dispatch(sd_event_source *s) {
} }
case SOURCE_MEMORY_PRESSURE: case SOURCE_MEMORY_PRESSURE:
r = s->memory_pressure.callback(s, s->userdata); case SOURCE_CPU_PRESSURE:
case SOURCE_IO_PRESSURE:
r = s->pressure.callback(s, s->userdata);
break; break;
case SOURCE_WATCHDOG: case SOURCE_WATCHDOG:
@@ -4422,7 +4502,7 @@ static void event_close_inode_data_fds(sd_event *e) {
} }
} }
static int event_memory_pressure_write_list(sd_event *e) { static int event_pressure_write_list(sd_event *e) {
int r; int r;
assert(e); assert(e);
@@ -4430,15 +4510,15 @@ static int event_memory_pressure_write_list(sd_event *e) {
for (;;) { for (;;) {
sd_event_source *s; sd_event_source *s;
s = LIST_POP(memory_pressure.write_list, e->memory_pressure_write_list); s = LIST_POP(pressure.write_list, e->pressure_write_list);
if (!s) if (!s)
break; break;
assert(s->type == SOURCE_MEMORY_PRESSURE); assert(EVENT_SOURCE_IS_PRESSURE(s));
assert(s->memory_pressure.write_buffer_size > 0); assert(s->pressure.write_buffer_size > 0);
s->memory_pressure.in_write_list = false; s->pressure.in_write_list = false;
r = source_memory_pressure_write(s); r = source_pressure_write(s);
if (r < 0) if (r < 0)
return r; return r;
} }
@@ -4499,7 +4579,7 @@ _public_ int sd_event_prepare(sd_event *e) {
if (r < 0) if (r < 0)
return r; return r;
r = event_memory_pressure_write_list(e); r = event_pressure_write_list(e);
if (r < 0) if (r < 0)
return r; return r;
@@ -4668,7 +4748,9 @@ static int process_epoll(sd_event *e, usec_t timeout, int64_t threshold, int64_t
break; break;
case SOURCE_MEMORY_PRESSURE: case SOURCE_MEMORY_PRESSURE:
r = process_memory_pressure(s, i->events); case SOURCE_CPU_PRESSURE:
case SOURCE_IO_PRESSURE:
r = process_pressure(s, i->events);
break; break;
default: default:
@@ -5306,27 +5388,27 @@ _public_ int sd_event_get_exit_on_idle(sd_event *e) {
return e->exit_on_idle; return e->exit_on_idle;
} }
_public_ int sd_event_source_set_memory_pressure_type(sd_event_source *s, const char *ty) { static int event_source_set_pressure_type(sd_event_source *s, const char *ty) {
_cleanup_free_ char *b = NULL; _cleanup_free_ char *b = NULL;
_cleanup_free_ void *w = NULL; _cleanup_free_ void *w = NULL;
assert_return(s, -EINVAL); assert_return(s, -EINVAL);
assert_return(s->type == SOURCE_MEMORY_PRESSURE, -EDOM); assert_return(EVENT_SOURCE_IS_PRESSURE(s), -EDOM);
assert_return(ty, -EINVAL); assert_return(ty, -EINVAL);
assert_return(!event_origin_changed(s->event), -ECHILD); assert_return(!event_origin_changed(s->event), -ECHILD);
if (!STR_IN_SET(ty, "some", "full")) if (!STR_IN_SET(ty, "some", "full"))
return -EINVAL; return -EINVAL;
if (s->memory_pressure.locked) /* Refuse adjusting parameters, if caller told us how to watch for events */ if (s->pressure.locked) /* Refuse adjusting parameters, if caller told us how to watch for events */
return -EBUSY; return -EBUSY;
char* space = memchr(s->memory_pressure.write_buffer, ' ', s->memory_pressure.write_buffer_size); char* space = memchr(s->pressure.write_buffer, ' ', s->pressure.write_buffer_size);
if (!space) if (!space)
return -EINVAL; return -EINVAL;
size_t l = space - (char*) s->memory_pressure.write_buffer; size_t l = space - (char*) s->pressure.write_buffer;
b = memdup_suffix0(s->memory_pressure.write_buffer, l); b = memdup_suffix0(s->pressure.write_buffer, l);
if (!b) if (!b)
return -ENOMEM; return -ENOMEM;
if (!STR_IN_SET(b, "some", "full")) if (!STR_IN_SET(b, "some", "full"))
@@ -5335,26 +5417,47 @@ _public_ int sd_event_source_set_memory_pressure_type(sd_event_source *s, const
if (streq(b, ty)) if (streq(b, ty))
return 0; return 0;
size_t nl = strlen(ty) + (s->memory_pressure.write_buffer_size - l); size_t nl = strlen(ty) + (s->pressure.write_buffer_size - l);
w = new(char, nl); w = new(char, nl);
if (!w) if (!w)
return -ENOMEM; return -ENOMEM;
memcpy(stpcpy(w, ty), space, (s->memory_pressure.write_buffer_size - l)); memcpy(stpcpy(w, ty), space, (s->pressure.write_buffer_size - l));
free_and_replace(s->memory_pressure.write_buffer, w); free_and_replace(s->pressure.write_buffer, w);
s->memory_pressure.write_buffer_size = nl; s->pressure.write_buffer_size = nl;
s->memory_pressure.locked = false; s->pressure.locked = false;
return 1; return 1;
} }
_public_ int sd_event_source_set_memory_pressure_period(sd_event_source *s, uint64_t threshold_usec, uint64_t window_usec) { _public_ int sd_event_source_set_memory_pressure_type(sd_event_source *s, const char *ty) {
assert_return(s, -EINVAL);
assert_return(s->type == SOURCE_MEMORY_PRESSURE, -EDOM);
return event_source_set_pressure_type(s, ty);
}
_public_ int sd_event_source_set_cpu_pressure_type(sd_event_source *s, const char *ty) {
assert_return(s, -EINVAL);
assert_return(s->type == SOURCE_CPU_PRESSURE, -EDOM);
return event_source_set_pressure_type(s, ty);
}
_public_ int sd_event_source_set_io_pressure_type(sd_event_source *s, const char *ty) {
assert_return(s, -EINVAL);
assert_return(s->type == SOURCE_IO_PRESSURE, -EDOM);
return event_source_set_pressure_type(s, ty);
}
static int event_source_set_pressure_period(sd_event_source *s, uint64_t threshold_usec, uint64_t window_usec) {
_cleanup_free_ char *b = NULL; _cleanup_free_ char *b = NULL;
_cleanup_free_ void *w = NULL; _cleanup_free_ void *w = NULL;
assert_return(s, -EINVAL); assert_return(s, -EINVAL);
assert_return(s->type == SOURCE_MEMORY_PRESSURE, -EDOM); assert_return(EVENT_SOURCE_IS_PRESSURE(s), -EDOM);
assert_return(!event_origin_changed(s->event), -ECHILD); assert_return(!event_origin_changed(s->event), -ECHILD);
if (threshold_usec <= 0 || threshold_usec >= UINT64_MAX) if (threshold_usec <= 0 || threshold_usec >= UINT64_MAX)
@@ -5364,15 +5467,15 @@ _public_ int sd_event_source_set_memory_pressure_period(sd_event_source *s, uint
if (threshold_usec > window_usec) if (threshold_usec > window_usec)
return -EINVAL; return -EINVAL;
if (s->memory_pressure.locked) /* Refuse adjusting parameters, if caller told us how to watch for events */ if (s->pressure.locked) /* Refuse adjusting parameters, if caller told us how to watch for events */
return -EBUSY; return -EBUSY;
char* space = memchr(s->memory_pressure.write_buffer, ' ', s->memory_pressure.write_buffer_size); char* space = memchr(s->pressure.write_buffer, ' ', s->pressure.write_buffer_size);
if (!space) if (!space)
return -EINVAL; return -EINVAL;
size_t l = space - (char*) s->memory_pressure.write_buffer; size_t l = space - (char*) s->pressure.write_buffer;
b = memdup_suffix0(s->memory_pressure.write_buffer, l); b = memdup_suffix0(s->pressure.write_buffer, l);
if (!b) if (!b)
return -ENOMEM; return -ENOMEM;
if (!STR_IN_SET(b, "some", "full")) if (!STR_IN_SET(b, "some", "full"))
@@ -5386,12 +5489,33 @@ _public_ int sd_event_source_set_memory_pressure_period(sd_event_source *s, uint
return -EINVAL; return -EINVAL;
l = strlen(w) + 1; l = strlen(w) + 1;
if (memcmp_nn(s->memory_pressure.write_buffer, s->memory_pressure.write_buffer_size, w, l) == 0) if (memcmp_nn(s->pressure.write_buffer, s->pressure.write_buffer_size, w, l) == 0)
return 0; return 0;
free_and_replace(s->memory_pressure.write_buffer, w); free_and_replace(s->pressure.write_buffer, w);
s->memory_pressure.write_buffer_size = l; s->pressure.write_buffer_size = l;
s->memory_pressure.locked = false; s->pressure.locked = false;
return 1; return 1;
} }
_public_ int sd_event_source_set_memory_pressure_period(sd_event_source *s, uint64_t threshold_usec, uint64_t window_usec) {
assert_return(s, -EINVAL);
assert_return(s->type == SOURCE_MEMORY_PRESSURE, -EDOM);
return event_source_set_pressure_period(s, threshold_usec, window_usec);
}
_public_ int sd_event_source_set_cpu_pressure_period(sd_event_source *s, uint64_t threshold_usec, uint64_t window_usec) {
assert_return(s, -EINVAL);
assert_return(s->type == SOURCE_CPU_PRESSURE, -EDOM);
return event_source_set_pressure_period(s, threshold_usec, window_usec);
}
_public_ int sd_event_source_set_io_pressure_period(sd_event_source *s, uint64_t threshold_usec, uint64_t window_usec) {
assert_return(s, -EINVAL);
assert_return(s->type == SOURCE_IO_PRESSURE, -EDOM);
return event_source_set_pressure_period(s, threshold_usec, window_usec);
}
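
The type setter above rewrites only the leading "some"/"full" word of the pending trigger string and keeps everything from the first space onward. A minimal standalone sketch of that buffer manipulation follows. It assumes the buffer has the shape "<type> <threshold> <window>"; the name replace_pressure_type() is hypothetical and not part of sd-event, and the sketch skips the locking and old-prefix validation that the real function performs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of sd-event: returns a newly allocated copy
 * of `buf` (`size` bytes, not necessarily NUL-terminated) whose leading word
 * is replaced by `ty`. The new length is stored in *ret_size. Returns NULL
 * on invalid input or allocation failure. */
static char *replace_pressure_type(const char *buf, size_t size, const char *ty, size_t *ret_size) {
        assert(buf);
        assert(ty);
        assert(ret_size);

        /* Only the two PSI trigger types are accepted */
        if (strcmp(ty, "some") != 0 && strcmp(ty, "full") != 0)
                return NULL;

        /* The first space separates the type word from the rest */
        const char *space = memchr(buf, ' ', size);
        if (!space)
                return NULL;

        size_t l = (size_t) (space - buf); /* length of the old type word */
        size_t tl = strlen(ty);
        size_t nl = tl + (size - l);       /* new type word + untouched tail */

        char *w = malloc(nl);
        if (!w)
                return NULL;

        memcpy(w, ty, tl);
        memcpy(w + tl, space, size - l);   /* tail starts at the space itself */

        *ret_size = nl;
        return w;
}
```

Called with the 20-byte buffer "some 200000 2000000" (including its trailing NUL) and the type "full", the helper is expected to return a 20-byte buffer holding "full 200000 2000000".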

@@ -370,7 +370,7 @@ static Compression getenv_compression(void) {
if (r >= 0) if (r >= 0)
return r ? DEFAULT_COMPRESSION : COMPRESSION_NONE; return r ? DEFAULT_COMPRESSION : COMPRESSION_NONE;
c = compression_from_string(e); c = compression_from_string_harder(e);
if (c < 0) { if (c < 0) {
log_debug_errno(c, "Failed to parse SYSTEMD_JOURNAL_COMPRESS value, ignoring: %s", e); log_debug_errno(c, "Failed to parse SYSTEMD_JOURNAL_COMPRESS value, ignoring: %s", e);
return DEFAULT_COMPRESSION; return DEFAULT_COMPRESSION;

@@ -3880,8 +3880,6 @@ static int count_connection(sd_varlink_server *server, const struct ucred *ucred
assert(server); assert(server);
assert(ucred); assert(ucred);
server->n_connections++;
if (FLAGS_SET(server->flags, SD_VARLINK_SERVER_ACCOUNT_UID)) { if (FLAGS_SET(server->flags, SD_VARLINK_SERVER_ACCOUNT_UID)) {
assert(uid_is_valid(ucred->uid)); assert(uid_is_valid(ucred->uid));
@@ -3899,6 +3897,8 @@ static int count_connection(sd_varlink_server *server, const struct ucred *ucred
return varlink_server_log_errno(server, r, "Failed to increment counter in UID hash table: %m"); return varlink_server_log_errno(server, r, "Failed to increment counter in UID hash table: %m");
} }
server->n_connections++;
return 0; return 0;
} }
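
The hunk above moves the counter increment behind every fallible step, so a failure leaves nothing to roll back. A minimal sketch of that ordering follows. The Server type, track_uid() and the simulate_oom switch are hypothetical stand-ins for illustration, not the sd-varlink types or helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a server object; not the sd-varlink type. */
typedef struct Server {
        unsigned n_connections;
        unsigned n_uid_entries;
        bool simulate_oom; /* forces the fallible step to fail, for illustration */
} Server;

/* Hypothetical fallible bookkeeping step (think: a hash table update). */
static int track_uid(Server *s) {
        assert(s);

        if (s->simulate_oom)
                return -ENOMEM;

        s->n_uid_entries++;
        return 0;
}

static int count_connection_sketch(Server *s) {
        int r;

        assert(s);

        r = track_uid(s);
        if (r < 0)
                return r; /* the counter was not touched yet, nothing leaks */

        /* Increment only once no error path remains */
        s->n_connections++;
        return 0;
}
```

With this ordering a failed call returns the error and leaves n_connections unchanged, which is the property the commit message describes.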

@@ -524,7 +524,7 @@ static int portable_extract_by_path(
seq[0] = safe_close(seq[0]); seq[0] = safe_close(seq[0]);
errno_pipe_fd[0] = safe_close(errno_pipe_fd[0]); errno_pipe_fd[0] = safe_close(errno_pipe_fd[0]);
if (setns(CLONE_NEWUSER, userns_fd) < 0) { if (setns(userns_fd, CLONE_NEWUSER) < 0) {
r = log_debug_errno(errno, "Failed to join userns: %m"); r = log_debug_errno(errno, "Failed to join userns: %m");
report_errno_and_exit(errno_pipe_fd[1], r); report_errno_and_exit(errno_pipe_fd[1], r);
} }

@@ -2383,6 +2383,8 @@ static const BusProperty cgroup_properties[] = {
{ "ManagedOOMMemoryPressure", bus_append_string }, { "ManagedOOMMemoryPressure", bus_append_string },
{ "ManagedOOMPreference", bus_append_string }, { "ManagedOOMPreference", bus_append_string },
{ "MemoryPressureWatch", bus_append_string }, { "MemoryPressureWatch", bus_append_string },
{ "CPUPressureWatch", bus_append_string },
{ "IOPressureWatch", bus_append_string },
{ "DelegateSubgroup", bus_append_string }, { "DelegateSubgroup", bus_append_string },
{ "ManagedOOMMemoryPressureLimit", bus_append_parse_permyriad }, { "ManagedOOMMemoryPressureLimit", bus_append_parse_permyriad },
{ "MemoryAccounting", bus_append_parse_boolean }, { "MemoryAccounting", bus_append_parse_boolean },
@@ -2421,6 +2423,8 @@ static const BusProperty cgroup_properties[] = {
{ "SocketBindAllow", bus_append_socket_filter }, { "SocketBindAllow", bus_append_socket_filter },
{ "SocketBindDeny", bus_append_socket_filter }, { "SocketBindDeny", bus_append_socket_filter },
{ "MemoryPressureThresholdSec", bus_append_parse_sec_rename }, { "MemoryPressureThresholdSec", bus_append_parse_sec_rename },
{ "CPUPressureThresholdSec", bus_append_parse_sec_rename },
{ "IOPressureThresholdSec", bus_append_parse_sec_rename },
{ "NFTSet", bus_append_nft_set }, { "NFTSet", bus_append_nft_set },
{ "BindNetworkInterface", bus_append_string }, { "BindNetworkInterface", bus_append_string },

@@ -64,6 +64,14 @@ static SD_VARLINK_DEFINE_STRUCT_TYPE(
SD_VARLINK_DEFINE_FIELD(DefaultMemoryPressureThresholdUSec, SD_VARLINK_INT, 0), SD_VARLINK_DEFINE_FIELD(DefaultMemoryPressureThresholdUSec, SD_VARLINK_INT, 0),
SD_VARLINK_FIELD_COMMENT("https://www.freedesktop.org/software/systemd/man/"PROJECT_VERSION_STR"/systemd-system.conf.html#DefaultMemoryPressureWatch="), SD_VARLINK_FIELD_COMMENT("https://www.freedesktop.org/software/systemd/man/"PROJECT_VERSION_STR"/systemd-system.conf.html#DefaultMemoryPressureWatch="),
SD_VARLINK_DEFINE_FIELD(DefaultMemoryPressureWatch, SD_VARLINK_STRING, 0), SD_VARLINK_DEFINE_FIELD(DefaultMemoryPressureWatch, SD_VARLINK_STRING, 0),
SD_VARLINK_FIELD_COMMENT("https://www.freedesktop.org/software/systemd/man/"PROJECT_VERSION_STR"/systemd-system.conf.html#DefaultCPUPressureThresholdUSec="),
SD_VARLINK_DEFINE_FIELD(DefaultCPUPressureThresholdUSec, SD_VARLINK_INT, 0),
SD_VARLINK_FIELD_COMMENT("https://www.freedesktop.org/software/systemd/man/"PROJECT_VERSION_STR"/systemd-system.conf.html#DefaultCPUPressureWatch="),
SD_VARLINK_DEFINE_FIELD(DefaultCPUPressureWatch, SD_VARLINK_STRING, 0),
SD_VARLINK_FIELD_COMMENT("https://www.freedesktop.org/software/systemd/man/"PROJECT_VERSION_STR"/systemd-system.conf.html#DefaultIOPressureThresholdUSec="),
SD_VARLINK_DEFINE_FIELD(DefaultIOPressureThresholdUSec, SD_VARLINK_INT, 0),
SD_VARLINK_FIELD_COMMENT("https://www.freedesktop.org/software/systemd/man/"PROJECT_VERSION_STR"/systemd-system.conf.html#DefaultIOPressureWatch="),
SD_VARLINK_DEFINE_FIELD(DefaultIOPressureWatch, SD_VARLINK_STRING, 0),
SD_VARLINK_FIELD_COMMENT("https://www.freedesktop.org/software/systemd/man/"PROJECT_VERSION_STR"/systemd-system.conf.html#RuntimeWatchdogSec="), SD_VARLINK_FIELD_COMMENT("https://www.freedesktop.org/software/systemd/man/"PROJECT_VERSION_STR"/systemd-system.conf.html#RuntimeWatchdogSec="),
SD_VARLINK_DEFINE_FIELD(RuntimeWatchdogUSec, SD_VARLINK_INT, SD_VARLINK_NULLABLE), SD_VARLINK_DEFINE_FIELD(RuntimeWatchdogUSec, SD_VARLINK_INT, SD_VARLINK_NULLABLE),
SD_VARLINK_FIELD_COMMENT("https://www.freedesktop.org/software/systemd/man/"PROJECT_VERSION_STR"/systemd-system.conf.html#RebootWatchdogSec="), SD_VARLINK_FIELD_COMMENT("https://www.freedesktop.org/software/systemd/man/"PROJECT_VERSION_STR"/systemd-system.conf.html#RebootWatchdogSec="),

@@ -228,6 +228,14 @@ static SD_VARLINK_DEFINE_STRUCT_TYPE(
SD_VARLINK_DEFINE_FIELD(MemoryPressureWatch, SD_VARLINK_STRING, 0), SD_VARLINK_DEFINE_FIELD(MemoryPressureWatch, SD_VARLINK_STRING, 0),
SD_VARLINK_FIELD_COMMENT("https://www.freedesktop.org/software/systemd/man/"PROJECT_VERSION_STR"/systemd.resource-control.html#MemoryPressureThresholdSec="), SD_VARLINK_FIELD_COMMENT("https://www.freedesktop.org/software/systemd/man/"PROJECT_VERSION_STR"/systemd.resource-control.html#MemoryPressureThresholdSec="),
SD_VARLINK_DEFINE_FIELD(MemoryPressureThresholdUSec, SD_VARLINK_INT, SD_VARLINK_NULLABLE), SD_VARLINK_DEFINE_FIELD(MemoryPressureThresholdUSec, SD_VARLINK_INT, SD_VARLINK_NULLABLE),
SD_VARLINK_FIELD_COMMENT("https://www.freedesktop.org/software/systemd/man/"PROJECT_VERSION_STR"/systemd.resource-control.html#CPUPressureWatch="),
SD_VARLINK_DEFINE_FIELD(CPUPressureWatch, SD_VARLINK_STRING, 0),
SD_VARLINK_FIELD_COMMENT("https://www.freedesktop.org/software/systemd/man/"PROJECT_VERSION_STR"/systemd.resource-control.html#CPUPressureThresholdSec="),
SD_VARLINK_DEFINE_FIELD(CPUPressureThresholdUSec, SD_VARLINK_INT, SD_VARLINK_NULLABLE),
SD_VARLINK_FIELD_COMMENT("https://www.freedesktop.org/software/systemd/man/"PROJECT_VERSION_STR"/systemd.resource-control.html#IOPressureWatch="),
SD_VARLINK_DEFINE_FIELD(IOPressureWatch, SD_VARLINK_STRING, 0),
SD_VARLINK_FIELD_COMMENT("https://www.freedesktop.org/software/systemd/man/"PROJECT_VERSION_STR"/systemd.resource-control.html#IOPressureThresholdSec="),
SD_VARLINK_DEFINE_FIELD(IOPressureThresholdUSec, SD_VARLINK_INT, SD_VARLINK_NULLABLE),
/* Others */ /* Others */
SD_VARLINK_FIELD_COMMENT("Reflects whether to forward coredumps for processes that crash within this cgroup"), SD_VARLINK_FIELD_COMMENT("Reflects whether to forward coredumps for processes that crash within this cgroup"),

@@ -97,6 +97,8 @@ int sd_event_add_defer(sd_event *e, sd_event_source **ret, sd_event_handler_t ca
int sd_event_add_post(sd_event *e, sd_event_source **ret, sd_event_handler_t callback, void *userdata); int sd_event_add_post(sd_event *e, sd_event_source **ret, sd_event_handler_t callback, void *userdata);
int sd_event_add_exit(sd_event *e, sd_event_source **ret, sd_event_handler_t callback, void *userdata); int sd_event_add_exit(sd_event *e, sd_event_source **ret, sd_event_handler_t callback, void *userdata);
int sd_event_add_memory_pressure(sd_event *e, sd_event_source **ret, sd_event_handler_t callback, void *userdata); int sd_event_add_memory_pressure(sd_event *e, sd_event_source **ret, sd_event_handler_t callback, void *userdata);
int sd_event_add_cpu_pressure(sd_event *e, sd_event_source **ret, sd_event_handler_t callback, void *userdata);
int sd_event_add_io_pressure(sd_event *e, sd_event_source **ret, sd_event_handler_t callback, void *userdata);
int sd_event_prepare(sd_event *e); int sd_event_prepare(sd_event *e);
int sd_event_wait(sd_event *e, uint64_t timeout); int sd_event_wait(sd_event *e, uint64_t timeout);
@@ -162,6 +164,10 @@ int sd_event_source_get_inotify_mask(sd_event_source *s, uint32_t *ret);
int sd_event_source_get_inotify_path(sd_event_source *s, const char **ret); int sd_event_source_get_inotify_path(sd_event_source *s, const char **ret);
int sd_event_source_set_memory_pressure_type(sd_event_source *s, const char *ty); int sd_event_source_set_memory_pressure_type(sd_event_source *s, const char *ty);
int sd_event_source_set_memory_pressure_period(sd_event_source *s, uint64_t threshold_usec, uint64_t window_usec); int sd_event_source_set_memory_pressure_period(sd_event_source *s, uint64_t threshold_usec, uint64_t window_usec);
int sd_event_source_set_cpu_pressure_type(sd_event_source *s, const char *ty);
int sd_event_source_set_cpu_pressure_period(sd_event_source *s, uint64_t threshold_usec, uint64_t window_usec);
int sd_event_source_set_io_pressure_type(sd_event_source *s, const char *ty);
int sd_event_source_set_io_pressure_period(sd_event_source *s, uint64_t threshold_usec, uint64_t window_usec);
int sd_event_source_set_destroy_callback(sd_event_source *s, sd_event_destroy_t callback); int sd_event_source_set_destroy_callback(sd_event_source *s, sd_event_destroy_t callback);
int sd_event_source_get_destroy_callback(sd_event_source *s, sd_event_destroy_t *ret); int sd_event_source_get_destroy_callback(sd_event_source *s, sd_event_destroy_t *ret);
int sd_event_source_get_floating(sd_event_source *s); int sd_event_source_get_floating(sd_event_source *s);

@@ -368,7 +368,7 @@ executables += [
'dependencies' : libm, 'dependencies' : libm,
}, },
test_template + { test_template + {
'sources' : files('test-mempress.c'), 'sources' : files('test-pressure.c'),
'dependencies' : threads, 'dependencies' : threads,
}, },
test_template + { test_template + {

@@ -1,27 +1,36 @@
/* SPDX-License-Identifier: LGPL-2.1-or-later */ /* SPDX-License-Identifier: LGPL-2.1-or-later */
#include "alloc-util.h" #include "alloc-util.h"
#include "argv-util.h"
#include "compress.h" #include "compress.h"
#include "nulstr-util.h"
#include "parse-util.h" #include "parse-util.h"
#include "process-util.h" #include "process-util.h"
#include "random-util.h" #include "random-util.h"
#include "string-table.h"
#include "tests.h" #include "tests.h"
#include "time-util.h" #include "time-util.h"
typedef int (compress_t)(const void *src, uint64_t src_size, void *dst,
size_t dst_alloc_size, size_t *dst_size, int level);
typedef int (decompress_t)(const void *src, uint64_t src_size,
void **dst, size_t* dst_size, size_t dst_max);
#if HAVE_COMPRESSION
static usec_t arg_duration; static usec_t arg_duration;
static size_t arg_start; static size_t arg_start;
#define MAX_SIZE (1024*1024LU) #define MAX_SIZE (1024*1024LU)
#define PRIME 1048571 /* A prime close enough to one megabyte that mod 4 == 3 */ #define PRIME 1048571 /* A prime close enough to one megabyte that mod 4 == 3 */
typedef enum BenchmarkDataType {
BENCHMARK_DATA_ZEROS,
BENCHMARK_DATA_SIMPLE,
BENCHMARK_DATA_RANDOM,
_BENCHMARK_DATA_TYPE_MAX,
} BenchmarkDataType;
static const char* const benchmark_data_type_table[_BENCHMARK_DATA_TYPE_MAX] = {
[BENCHMARK_DATA_ZEROS] = "zeros",
[BENCHMARK_DATA_SIMPLE] = "simple",
[BENCHMARK_DATA_RANDOM] = "random",
};
DEFINE_PRIVATE_STRING_TABLE_LOOKUP_TO_STRING(benchmark_data_type, BenchmarkDataType);
static size_t _permute(size_t x) { static size_t _permute(size_t x) {
size_t residue; size_t residue;
@@ -39,19 +48,24 @@ static size_t permute(size_t x) {
return _permute((_permute(x) + arg_start) % MAX_SIZE ^ 0xFF345); return _permute((_permute(x) + arg_start) % MAX_SIZE ^ 0xFF345);
} }
static char* make_buf(size_t count, const char *type) { static char* make_buf(size_t count, BenchmarkDataType type) {
char *buf; char *buf;
size_t i;
buf = malloc(count); buf = malloc(count);
assert_se(buf); ASSERT_NOT_NULL(buf);
if (streq(type, "zeros")) switch (type) {
case BENCHMARK_DATA_ZEROS:
memzero(buf, count); memzero(buf, count);
else if (streq(type, "simple")) break;
for (i = 0; i < count; i++)
case BENCHMARK_DATA_SIMPLE:
for (size_t i = 0; i < count; i++)
buf[i] = 'a' + i % ('z' - 'a' + 1); buf[i] = 'a' + i % ('z' - 'a' + 1);
else if (streq(type, "random")) { break;
case BENCHMARK_DATA_RANDOM: {
size_t step = count / 10; size_t step = count / 10;
random_bytes(buf, step); random_bytes(buf, step);
@@ -64,110 +78,103 @@ static char* make_buf(size_t count, const char *type) {
memzero(buf + 7*step, step); memzero(buf + 7*step, step);
random_bytes(buf + 8*step, step); random_bytes(buf + 8*step, step);
memzero(buf + 9*step, step); memzero(buf + 9*step, step);
} else break;
}
default:
assert_not_reached(); assert_not_reached();
}
return buf; return buf;
} }
static void test_compress_decompress(const char* label, const char* type, TEST(benchmark) {
compress_t compress, decompress_t decompress) { for (BenchmarkDataType dt = 0; dt < _BENCHMARK_DATA_TYPE_MAX; dt++)
usec_t n, n2 = 0; for (Compression c = 0; c < _COMPRESSION_MAX; c++) {
float dt; if (c == COMPRESSION_NONE || !compression_supported(c))
continue;
_cleanup_free_ char *text = NULL, *buf = NULL; const char *label = compression_to_string(c);
_cleanup_free_ void *buf2 = NULL; const char *type = benchmark_data_type_to_string(dt);
size_t skipped = 0, compressed = 0, total = 0; usec_t n, n2 = 0;
text = make_buf(MAX_SIZE, type); _cleanup_free_ char *text = NULL, *buf = NULL;
buf = calloc(MAX_SIZE + 1, 1); _cleanup_free_ void *buf2 = NULL;
assert_se(text && buf); size_t skipped = 0, compressed = 0, total = 0;
n = now(CLOCK_MONOTONIC); text = make_buf(MAX_SIZE, dt);
buf = calloc(MAX_SIZE + 1, 1);
ASSERT_NOT_NULL(text);
ASSERT_NOT_NULL(buf);
for (size_t i = 0; i <= MAX_SIZE; i++) { n = now(CLOCK_MONOTONIC);
size_t j = 0, k = 0, size;
int r;
size = permute(i); for (size_t i = 0; i <= MAX_SIZE; i++) {
if (size == 0) size_t j = 0, k = 0, size;
continue; int r;
log_debug("%s %zu %zu", type, i, size); size = permute(i);
if (size == 0)
continue;
memzero(buf, MIN(size + 1000, MAX_SIZE)); log_debug("%s %zu %zu", type, i, size);
r = compress(text, size, buf, size, &j, /* level= */ -1); memzero(buf, MIN(size + 1000, MAX_SIZE));
/* assume compression must be successful except for small or random inputs */
assert_se(r >= 0 || (size < 2048 && r == -ENOBUFS) || streq(type, "random"));
/* check for overwrites */ r = compress_blob(c, text, size, buf, size, &j, /* level= */ -1);
assert_se(buf[size] == 0); /* assume compression must be successful except for small or random inputs */
if (r < 0) { ASSERT_TRUE(r >= 0 || (size < 2048 && r == -ENOBUFS) || dt == BENCHMARK_DATA_RANDOM);
skipped += size;
continue; /* check for overwrites */
ASSERT_EQ(buf[size], 0);
if (r < 0) {
skipped += size;
continue;
}
ASSERT_TRUE(j > 0);
if (j >= size)
log_error("%s \"compressed\" %zu -> %zu", label, size, j);
ASSERT_OK_ZERO(decompress_blob(c, buf, j, &buf2, &k, 0));
ASSERT_EQ(k, size);
ASSERT_EQ(memcmp(text, buf2, size), 0);
total += size;
compressed += j;
n2 = now(CLOCK_MONOTONIC);
if (n2 - n > arg_duration)
break;
}
float elapsed = (n2-n) / 1e6;
log_info("%s/%s: compressed & decompressed %zu bytes in %.2fs (%.2fMiB/s), "
"mean compression %.2f%%, skipped %zu bytes",
label, type, total, elapsed,
total / 1024. / 1024 / elapsed,
100 - compressed * 100. / total,
skipped);
} }
assert_se(j > 0);
if (j >= size)
log_error("%s \"compressed\" %zu -> %zu", label, size, j);
r = decompress(buf, j, &buf2, &k, 0);
assert_se(r == 0);
assert_se(k == size);
assert_se(memcmp(text, buf2, size) == 0);
total += size;
compressed += j;
n2 = now(CLOCK_MONOTONIC);
if (n2 - n > arg_duration)
break;
}
dt = (n2-n) / 1e6;
log_info("%s/%s: compressed & decompressed %zu bytes in %.2fs (%.2fMiB/s), "
"mean compression %.2f%%, skipped %zu bytes",
label, type, total, dt,
total / 1024. / 1024 / dt,
100 - compressed * 100. / total,
skipped);
} }
#endif
int main(int argc, char *argv[]) { static int intro(void) {
#if HAVE_COMPRESSION if (saved_argc >= 2) {
test_setup_logging(LOG_INFO);
if (argc >= 2) {
unsigned x; unsigned x;
assert_se(safe_atou(argv[1], &x) >= 0); ASSERT_OK(safe_atou(saved_argv[1], &x));
arg_duration = x * USEC_PER_SEC; arg_duration = x * USEC_PER_SEC;
} else } else
arg_duration = slow_tests_enabled() ? arg_duration = slow_tests_enabled() ?
2 * USEC_PER_SEC : USEC_PER_SEC / 50; 2 * USEC_PER_SEC : USEC_PER_SEC / 50;
if (argc == 3) if (saved_argc == 3)
(void) safe_atozu(argv[2], &arg_start); (void) safe_atozu(saved_argv[2], &arg_start);
else else
arg_start = getpid_cached(); arg_start = getpid_cached();
NULSTR_FOREACH(i, "zeros\0simple\0random\0") {
#if HAVE_XZ
test_compress_decompress("XZ", i, compress_blob_xz, decompress_blob_xz);
#endif
#if HAVE_LZ4
test_compress_decompress("LZ4", i, compress_blob_lz4, decompress_blob_lz4);
#endif
#if HAVE_ZSTD
test_compress_decompress("ZSTD", i, compress_blob_zstd, decompress_blob_zstd);
#endif
}
return 0; return 0;
#else
return log_tests_skipped("No compression feature is enabled");
#endif
} }
DEFINE_TEST_MAIN_WITH_INTRO(LOG_INFO, intro);
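
The benchmark rewrite above replaces string comparisons on "zeros"/"simple"/"random" with an enum plus a lookup table. A minimal sketch of that pattern in plain C follows, without systemd's string-table macros. The names DataType, data_type_table and data_type_to_string() are hypothetical and chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical enum mirroring the three benchmark inputs */
typedef enum DataType {
        DATA_ZEROS,
        DATA_SIMPLE,
        DATA_RANDOM,
        DATA_TYPE_MAX, /* number of valid values, not a value itself */
} DataType;

/* Designated initializers keep each name next to its enum value */
static const char *const data_type_table[DATA_TYPE_MAX] = {
        [DATA_ZEROS]  = "zeros",
        [DATA_SIMPLE] = "simple",
        [DATA_RANDOM] = "random",
};

/* Returns the name for `t`, or NULL if `t` is out of range */
static const char *data_type_to_string(DataType t) {
        if ((int) t < 0 || (int) t >= DATA_TYPE_MAX)
                return NULL;

        return data_type_table[t];
}
```

Iterating from 0 up to DATA_TYPE_MAX then visits every input kind once, and a switch over the enum lets the compiler warn about unhandled cases, which a chain of string comparisons cannot do.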

@@ -4,13 +4,9 @@
#include <sys/stat.h> #include <sys/stat.h>
#include <unistd.h> #include <unistd.h>
#if HAVE_LZ4
#include <lz4.h>
#endif
#include "alloc-util.h" #include "alloc-util.h"
#include "argv-util.h"
#include "compress.h" #include "compress.h"
#include "dlfcn-util.h"
#include "fd-util.h" #include "fd-util.h"
#include "io-util.h" #include "io-util.h"
#include "path-util.h" #include "path-util.h"
@@ -18,516 +14,455 @@
#include "tests.h" #include "tests.h"
#include "tmpfile-util.h" #include "tmpfile-util.h"
#if HAVE_XZ
# define XZ_OK 0
#else
# define XZ_OK -EPROTONOSUPPORT
#endif
#if HAVE_LZ4
# define LZ4_OK 0
#else
# define LZ4_OK -EPROTONOSUPPORT
#endif
#define HUGE_SIZE (4096*1024) #define HUGE_SIZE (4096*1024)
typedef int (compress_blob_t)(const void *src, uint64_t src_size, static const char text[] =
void *dst, size_t dst_alloc_size, size_t *dst_size, int level); "text\0foofoofoofoo AAAA aaaaaaaaa ghost busters barbarbar FFF"
typedef int (decompress_blob_t)(const void *src, uint64_t src_size, "foofoofoofoo AAAA aaaaaaaaa ghost busters barbarbar FFF";
void **dst, static char data[512] = "random\0";
size_t* dst_size, size_t dst_max); static char *huge = NULL;
typedef int (decompress_sw_t)(const void *src, uint64_t src_size, static const char *srcfile;
void **buffer,
const void *prefix, size_t prefix_len,
uint8_t extra);
typedef int (compress_stream_t)(int fdf, int fdt, uint64_t max_bytes, uint64_t *uncompressed_size); static const char* cat_for_compression(Compression c) {
typedef int (decompress_stream_t)(int fdf, int fdt, uint64_t max_size); switch (c) {
case COMPRESSION_XZ: return "xzcat";
#if HAVE_COMPRESSION case COMPRESSION_LZ4: return "lz4cat";
_unused_ static void test_compress_decompress( case COMPRESSION_ZSTD: return "zstdcat";
const char *compression, case COMPRESSION_GZIP: return "zcat";
compress_blob_t compress, case COMPRESSION_BZIP2: return "bzcat";
decompress_blob_t decompress, default: return NULL;
const char *data,
size_t data_len,
bool may_fail) {
char compressed[512];
size_t csize;
_cleanup_free_ char *decompressed = NULL;
int r;
log_info("/* testing %s %s blob compression/decompression */",
compression, data);
r = compress(data, data_len, compressed, sizeof(compressed), &csize, /* level= */ -1);
if (r == -ENOBUFS) {
log_info_errno(r, "compression failed: %m");
assert_se(may_fail);
} else {
assert_se(r >= 0);
r = decompress(compressed, csize,
(void **) &decompressed, &csize, 0);
assert_se(r == 0);
assert_se(decompressed);
assert_se(memcmp(decompressed, data, data_len) == 0);
} }
r = decompress("garbage", 7,
(void **) &decompressed, &csize, 0);
assert_se(r < 0);
/* make sure to have the minimal lz4 compressed size */
r = decompress("00000000\1g", 9,
(void **) &decompressed, &csize, 0);
assert_se(r < 0);
r = decompress("\100000000g", 9,
(void **) &decompressed, &csize, 0);
assert_se(r < 0);
explicit_bzero_safe(decompressed, MALLOC_SIZEOF_SAFE(decompressed));
} }
_unused_ static void test_decompress_startswith(const char *compression, TEST(compress_decompress_blob) {
compress_blob_t compress, for (Compression c = 0; c < _COMPRESSION_MAX; c++) {
decompress_sw_t decompress_sw, if (c == COMPRESSION_NONE || !compression_supported(c))
const char *data, continue;
size_t data_len,
bool may_fail) {
char *compressed; const char *label = compression_to_string(c);
_cleanup_free_ char *compressed1 = NULL, *compressed2 = NULL, *decompressed = NULL;
size_t csize, len;
int r;
log_info("/* testing decompress_startswith with %s on %.20s text */", for (size_t t = 0; t < 2; t++) {
compression, data); const char *input = t == 0 ? text : data;
size_t input_len = t == 0 ? sizeof(text) : sizeof(data);
bool may_fail = t == 1;
#define BUFSIZE_1 512 char compressed[512];
#define BUFSIZE_2 20000 size_t csize;
_cleanup_free_ char *decompressed = NULL;
int r;
compressed = compressed1 = malloc(BUFSIZE_1); log_info("/* testing %s %s blob compression/decompression */", label, input);
assert_se(compressed1);
r = compress(data, data_len, compressed, BUFSIZE_1, &csize, /* level= */ -1);
if (r == -ENOBUFS) {
log_info_errno(r, "compression failed: %m");
assert_se(may_fail);
compressed = compressed2 = malloc(BUFSIZE_2); r = compress_blob(c, input, input_len, compressed, sizeof(compressed), &csize, -1);
assert_se(compressed2); if (r == -ENOBUFS) {
r = compress(data, data_len, compressed, BUFSIZE_2, &csize, /* level= */ -1); log_info_errno(r, "compression failed: %m");
ASSERT_TRUE(may_fail);
} else {
ASSERT_OK(r);
ASSERT_OK_ZERO(decompress_blob(c, compressed, csize, (void **) &decompressed, &csize, 0));
ASSERT_NOT_NULL(decompressed);
ASSERT_EQ(memcmp(decompressed, input, input_len), 0);
}
ASSERT_FAIL(decompress_blob(c, "garbage", 7, (void **) &decompressed, &csize, 0));
}
} }
assert_se(r >= 0);
len = strlen(data);
r = decompress_sw(compressed, csize, (void **) &decompressed, data, len, '\0');
assert_se(r > 0);
r = decompress_sw(compressed, csize, (void **) &decompressed, data, len, 'w');
assert_se(r == 0);
r = decompress_sw(compressed, csize, (void **) &decompressed, "barbarbar", 9, ' ');
assert_se(r == 0);
r = decompress_sw(compressed, csize, (void **) &decompressed, data, len - 1, data[len-1]);
assert_se(r > 0);
r = decompress_sw(compressed, csize, (void **) &decompressed, data, len - 1, 'w');
assert_se(r == 0);
r = decompress_sw(compressed, csize, (void **) &decompressed, data, len, '\0');
assert_se(r > 0);
} }
_unused_ static void test_decompress_startswith_short(const char *compression, TEST(decompress_startswith) {
compress_blob_t compress, for (Compression c = 0; c < _COMPRESSION_MAX; c++) {
decompress_sw_t decompress_sw) { if (c == COMPRESSION_NONE || !compression_supported(c))
continue;
const char *label = compression_to_string(c);
struct { const char *buf; size_t len; bool may_fail; } inputs[] = {
{ text, sizeof(text), false },
{ data, sizeof(data), true },
{ huge, HUGE_SIZE, true },
};
for (size_t t = 0; t < ELEMENTSOF(inputs); t++) {
char *compressed;
_cleanup_free_ char *compressed1 = NULL, *compressed2 = NULL, *decompressed = NULL;
size_t csize, len;
int r;
log_info("/* testing decompress_startswith with %s on %.20s */", label, inputs[t].buf);
compressed = compressed1 = malloc(512);
ASSERT_NOT_NULL(compressed1);
r = compress_blob(c, inputs[t].buf, inputs[t].len, compressed, 512, &csize, -1);
if (r == -ENOBUFS) {
log_info_errno(r, "compression failed: %m");
ASSERT_TRUE(inputs[t].may_fail);
compressed = compressed2 = malloc(20000);
ASSERT_NOT_NULL(compressed2);
r = compress_blob(c, inputs[t].buf, inputs[t].len, compressed, 20000, &csize, -1);
}
if (r == -ENOBUFS) {
log_info_errno(r, "compression failed again: %m");
ASSERT_TRUE(inputs[t].may_fail);
continue;
}
ASSERT_OK(r);
len = strlen(inputs[t].buf);
ASSERT_OK_POSITIVE(decompress_startswith(c, compressed, csize, (void **) &decompressed, inputs[t].buf, len, '\0'));
ASSERT_OK_ZERO(decompress_startswith(c, compressed, csize, (void **) &decompressed, inputs[t].buf, len, 'w'));
ASSERT_OK_POSITIVE(decompress_startswith(c, compressed, csize, (void **) &decompressed, inputs[t].buf, len - 1, inputs[t].buf[len-1]));
ASSERT_OK_ZERO(decompress_startswith(c, compressed, csize, (void **) &decompressed, inputs[t].buf, len - 1, 'w'));
}
}
}
TEST(decompress_startswith_large) {
/* Test decompress_startswith with large data to exercise the buffer growth path. */
_cleanup_free_ char *large = NULL;
size_t large_size = 8 * 1024;
ASSERT_NOT_NULL(large = malloc(large_size));
for (size_t i = 0; i < large_size; i++)
large[i] = 'A' + (i % 26);
for (Compression c = 0; c < _COMPRESSION_MAX; c++) {
if (c == COMPRESSION_NONE || !compression_supported(c))
continue;
_cleanup_free_ char *compressed = NULL;
size_t csize;
log_info("/* decompress_startswith_large with %s */", compression_to_string(c));
ASSERT_NOT_NULL(compressed = malloc(large_size));
int r = compress_blob(c, large, large_size, compressed, large_size, &csize, -1);
if (r == -ENOBUFS) {
log_info_errno(r, "compression failed: %m");
continue;
}
ASSERT_OK(r);
_cleanup_free_ void *buf = NULL;
ASSERT_OK_POSITIVE(decompress_startswith(c, compressed, csize, &buf, large, 1, large[1]));
ASSERT_OK_ZERO(decompress_startswith(c, compressed, csize, &buf, large, 1, 0xff));
ASSERT_OK_POSITIVE(decompress_startswith(c, compressed, csize, &buf, large, 512, large[512]));
ASSERT_OK_ZERO(decompress_startswith(c, compressed, csize, &buf, large, 512, 0xff));
ASSERT_OK_POSITIVE(decompress_startswith(c, compressed, csize, &buf, large, 4096, large[4096]));
ASSERT_OK_ZERO(decompress_startswith(c, compressed, csize, &buf, large, 4096, 0xff));
}
}
TEST(decompress_startswith_short) {
#define TEXT "HUGE=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" #define TEXT "HUGE=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
char buf[1024]; for (Compression c = 0; c < _COMPRESSION_MAX; c++) {
size_t csize; if (c == COMPRESSION_NONE || !compression_supported(c))
int r; continue;
log_info("/* %s with %s */", __func__, compression); char buf[1024];
size_t csize;
r = compress(TEXT, sizeof TEXT, buf, sizeof buf, &csize, /* level= */ -1); log_info("/* decompress_startswith_short with %s */", compression_to_string(c));
assert_se(r >= 0);
for (size_t i = 1; i < strlen(TEXT); i++) { ASSERT_OK(compress_blob(c, TEXT, sizeof TEXT, buf, sizeof buf, &csize, -1));
_cleanup_free_ void *buf2 = NULL;
assert_se(buf2 = malloc(i)); for (size_t i = 1; i < strlen(TEXT); i++) {
_cleanup_free_ void *buf2 = NULL;
assert_se(decompress_sw(buf, csize, &buf2, TEXT, i, TEXT[i]) == 1); ASSERT_NOT_NULL(buf2 = malloc(i));
assert_se(decompress_sw(buf, csize, &buf2, TEXT, i, 'y') == 0);
ASSERT_OK_POSITIVE(decompress_startswith(c, buf, csize, &buf2, TEXT, i, TEXT[i]));
ASSERT_OK_ZERO(decompress_startswith(c, buf, csize, &buf2, TEXT, i, 'y'));
}
} }
#undef TEXT
} }
_unused_ static void test_compress_stream(const char *compression, TEST(compress_decompress_stream) {
const char *cat, for (Compression c = 0; c < _COMPRESSION_MAX; c++) {
compress_stream_t compress, if (c == COMPRESSION_NONE || !compression_supported(c))
decompress_stream_t decompress, continue;
const char *srcfile) {
_cleanup_close_ int src = -EBADF, dst = -EBADF, dst2 = -EBADF; const char *cat = cat_for_compression(c);
_cleanup_(unlink_tempfilep) char if (!cat)
pattern[] = "/tmp/systemd-test.compressed.XXXXXX", continue;
pattern2[] = "/tmp/systemd-test.compressed.XXXXXX";
int r;
_cleanup_free_ char *cmd = NULL, *cmd2 = NULL;
struct stat st = {};
uint64_t uncompressed_size;
r = find_executable(cat, NULL); int r = find_executable(cat, NULL);
if (r < 0) { if (r < 0) {
log_error_errno(r, "Skipping %s, could not find %s binary: %m", __func__, cat); log_error_errno(r, "Skipping %s, could not find %s binary: %m",
return; compression_to_string(c), cat);
} continue;
}
log_debug("/* testing %s compression */", compression); _cleanup_close_ int src = -EBADF, dst = -EBADF, dst2 = -EBADF;
log_debug("/* create source from %s */", srcfile);
ASSERT_OK(src = open(srcfile, O_RDONLY|O_CLOEXEC));
log_debug("/* test compression */");
assert_se((dst = mkostemp_safe(pattern)) >= 0);
ASSERT_OK(compress(src, dst, -1, &uncompressed_size));
if (cat) {
assert_se(asprintf(&cmd, "%s %s | diff '%s' -", cat, pattern, srcfile) > 0);
assert_se(system(cmd) == 0);
}
log_debug("/* test decompression */");
assert_se((dst2 = mkostemp_safe(pattern2)) >= 0);
assert_se(stat(srcfile, &st) == 0);
assert_se((uint64_t)st.st_size == uncompressed_size);
assert_se(lseek(dst, 0, SEEK_SET) == 0);
r = decompress(dst, dst2, st.st_size);
assert_se(r == 0);
assert_se(asprintf(&cmd2, "diff '%s' %s", srcfile, pattern2) > 0);
assert_se(system(cmd2) == 0);
log_debug("/* test faulty decompression */");
assert_se(lseek(dst, 1, SEEK_SET) == 1);
r = decompress(dst, dst2, st.st_size);
assert_se(IN_SET(r, 0, -EBADMSG));
assert_se(lseek(dst, 0, SEEK_SET) == 0);
assert_se(lseek(dst2, 0, SEEK_SET) == 0);
r = decompress(dst, dst2, st.st_size - 1);
assert_se(r == -EFBIG);
}
_unused_ static void test_decompress_stream_sparse(const char *compression,
compress_stream_t compress,
decompress_stream_t decompress) {
_cleanup_close_ int src = -EBADF, compressed = -EBADF, decompressed = -EBADF;
_cleanup_(unlink_tempfilep) char
pattern_src[] = "/tmp/systemd-test.sparse-src.XXXXXX",
pattern_compressed[] = "/tmp/systemd-test.sparse-compressed.XXXXXX",
pattern_decompressed[] = "/tmp/systemd-test.sparse-decompressed.XXXXXX";
/* Create a sparse-like input: 4K of data, 64K of zeros, 4K of data, 64K trailing zeros.
* Total apparent size: 136K, but most of it is zeros. */
uint8_t data_block[4096];
struct stat st_src, st_decompressed;
uint64_t uncompressed_size;
int r;
assert(compression);
log_debug("/* testing %s sparse decompression */", compression);
random_bytes(data_block, sizeof(data_block));
assert_se((src = mkostemp_safe(pattern_src)) >= 0);
/* Write: 4K data, 64K zeros, 4K data, 64K zeros */
assert_se(loop_write(src, data_block, sizeof(data_block)) >= 0);
assert_se(ftruncate(src, sizeof(data_block) + 65536) >= 0);
assert_se(lseek(src, sizeof(data_block) + 65536, SEEK_SET) >= 0);
assert_se(loop_write(src, data_block, sizeof(data_block)) >= 0);
assert_se(ftruncate(src, 2 * sizeof(data_block) + 2 * 65536) >= 0);
assert_se(lseek(src, 0, SEEK_SET) == 0);
assert_se(fstat(src, &st_src) >= 0);
assert_se(st_src.st_size == 2 * (off_t) sizeof(data_block) + 2 * 65536);
/* Compress */
assert_se((compressed = mkostemp_safe(pattern_compressed)) >= 0);
ASSERT_OK(compress(src, compressed, -1, &uncompressed_size));
assert_se((uint64_t) st_src.st_size == uncompressed_size);
/* Decompress to a regular file (sparse writes auto-detected) */
assert_se((decompressed = mkostemp_safe(pattern_decompressed)) >= 0);
assert_se(lseek(compressed, 0, SEEK_SET) == 0);
r = decompress(compressed, decompressed, st_src.st_size);
assert_se(r == 0);
/* Verify apparent size matches */
assert_se(fstat(decompressed, &st_decompressed) >= 0);
assert_se(st_decompressed.st_size == st_src.st_size);
/* Verify content matches by comparing bytes */
assert_se(lseek(src, 0, SEEK_SET) == 0);
assert_se(lseek(decompressed, 0, SEEK_SET) == 0);
for (off_t offset = 0; offset < st_src.st_size;) {
uint8_t buf_src[4096], buf_dst[4096];
size_t to_read = MIN((size_t) (st_src.st_size - offset), sizeof(buf_src));
ssize_t n;
n = loop_read(src, buf_src, to_read, true);
assert_se(n == (ssize_t) to_read);
n = loop_read(decompressed, buf_dst, to_read, true);
assert_se(n == (ssize_t) to_read);
assert_se(memcmp(buf_src, buf_dst, to_read) == 0);
offset += to_read;
}
/* Verify the decompressed file is actually sparse (uses less disk than apparent size).
* st_blocks is in 512-byte units. The file has 128K of zeros, so disk usage should be
* noticeably less than the apparent size if sparse writes worked.
* Only assert if the filesystem supports holes (SEEK_HOLE). */
log_debug("%s sparse decompression: apparent=%jd disk=%jd",
compression,
(intmax_t) st_decompressed.st_size,
(intmax_t) st_decompressed.st_blocks * 512);
if (lseek(decompressed, 0, SEEK_HOLE) < st_decompressed.st_size)
assert_se(st_decompressed.st_blocks * 512 < st_decompressed.st_size);
else
log_debug("Filesystem does not support holes, skipping sparsity check");
/* Test all-zeros input: entire output should be a hole */
log_debug("/* testing %s sparse decompression of all-zeros */", compression);
{
_cleanup_close_ int zsrc = -EBADF, zcompressed = -EBADF, zdecompressed = -EBADF;
_cleanup_(unlink_tempfilep) char _cleanup_(unlink_tempfilep) char
zp_src[] = "/tmp/systemd-test.sparse-zero-src.XXXXXX", pattern[] = "/tmp/systemd-test.compressed.XXXXXX",
zp_compressed[] = "/tmp/systemd-test.sparse-zero-compressed.XXXXXX", pattern2[] = "/tmp/systemd-test.compressed.XXXXXX";
zp_decompressed[] = "/tmp/systemd-test.sparse-zero-decompressed.XXXXXX"; _cleanup_free_ char *cmd = NULL, *cmd2 = NULL;
struct stat zst; struct stat st = {};
uint64_t zsize; uint64_t uncompressed_size;
uint8_t zeros[65536] = {};
assert_se((zsrc = mkostemp_safe(zp_src)) >= 0); log_debug("/* testing %s stream compression */", compression_to_string(c));
assert_se(loop_write(zsrc, zeros, sizeof(zeros)) >= 0);
assert_se(lseek(zsrc, 0, SEEK_SET) == 0);
assert_se((zcompressed = mkostemp_safe(zp_compressed)) >= 0); ASSERT_OK(src = open(srcfile, O_RDONLY|O_CLOEXEC));
ASSERT_OK(compress(zsrc, zcompressed, -1, &zsize)); ASSERT_OK(dst = mkostemp_safe(pattern));
assert_se(zsize == sizeof(zeros));
assert_se((zdecompressed = mkostemp_safe(zp_decompressed)) >= 0); ASSERT_OK(compress_stream(c, src, dst, -1, &uncompressed_size));
assert_se(lseek(zcompressed, 0, SEEK_SET) == 0);
assert_se(decompress(zcompressed, zdecompressed, sizeof(zeros)) == 0);
assert_se(fstat(zdecompressed, &zst) >= 0); ASSERT_OK_POSITIVE(asprintf(&cmd, "%s %s | diff '%s' -", cat, pattern, srcfile));
assert_se(zst.st_size == (off_t) sizeof(zeros)); ASSERT_OK_ZERO(system(cmd));
/* All zeros — disk usage should be minimal */
log_debug("%s all-zeros sparse: apparent=%jd disk=%jd", ASSERT_OK(dst2 = mkostemp_safe(pattern2));
compression, (intmax_t) zst.st_size, (intmax_t) zst.st_blocks * 512);
if (lseek(zdecompressed, 0, SEEK_HOLE) < zst.st_size) ASSERT_OK_ZERO_ERRNO(stat(srcfile, &st));
assert_se(zst.st_blocks * 512 < zst.st_size); ASSERT_EQ((uint64_t) st.st_size, uncompressed_size);
ASSERT_OK_ERRNO(lseek(dst, 0, SEEK_SET));
ASSERT_OK_ZERO(decompress_stream(c, dst, dst2, st.st_size));
ASSERT_OK_POSITIVE(asprintf(&cmd2, "diff '%s' %s", srcfile, pattern2));
ASSERT_OK_ZERO(system(cmd2));
log_debug("/* test faulty decompression */");
ASSERT_OK_ERRNO(lseek(dst, 1, SEEK_SET));
r = decompress_stream(c, dst, dst2, st.st_size);
ASSERT_TRUE(IN_SET(r, 0, -EBADMSG));
ASSERT_OK_ERRNO(lseek(dst, 0, SEEK_SET));
ASSERT_OK_ERRNO(lseek(dst2, 0, SEEK_SET));
ASSERT_ERROR(decompress_stream(c, dst, dst2, st.st_size - 1), EFBIG);
}
}
struct decompressor_test_data {
uint8_t *buf;
size_t size;
};
static int test_decompressor_callback(const void *p, size_t size, void *userdata) {
struct decompressor_test_data *d = ASSERT_PTR(userdata);
if (!GREEDY_REALLOC(d->buf, d->size + size))
return -ENOMEM;
memcpy(d->buf + d->size, p, size);
d->size += size;
return 0;
}
TEST(decompress_stream_sparse) {
for (Compression c = 0; c < _COMPRESSION_MAX; c++) {
if (c == COMPRESSION_NONE || !compression_supported(c))
continue;
_cleanup_close_ int src = -EBADF, compressed = -EBADF, decompressed = -EBADF;
_cleanup_(unlink_tempfilep) char
pattern_src[] = "/tmp/systemd-test.sparse-src.XXXXXX",
pattern_compressed[] = "/tmp/systemd-test.sparse-compressed.XXXXXX",
pattern_decompressed[] = "/tmp/systemd-test.sparse-decompressed.XXXXXX";
/* Create a sparse-like input: 4K of data, 64K of zeros, 4K of data, 64K trailing zeros.
* Total apparent size: 136K, but most of it is zeros. */
uint8_t data_block[4096];
struct stat st_src, st_decompressed;
uint64_t uncompressed_size;
log_debug("/* testing %s sparse decompression */", compression_to_string(c));
random_bytes(data_block, sizeof(data_block));
ASSERT_OK(src = mkostemp_safe(pattern_src));
/* Write: 4K data, 64K zeros, 4K data, 64K zeros */
ASSERT_OK(loop_write(src, data_block, sizeof(data_block)));
ASSERT_OK_ERRNO(ftruncate(src, sizeof(data_block) + 65536));
ASSERT_OK_ERRNO(lseek(src, sizeof(data_block) + 65536, SEEK_SET));
ASSERT_OK(loop_write(src, data_block, sizeof(data_block)));
ASSERT_OK_ERRNO(ftruncate(src, 2 * sizeof(data_block) + 2 * 65536));
ASSERT_EQ(lseek(src, 0, SEEK_SET), (off_t) 0);
ASSERT_OK_ERRNO(fstat(src, &st_src));
ASSERT_EQ(st_src.st_size, 2 * (off_t) sizeof(data_block) + 2 * 65536);
/* Compress */
ASSERT_OK(compressed = mkostemp_safe(pattern_compressed));
ASSERT_OK(compress_stream(c, src, compressed, -1, &uncompressed_size));
ASSERT_EQ((uint64_t) st_src.st_size, uncompressed_size);
/* Decompress to a regular file (sparse writes auto-detected) */
ASSERT_OK(decompressed = mkostemp_safe(pattern_decompressed));
ASSERT_EQ(lseek(compressed, 0, SEEK_SET), (off_t) 0);
ASSERT_OK_ZERO(decompress_stream(c, compressed, decompressed, st_src.st_size));
/* Verify apparent size matches */
ASSERT_OK_ERRNO(fstat(decompressed, &st_decompressed));
ASSERT_EQ(st_decompressed.st_size, st_src.st_size);
/* Verify content matches by comparing bytes */
ASSERT_EQ(lseek(src, 0, SEEK_SET), (off_t) 0);
ASSERT_EQ(lseek(decompressed, 0, SEEK_SET), (off_t) 0);
for (off_t offset = 0; offset < st_src.st_size;) {
uint8_t buf_src[4096], buf_dst[4096];
size_t to_read = MIN((size_t) (st_src.st_size - offset), sizeof(buf_src));
ASSERT_EQ(loop_read(src, buf_src, to_read, true), (ssize_t) to_read);
ASSERT_EQ(loop_read(decompressed, buf_dst, to_read, true), (ssize_t) to_read);
ASSERT_EQ(memcmp(buf_src, buf_dst, to_read), 0);
offset += to_read;
}
/* Verify the decompressed file is actually sparse (uses less disk than apparent size).
* st_blocks is in 512-byte units. The file has 128K of zeros, so disk usage should be
* noticeably less than the apparent size if sparse writes worked.
* Only assert if the filesystem supports holes (SEEK_HOLE). */
log_debug("%s sparse decompression: apparent=%jd disk=%jd",
compression_to_string(c),
(intmax_t) st_decompressed.st_size,
(intmax_t) st_decompressed.st_blocks * 512);
if (lseek(decompressed, 0, SEEK_HOLE) < st_decompressed.st_size)
ASSERT_LT(st_decompressed.st_blocks * 512, st_decompressed.st_size);
else else
log_debug("Filesystem does not support holes, skipping sparsity check"); log_debug("Filesystem does not support holes, skipping sparsity check");
}
/* Test data ending with non-zero bytes: ftruncate should be a no-op */ /* Test all-zeros input: entire output should be a hole */
log_debug("/* testing %s sparse decompression ending with data */", compression); log_debug("/* testing %s sparse decompression of all-zeros */", compression_to_string(c));
{ {
_cleanup_close_ int dsrc = -EBADF, dcompressed = -EBADF, ddecompressed = -EBADF; _cleanup_close_ int zsrc = -EBADF, zcompressed = -EBADF, zdecompressed = -EBADF;
_cleanup_(unlink_tempfilep) char _cleanup_(unlink_tempfilep) char
dp_src[] = "/tmp/systemd-test.sparse-end-src.XXXXXX", zp_src[] = "/tmp/systemd-test.sparse-zero-src.XXXXXX",
dp_compressed[] = "/tmp/systemd-test.sparse-end-compressed.XXXXXX", zp_compressed[] = "/tmp/systemd-test.sparse-zero-compressed.XXXXXX",
dp_decompressed[] = "/tmp/systemd-test.sparse-end-decompressed.XXXXXX"; zp_decompressed[] = "/tmp/systemd-test.sparse-zero-decompressed.XXXXXX";
struct stat dst; struct stat zst;
uint64_t dsize; uint64_t zsize;
uint8_t zeros[65536] = {}; uint8_t zeros[65536] = {};
/* 64K zeros followed by 4K random data */ ASSERT_OK(zsrc = mkostemp_safe(zp_src));
assert_se((dsrc = mkostemp_safe(dp_src)) >= 0); ASSERT_OK(loop_write(zsrc, zeros, sizeof(zeros)));
assert_se(loop_write(dsrc, zeros, sizeof(zeros)) >= 0); ASSERT_EQ(lseek(zsrc, 0, SEEK_SET), (off_t) 0);
assert_se(loop_write(dsrc, data_block, sizeof(data_block)) >= 0);
assert_se(lseek(dsrc, 0, SEEK_SET) == 0);
assert_se((dcompressed = mkostemp_safe(dp_compressed)) >= 0); ASSERT_OK(zcompressed = mkostemp_safe(zp_compressed));
ASSERT_OK(compress(dsrc, dcompressed, -1, &dsize)); ASSERT_OK(compress_stream(c, zsrc, zcompressed, -1, &zsize));
assert_se(dsize == sizeof(zeros) + sizeof(data_block)); ASSERT_EQ(zsize, (uint64_t) sizeof(zeros));
assert_se((ddecompressed = mkostemp_safe(dp_decompressed)) >= 0); ASSERT_OK(zdecompressed = mkostemp_safe(zp_decompressed));
assert_se(lseek(dcompressed, 0, SEEK_SET) == 0); ASSERT_EQ(lseek(zcompressed, 0, SEEK_SET), (off_t) 0);
assert_se(decompress(dcompressed, ddecompressed, dsize) == 0); ASSERT_OK_ZERO(decompress_stream(c, zcompressed, zdecompressed, sizeof(zeros)));
assert_se(fstat(ddecompressed, &dst) >= 0); ASSERT_OK_ERRNO(fstat(zdecompressed, &zst));
assert_se(dst.st_size == (off_t)(sizeof(zeros) + sizeof(data_block))); ASSERT_EQ(zst.st_size, (off_t) sizeof(zeros));
/* All zeros — disk usage should be minimal */
log_debug("%s all-zeros sparse: apparent=%jd disk=%jd",
compression_to_string(c), (intmax_t) zst.st_size, (intmax_t) zst.st_blocks * 512);
if (lseek(zdecompressed, 0, SEEK_HOLE) < zst.st_size)
ASSERT_LT(zst.st_blocks * 512, zst.st_size);
else
log_debug("Filesystem does not support holes, skipping sparsity check");
}
/* Test data ending with non-zero bytes: ftruncate should be a no-op */
log_debug("/* testing %s sparse decompression ending with data */", compression_to_string(c));
{
_cleanup_close_ int dsrc = -EBADF, dcompressed = -EBADF, ddecompressed = -EBADF;
_cleanup_(unlink_tempfilep) char
dp_src[] = "/tmp/systemd-test.sparse-end-src.XXXXXX",
dp_compressed[] = "/tmp/systemd-test.sparse-end-compressed.XXXXXX",
dp_decompressed[] = "/tmp/systemd-test.sparse-end-decompressed.XXXXXX";
struct stat dst;
uint64_t dsize;
uint8_t zeros[65536] = {};
/* 64K zeros followed by 4K random data */
ASSERT_OK(dsrc = mkostemp_safe(dp_src));
ASSERT_OK(loop_write(dsrc, zeros, sizeof(zeros)));
ASSERT_OK(loop_write(dsrc, data_block, sizeof(data_block)));
ASSERT_EQ(lseek(dsrc, 0, SEEK_SET), (off_t) 0);
ASSERT_OK(dcompressed = mkostemp_safe(dp_compressed));
ASSERT_OK(compress_stream(c, dsrc, dcompressed, -1, &dsize));
ASSERT_EQ(dsize, (uint64_t)(sizeof(zeros) + sizeof(data_block)));
ASSERT_OK(ddecompressed = mkostemp_safe(dp_decompressed));
ASSERT_EQ(lseek(dcompressed, 0, SEEK_SET), (off_t) 0);
ASSERT_OK_ZERO(decompress_stream(c, dcompressed, ddecompressed, dsize));
ASSERT_OK_ERRNO(fstat(ddecompressed, &dst));
ASSERT_EQ(dst.st_size, (off_t)(sizeof(zeros) + sizeof(data_block)));
}
} }
} }
#endif
#if HAVE_LZ4 TEST(compressor_decompressor_push_api) {
extern DLSYM_PROTOTYPE(LZ4_compress_default); for (Compression c = 0; c < _COMPRESSION_MAX; c++) {
extern DLSYM_PROTOTYPE(LZ4_decompress_safe); if (c == COMPRESSION_NONE || !compression_supported(c))
extern DLSYM_PROTOTYPE(LZ4_decompress_safe_partial); continue;
extern DLSYM_PROTOTYPE(LZ4_versionNumber);
static void test_lz4_decompress_partial(void) { log_info("/* testing %s Compressor/Decompressor push API */", compression_to_string(c));
char buf[20000], buf2[100];
size_t buf_size = sizeof(buf), compressed;
int r;
_cleanup_free_ char *huge = NULL;
log_debug("/* %s */", __func__); _cleanup_(compressor_freep) Compressor *compressor = NULL;
_cleanup_(compressor_freep) Decompressor *decompressor = NULL;
_cleanup_free_ void *compressed = NULL, *finish_buf = NULL;
size_t compressed_size = 0, compressed_alloc = 0;
size_t finish_size = 0, finish_alloc = 0;
assert_se(huge = malloc(HUGE_SIZE)); /* Compress */
ASSERT_OK(compressor_new(&compressor, c));
ASSERT_EQ(compressor_type(compressor), c);
ASSERT_OK(compressor_start(compressor, text, sizeof(text), &compressed, &compressed_size, &compressed_alloc));
ASSERT_OK(compressor_finish(compressor, &finish_buf, &finish_size, &finish_alloc));
size_t total_compressed = compressed_size + finish_size;
_cleanup_free_ void *full_compressed = malloc(total_compressed);
ASSERT_NOT_NULL(full_compressed);
memcpy(full_compressed, compressed, compressed_size);
if (finish_size > 0)
memcpy((uint8_t*) full_compressed + compressed_size, finish_buf, finish_size);
compressor = compressor_free(compressor);
/* Decompress via detect + push and verify content */
ASSERT_OK_POSITIVE(decompressor_detect(&decompressor, full_compressed, total_compressed));
ASSERT_EQ(compressor_type(decompressor), c);
struct decompressor_test_data result = {};
ASSERT_OK(decompressor_push(decompressor, full_compressed, total_compressed, test_decompressor_callback, &result));
ASSERT_EQ(result.size, sizeof(text));
ASSERT_EQ(memcmp(result.buf, text, sizeof(text)), 0);
free(result.buf);
decompressor = compressor_free(decompressor);
}
/* Test compressor_type on NULL */
ASSERT_EQ(compressor_type(NULL), _COMPRESSION_INVALID);
/* Test decompressor_force_off */
_cleanup_(compressor_freep) Decompressor *d = NULL;
ASSERT_OK(decompressor_force_off(&d));
ASSERT_EQ(compressor_type(d), COMPRESSION_NONE);
d = compressor_free(d);
/* Test decompressor_detect returning 0 on too-small input */
ASSERT_OK_ZERO(decompressor_detect(&d, "x", 1));
ASSERT_NULL(d);
}
static int intro(void) {
srcfile = saved_argc > 1 ? saved_argv[1] : saved_argv[0];
ASSERT_NOT_NULL(huge = malloc(HUGE_SIZE));
memcpy(huge, "HUGE=", STRLEN("HUGE="));
memset(&huge[STRLEN("HUGE=")], 'x', HUGE_SIZE - STRLEN("HUGE=") - 1);
huge[HUGE_SIZE - 1] = '\0';
r = sym_LZ4_compress_default(huge, buf, HUGE_SIZE, buf_size);
assert_se(r >= 0);
compressed = r;
log_info("Compressed %i → %zu", HUGE_SIZE, compressed);
r = sym_LZ4_decompress_safe(buf, huge, r, HUGE_SIZE);
assert_se(r >= 0);
log_info("Decompressed → %i", r);
r = sym_LZ4_decompress_safe_partial(buf, huge,
compressed,
12, HUGE_SIZE);
assert_se(r >= 0);
log_info("Decompressed partial %i/%i → %i", 12, HUGE_SIZE, r);
for (size_t size = 1; size < sizeof(buf2); size++) {
/* This failed in older lz4s but works in newer ones. */
r = sym_LZ4_decompress_safe_partial(buf, buf2, compressed, size, size);
log_info("Decompressed partial %zu/%zu → %i (%s)", size, size, r,
r < 0 ? "bad" : "good");
if (r >= 0 && sym_LZ4_versionNumber() >= 10803)
/* lz4 <= 1.8.2 should fail that test, let's only check for newer ones */
assert_se(memcmp(buf2, huge, r) == 0);
}
}
#endif
int main(int argc, char *argv[]) {
#if HAVE_COMPRESSION
_unused_ const char text[] =
"text\0foofoofoofoo AAAA aaaaaaaaa ghost busters barbarbar FFF"
"foofoofoofoo AAAA aaaaaaaaa ghost busters barbarbar FFF";
/* The file to test compression on can be specified as the first argument */
const char *srcfile = argc > 1 ? argv[1] : argv[0];
char data[512] = "random\0";
_cleanup_free_ char *huge = NULL;
assert_se(huge = malloc(HUGE_SIZE));
memcpy(huge, "HUGE=", STRLEN("HUGE="));
memset(&huge[STRLEN("HUGE=")], 'x', HUGE_SIZE - STRLEN("HUGE=") - 1);
huge[HUGE_SIZE - 1] = '\0';
test_setup_logging(LOG_DEBUG);
random_bytes(data + 7, sizeof(data) - 7);
#if HAVE_XZ
test_compress_decompress("XZ", compress_blob_xz, decompress_blob_xz,
text, sizeof(text), false);
test_compress_decompress("XZ", compress_blob_xz, decompress_blob_xz,
data, sizeof(data), true);
test_decompress_startswith("XZ",
compress_blob_xz, decompress_startswith_xz,
text, sizeof(text), false);
test_decompress_startswith("XZ",
compress_blob_xz, decompress_startswith_xz,
data, sizeof(data), true);
test_decompress_startswith("XZ",
compress_blob_xz, decompress_startswith_xz,
huge, HUGE_SIZE, true);
test_compress_stream("XZ", "xzcat",
compress_stream_xz, decompress_stream_xz, srcfile);
test_decompress_stream_sparse("XZ", compress_stream_xz, decompress_stream_xz);
test_decompress_startswith_short("XZ", compress_blob_xz, decompress_startswith_xz);
#else
log_info("/* XZ test skipped */");
#endif
#if HAVE_LZ4
if (dlopen_lz4() >= 0) {
test_compress_decompress("LZ4", compress_blob_lz4, decompress_blob_lz4,
text, sizeof(text), false);
test_compress_decompress("LZ4", compress_blob_lz4, decompress_blob_lz4,
data, sizeof(data), true);
test_decompress_startswith("LZ4",
compress_blob_lz4, decompress_startswith_lz4,
text, sizeof(text), false);
test_decompress_startswith("LZ4",
compress_blob_lz4, decompress_startswith_lz4,
data, sizeof(data), true);
test_decompress_startswith("LZ4",
compress_blob_lz4, decompress_startswith_lz4,
huge, HUGE_SIZE, true);
test_compress_stream("LZ4", "lz4cat",
compress_stream_lz4, decompress_stream_lz4, srcfile);
test_decompress_stream_sparse("LZ4", compress_stream_lz4, decompress_stream_lz4);
test_lz4_decompress_partial();
test_decompress_startswith_short("LZ4", compress_blob_lz4, decompress_startswith_lz4);
} else
log_error("/* Can't load liblz4 */");
#else
log_info("/* LZ4 test skipped */");
#endif
#if HAVE_ZSTD
test_compress_decompress("ZSTD", compress_blob_zstd, decompress_blob_zstd,
text, sizeof(text), false);
test_compress_decompress("ZSTD", compress_blob_zstd, decompress_blob_zstd,
data, sizeof(data), true);
test_decompress_startswith("ZSTD",
compress_blob_zstd, decompress_startswith_zstd,
text, sizeof(text), false);
test_decompress_startswith("ZSTD",
compress_blob_zstd, decompress_startswith_zstd,
data, sizeof(data), true);
test_decompress_startswith("ZSTD",
compress_blob_zstd, decompress_startswith_zstd,
huge, HUGE_SIZE, true);
test_compress_stream("ZSTD", "zstdcat",
compress_stream_zstd, decompress_stream_zstd, srcfile);
test_decompress_stream_sparse("ZSTD", compress_stream_zstd, decompress_stream_zstd);
test_decompress_startswith_short("ZSTD", compress_blob_zstd, decompress_startswith_zstd);
#else
log_info("/* ZSTD test skipped */");
#endif
return 0; return 0;
#else
return log_tests_skipped("no compression algorithm supported");
#endif
} }
DEFINE_TEST_MAIN_WITH_INTRO(LOG_DEBUG, intro);
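
The sparsity assertions in `decompress_stream_sparse` rest on two facts the test comments state: `st_blocks` is counted in 512-byte units, and `lseek(fd, 0, SEEK_HOLE)` returns an offset below the file size only when the filesystem reports a hole. A minimal standalone sketch of that check, outside the systemd test framework, might look like the following. The helper names `make_temp_file()` and `fd_looks_sparse()` are hypothetical and are not part of systemd; the `SEEK_HOLE` fallback value is the Linux one and is an assumption for other platforms.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes SEEK_HOLE on glibc */
#endif
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SEEK_HOLE
#define SEEK_HOLE 4 /* assumption: Linux value, used only if the headers did not define it */
#endif

/* Hypothetical helper: create an already-unlinked temporary file holding
 * data_bytes of non-zero data, then extend it to total_size with ftruncate()
 * so that the tail is never written. Returns an fd, or -1 on error. */
static int make_temp_file(size_t data_bytes, off_t total_size) {
        char path[] = "/tmp/sparse-sketch.XXXXXX";
        char block[4096];
        int fd;

        fd = mkstemp(path);
        if (fd < 0)
                return -1;
        unlink(path); /* the file stays alive until fd is closed */

        memset(block, 'x', sizeof(block));
        while (data_bytes > 0) {
                size_t n = data_bytes < sizeof(block) ? data_bytes : sizeof(block);
                ssize_t w = write(fd, block, n);
                if (w <= 0) {
                        close(fd);
                        return -1;
                }
                data_bytes -= (size_t) w;
        }

        if (ftruncate(fd, total_size) < 0) {
                close(fd);
                return -1;
        }
        return fd;
}

/* Hypothetical helper: returns 1 if the file has a hole before EOF and uses
 * less disk space than its apparent size, 0 if no hole is reported (the check
 * cannot say anything then), -1 on error. Note that lseek() moves the file
 * offset of fd. */
static int fd_looks_sparse(int fd) {
        struct stat st;
        off_t hole;

        assert(fd >= 0);

        if (fstat(fd, &st) < 0)
                return -1;
        if (st.st_size == 0)
                return 0;

        hole = lseek(fd, 0, SEEK_HOLE);
        if (hole < 0)
                return -1;
        if (hole >= st.st_size)
                return 0; /* only the implicit hole at EOF: filesystem reports no holes */

        return (off_t) st.st_blocks * 512 < st.st_size ? 1 : 0;
}
```

Calling `fd_looks_sparse()` on the result of `make_temp_file(4096, 1024 * 1024)` should give 1 where the unwritten tail is kept as a hole and 0 where holes are not reported, which is why the test logs a message and skips the assertion in the second case instead of failing.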


@@ -42,6 +42,7 @@ static int run(int argc, char **argv) {
* where .so versions change and distributions update, but systemd doesn't have the new so names
* around yet. */
ASSERT_DLOPEN(dlopen_bzip2, HAVE_BZIP2);
ASSERT_DLOPEN(dlopen_bpf, HAVE_LIBBPF);
ASSERT_DLOPEN(dlopen_cryptsetup, HAVE_LIBCRYPTSETUP);
ASSERT_DLOPEN(dlopen_dw, HAVE_ELFUTILS);
@@ -60,14 +61,15 @@ static int run(int argc, char **argv) {
ASSERT_DLOPEN(dlopen_libpam, HAVE_PAM);
ASSERT_DLOPEN(dlopen_libseccomp, HAVE_SECCOMP);
ASSERT_DLOPEN(dlopen_libselinux, HAVE_SELINUX);
ASSERT_DLOPEN(dlopen_xz, HAVE_XZ);
ASSERT_DLOPEN(dlopen_lz4, HAVE_LZ4);
ASSERT_DLOPEN(dlopen_lzma, HAVE_XZ);
ASSERT_DLOPEN(dlopen_p11kit, HAVE_P11KIT);
ASSERT_DLOPEN(dlopen_passwdqc, HAVE_PASSWDQC);
ASSERT_DLOPEN(dlopen_pcre2, HAVE_PCRE2);
ASSERT_DLOPEN(dlopen_pwquality, HAVE_PWQUALITY);
ASSERT_DLOPEN(dlopen_qrencode, HAVE_QRENCODE);
ASSERT_DLOPEN(dlopen_tpm2, HAVE_TPM2);
ASSERT_DLOPEN(dlopen_zlib, HAVE_ZLIB);
ASSERT_DLOPEN(dlopen_zstd, HAVE_ZSTD);
return 0;


@@ -1,307 +0,0 @@
/* SPDX-License-Identifier: LGPL-2.1-or-later */
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include "sd-bus.h"
#include "sd-event.h"
#include "bus-locator.h"
#include "bus-wait-for-jobs.h"
#include "event-util.h"
#include "fd-util.h"
#include "format-util.h"
#include "hashmap.h"
#include "path-util.h"
#include "pidref.h"
#include "process-util.h"
#include "random-util.h"
#include "rm-rf.h"
#include "signal-util.h"
#include "socket-util.h"
#include "tests.h"
#include "time-util.h"
#include "tmpfile-util.h"
#include "unit-def.h"
struct fake_pressure_context {
int fifo_fd;
int socket_fd;
};
static void *fake_pressure_thread(void *p) {
_cleanup_free_ struct fake_pressure_context *c = ASSERT_PTR(p);
_cleanup_close_ int cfd = -EBADF;
usleep_safe(150);
assert_se(write(c->fifo_fd, &(const char) { 'x' }, 1) == 1);
usleep_safe(150);
cfd = accept4(c->socket_fd, NULL, NULL, SOCK_CLOEXEC);
assert_se(cfd >= 0);
char buf[STRLEN("hello")+1] = {};
assert_se(read(cfd, buf, sizeof(buf)-1) == sizeof(buf)-1);
ASSERT_STREQ(buf, "hello");
assert_se(write(cfd, &(const char) { 'z' }, 1) == 1);
return NULL;
}
static int fake_pressure_callback(sd_event_source *s, void *userdata) {
int *value = userdata;
const char *d;
assert_se(s);
assert_se(sd_event_source_get_description(s, &d) >= 0);
*value *= d[0];
log_notice("memory pressure event: %s", d);
if (*value == 7 * 'f' * 's')
assert_se(sd_event_exit(sd_event_source_get_event(s), 0) >= 0);
return 0;
}
TEST(fake_pressure) {
_cleanup_(sd_event_source_unrefp) sd_event_source *es = NULL, *ef = NULL;
_cleanup_(sd_event_unrefp) sd_event *e = NULL;
_cleanup_free_ char *j = NULL, *k = NULL;
_cleanup_(rm_rf_physical_and_freep) char *tmp = NULL;
_cleanup_close_ int fifo_fd = -EBADF, socket_fd = -EBADF;
union sockaddr_union sa;
pthread_t th;
int value = 7;
assert_se(sd_event_default(&e) >= 0);
assert_se(mkdtemp_malloc(NULL, &tmp) >= 0);
assert_se(j = path_join(tmp, "fifo"));
assert_se(mkfifo(j, 0600) >= 0);
fifo_fd = open(j, O_CLOEXEC|O_RDWR|O_NONBLOCK);
assert_se(fifo_fd >= 0);
assert_se(k = path_join(tmp, "sock"));
socket_fd = socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0);
assert_se(socket_fd >= 0);
assert_se(sockaddr_un_set_path(&sa.un, k) >= 0);
assert_se(bind(socket_fd, &sa.sa, sockaddr_un_len(&sa.un)) >= 0);
assert_se(listen(socket_fd, 1) >= 0);
/* Ideally we'd just allocate this on the stack, but AddressSanitizer doesn't like it if threads
* access each other's stack */
struct fake_pressure_context *fp = new(struct fake_pressure_context, 1);
assert_se(fp);
*fp = (struct fake_pressure_context) {
.fifo_fd = fifo_fd,
.socket_fd = socket_fd,
};
assert_se(pthread_create(&th, NULL, fake_pressure_thread, TAKE_PTR(fp)) == 0);
assert_se(setenv("MEMORY_PRESSURE_WATCH", j, /* override= */ true) >= 0);
assert_se(unsetenv("MEMORY_PRESSURE_WRITE") >= 0);
assert_se(sd_event_add_memory_pressure(e, &es, fake_pressure_callback, &value) >= 0);
assert_se(sd_event_source_set_description(es, "fifo event source") >= 0);
assert_se(setenv("MEMORY_PRESSURE_WATCH", k, /* override= */ true) >= 0);
assert_se(setenv("MEMORY_PRESSURE_WRITE", "aGVsbG8K", /* override= */ true) >= 0);
assert_se(sd_event_add_memory_pressure(e, &ef, fake_pressure_callback, &value) >= 0);
assert_se(sd_event_source_set_description(ef, "socket event source") >= 0);
assert_se(sd_event_loop(e) >= 0);
assert_se(value == 7 * 'f' * 's');
assert_se(pthread_join(th, NULL) == 0);
}
struct real_pressure_context {
sd_event_source *pid;
};
static int real_pressure_callback(sd_event_source *s, void *userdata) {
struct real_pressure_context *c = ASSERT_PTR(userdata);
const char *d;
assert_se(s);
assert_se(sd_event_source_get_description(s, &d) >= 0);
log_notice("real_memory pressure event: %s", d);
sd_event_trim_memory();
assert_se(c->pid);
assert_se(sd_event_source_send_child_signal(c->pid, SIGKILL, NULL, 0) >= 0);
c->pid = NULL;
return 0;
}
#define MMAP_SIZE (10 * 1024 * 1024)
_noreturn_ static void real_pressure_eat_memory(int pipe_fd) {
size_t ate = 0;
/* Allocates and touches 10M at a time, until it runs out of memory */
char x;
assert_se(read(pipe_fd, &x, 1) == 1); /* Wait for the GO! */
for (;;) {
void *p;
p = mmap(NULL, MMAP_SIZE, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
assert_se(p != MAP_FAILED);
log_info("Eating another %s.", FORMAT_BYTES(MMAP_SIZE));
memset(p, random_u32() & 0xFF, MMAP_SIZE);
ate += MMAP_SIZE;
log_info("Ate %s in total.", FORMAT_BYTES(ate));
usleep_safe(50 * USEC_PER_MSEC);
}
}
static int real_pressure_child_callback(sd_event_source *s, const siginfo_t *si, void *userdata) {
assert_se(s);
assert_se(si);
log_notice("child dead");
assert_se(si->si_signo == SIGCHLD);
assert_se(si->si_status == SIGKILL);
assert_se(si->si_code == CLD_KILLED);
assert_se(sd_event_exit(sd_event_source_get_event(s), 31) >= 0);
return 0;
}
TEST(real_pressure) {
_cleanup_(sd_bus_message_unrefp) sd_bus_message *m = NULL, *reply = NULL;
_cleanup_(sd_bus_error_free) sd_bus_error error = SD_BUS_ERROR_NULL;
_cleanup_(sd_event_source_unrefp) sd_event_source *es = NULL, *cs = NULL;
_cleanup_(bus_wait_for_jobs_freep) BusWaitForJobs *w = NULL;
_cleanup_(sd_bus_flush_close_unrefp) sd_bus *bus = NULL;
_cleanup_close_pair_ int pipe_fd[2] = EBADF_PAIR;
_cleanup_(sd_event_unrefp) sd_event *e = NULL;
_cleanup_free_ char *scope = NULL;
const char *object;
int r;
r = sd_bus_open_system(&bus);
if (r < 0)
return (void) log_tests_skipped_errno(r, "can't connect to system bus");
assert_se(bus_wait_for_jobs_new(bus, &w) >= 0);
assert_se(bus_message_new_method_call(bus, &m, bus_systemd_mgr, "StartTransientUnit") >= 0);
assert_se(asprintf(&scope, "test-%" PRIu64 ".scope", random_u64()) >= 0);
assert_se(sd_bus_message_append(m, "ss", scope, "fail") >= 0);
assert_se(sd_bus_message_open_container(m, 'a', "(sv)") >= 0);
assert_se(sd_bus_message_append(m, "(sv)", "PIDs", "au", 1, 0) >= 0);
assert_se(sd_bus_message_append(m, "(sv)", "MemoryAccounting", "b", true) >= 0);
assert_se(sd_bus_message_close_container(m) >= 0);
assert_se(sd_bus_message_append(m, "a(sa(sv))", 0) >= 0);
r = sd_bus_call(bus, m, 0, &error, &reply);
if (r < 0)
return (void) log_tests_skipped_errno(r, "can't issue transient unit call");
assert_se(sd_bus_message_read(reply, "o", &object) >= 0);
assert_se(bus_wait_for_jobs_one(w, object, /* flags= */ BUS_WAIT_JOBS_LOG_ERROR, /* extra_args= */ NULL) >= 0);
assert_se(sd_event_default(&e) >= 0);
assert_se(pipe2(pipe_fd, O_CLOEXEC) >= 0);
_cleanup_(pidref_done) PidRef pidref = PIDREF_NULL;
r = pidref_safe_fork("(eat-memory)", FORK_RESET_SIGNALS|FORK_DEATHSIG_SIGTERM, &pidref);
assert_se(r >= 0);
if (r == 0) {
real_pressure_eat_memory(pipe_fd[0]);
_exit(EXIT_SUCCESS);
}
assert_se(event_add_child_pidref(e, &cs, &pidref, WEXITED, real_pressure_child_callback, NULL) >= 0);
assert_se(sd_event_source_set_child_process_own(cs, true) >= 0);
assert_se(unsetenv("MEMORY_PRESSURE_WATCH") >= 0);
assert_se(unsetenv("MEMORY_PRESSURE_WRITE") >= 0);
struct real_pressure_context context = {
.pid = cs,
};
r = sd_event_add_memory_pressure(e, &es, real_pressure_callback, &context);
if (r < 0)
return (void) log_tests_skipped_errno(r, "can't allocate memory pressure fd");
assert_se(sd_event_source_set_description(es, "real pressure event source") >= 0);
assert_se(sd_event_source_set_memory_pressure_type(es, "some") == 0);
assert_se(sd_event_source_set_memory_pressure_type(es, "full") > 0);
assert_se(sd_event_source_set_memory_pressure_type(es, "full") == 0);
assert_se(sd_event_source_set_memory_pressure_type(es, "some") > 0);
assert_se(sd_event_source_set_memory_pressure_type(es, "some") == 0);
assert_se(sd_event_source_set_memory_pressure_period(es, 70 * USEC_PER_MSEC, USEC_PER_SEC) > 0);
assert_se(sd_event_source_set_memory_pressure_period(es, 70 * USEC_PER_MSEC, USEC_PER_SEC) == 0);
assert_se(sd_event_source_set_enabled(es, SD_EVENT_ONESHOT) >= 0);
_cleanup_free_ char *uo = NULL;
assert_se(uo = unit_dbus_path_from_name(scope));
uint64_t mcurrent = UINT64_MAX;
assert_se(sd_bus_get_property_trivial(bus, "org.freedesktop.systemd1", uo, "org.freedesktop.systemd1.Scope", "MemoryCurrent", &error, 't', &mcurrent) >= 0);
printf("current: %" PRIu64 "\n", mcurrent);
if (mcurrent == UINT64_MAX)
return (void) log_tests_skipped_errno(r, "memory accounting not available");
m = sd_bus_message_unref(m);
assert_se(bus_message_new_method_call(bus, &m, bus_systemd_mgr, "SetUnitProperties") >= 0);
assert_se(sd_bus_message_append(m, "sb", scope, true) >= 0);
assert_se(sd_bus_message_open_container(m, 'a', "(sv)") >= 0);
assert_se(sd_bus_message_append(m, "(sv)", "MemoryHigh", "t", mcurrent + (15 * 1024 * 1024)) >= 0);
assert_se(sd_bus_message_append(m, "(sv)", "MemoryMax", "t", mcurrent + (50 * 1024 * 1024)) >= 0);
assert_se(sd_bus_message_close_container(m) >= 0);
assert_se(sd_bus_call(bus, m, 0, NULL, NULL) >= 0);
/* Generate some memory allocations via mempool */
#define NN (1024)
Hashmap **h = new(Hashmap*, NN);
for (int i = 0; i < NN; i++)
h[i] = hashmap_new(NULL);
for (int i = 0; i < NN; i++)
hashmap_free(h[i]);
free(h);
/* Now start eating memory */
assert_se(write(pipe_fd[1], &(const char) { 'x' }, 1) == 1);
assert_se(sd_event_loop(e) >= 0);
int ex = 0;
assert_se(sd_event_get_exit_code(e, &ex) >= 0);
assert_se(ex == 31);
}
static int outro(void) {
hashmap_trim_pools();
return 0;
}
DEFINE_TEST_MAIN_FULL(LOG_DEBUG, NULL, outro);
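The fake pressure test above sets `MEMORY_PRESSURE_WRITE` to `aGVsbG8K`, and the helper thread then expects to read `hello` from the socket. The sketch below shows that relationship with a minimal base64 decoder. It is an illustration only: `base64_decode_sketch()` is a hypothetical helper, not the decoder systemd uses, and it assumes the standard base64 alphabet.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static const char base64_alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Hypothetical helper: decodes standard base64 from 'in' into 'out'.
 * Returns the number of decoded bytes, or -1 on invalid input or if 'out' is too small. */
static int base64_decode_sketch(const char *in, unsigned char *out, size_t out_size) {
        unsigned acc = 0; /* bit accumulator, only the low bits are ever used */
        int bits = 0;     /* number of not yet emitted bits in 'acc' */
        size_t n = 0;

        assert(in && out);

        for (; *in && *in != '='; in++) {
                const char *p = strchr(base64_alphabet, *in);
                if (!p)
                        return -1;

                acc = (acc << 6) | (unsigned) (p - base64_alphabet);
                bits += 6;

                if (bits >= 8) {
                        bits -= 8;
                        if (n >= out_size)
                                return -1;
                        out[n++] = (unsigned char) ((acc >> bits) & 0xFF);
                }
        }

        return (int) n;
}
```

Decoding `aGVsbG8K` yields the six bytes `hello\n`. The test thread reads only `STRLEN("hello")` bytes, which is why it compares against `hello` without the trailing newline.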

src/test/test-pressure.c (new file, 603 lines)

@ -0,0 +1,603 @@
/* SPDX-License-Identifier: LGPL-2.1-or-later */
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include "sd-bus.h"
#include "sd-event.h"
#include "bus-locator.h"
#include "bus-wait-for-jobs.h"
#include "event-util.h"
#include "fd-util.h"
#include "format-util.h"
#include "hashmap.h"
#include "path-util.h"
#include "pidref.h"
#include "process-util.h"
#include "random-util.h"
#include "rm-rf.h"
#include "signal-util.h"
#include "socket-util.h"
#include "tests.h"
#include "time-util.h"
#include "tmpfile-util.h"
#include "unit-def.h"
/* Shared infrastructure for fake pressure tests */
struct fake_pressure_context {
int fifo_fd;
int socket_fd;
};
static void *fake_pressure_thread(void *p) {
_cleanup_free_ struct fake_pressure_context *c = ASSERT_PTR(p);
_cleanup_close_ int cfd = -EBADF;
usleep_safe(150);
ASSERT_EQ(write(c->fifo_fd, &(const char) { 'x' }, 1), 1);
usleep_safe(150);
cfd = accept4(c->socket_fd, NULL, NULL, SOCK_CLOEXEC);
ASSERT_OK_ERRNO(cfd);
char buf[STRLEN("hello")+1] = {};
ASSERT_EQ(read(cfd, buf, sizeof(buf)-1), (ssize_t) (sizeof(buf)-1));
ASSERT_STREQ(buf, "hello");
ASSERT_EQ(write(cfd, &(const char) { 'z' }, 1), 1);
return NULL;
}
static int fake_pressure_callback(sd_event_source *s, void *userdata) {
int *value = userdata;
const char *d;
ASSERT_NOT_NULL(s);
ASSERT_OK(sd_event_source_get_description(s, &d));
*value *= d[0];
log_notice("pressure event: %s", d);
if (*value == 7 * 'f' * 's')
ASSERT_OK(sd_event_exit(sd_event_source_get_event(s), 0));
return 0;
}
typedef int (*event_add_pressure_t)(sd_event *, sd_event_source **, sd_event_handler_t, void *);
static void test_fake_pressure(
const char *resource,
event_add_pressure_t add_pressure) {
_cleanup_(sd_event_source_unrefp) sd_event_source *es = NULL, *ef = NULL;
_cleanup_(sd_event_unrefp) sd_event *e = NULL;
_cleanup_(rm_rf_physical_and_freep) char *tmp = NULL;
_cleanup_close_ int fifo_fd = -EBADF, socket_fd = -EBADF;
union sockaddr_union sa;
pthread_t th;
int value = 7;
_cleanup_free_ char *resource_upper = ASSERT_NOT_NULL(strdup(resource));
ascii_strupper(resource_upper);
_cleanup_free_ char *env_watch = ASSERT_NOT_NULL(strjoin(resource_upper, "_PRESSURE_WATCH")),
*env_write = ASSERT_NOT_NULL(strjoin(resource_upper, "_PRESSURE_WRITE"));
ASSERT_OK(sd_event_default(&e));
ASSERT_OK(mkdtemp_malloc(NULL, &tmp));
_cleanup_free_ char *j = ASSERT_NOT_NULL(path_join(tmp, "fifo"));
ASSERT_OK_ERRNO(mkfifo(j, 0600));
fifo_fd = open(j, O_CLOEXEC|O_RDWR|O_NONBLOCK);
ASSERT_OK_ERRNO(fifo_fd);
_cleanup_free_ char *k = ASSERT_NOT_NULL(path_join(tmp, "sock"));
socket_fd = socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0);
ASSERT_OK_ERRNO(socket_fd);
ASSERT_OK(sockaddr_un_set_path(&sa.un, k));
ASSERT_OK_ERRNO(bind(socket_fd, &sa.sa, sockaddr_un_len(&sa.un)));
ASSERT_OK_ERRNO(listen(socket_fd, 1));
/* Ideally we'd just allocate this on the stack, but AddressSanitizer doesn't like it if threads
* access each other's stack */
struct fake_pressure_context *fp = new(struct fake_pressure_context, 1);
ASSERT_NOT_NULL(fp);
*fp = (struct fake_pressure_context) {
.fifo_fd = fifo_fd,
.socket_fd = socket_fd,
};
ASSERT_EQ(pthread_create(&th, NULL, fake_pressure_thread, TAKE_PTR(fp)), 0);
ASSERT_OK_ERRNO(setenv(env_watch, j, /* override= */ true));
ASSERT_OK_ERRNO(unsetenv(env_write));
ASSERT_OK(add_pressure(e, &es, fake_pressure_callback, &value));
ASSERT_OK(sd_event_source_set_description(es, "fifo event source"));
ASSERT_OK_ERRNO(setenv(env_watch, k, /* override= */ true));
ASSERT_OK_ERRNO(setenv(env_write, "aGVsbG8K", /* override= */ true));
ASSERT_OK(add_pressure(e, &ef, fake_pressure_callback, &value));
ASSERT_OK(sd_event_source_set_description(ef, "socket event source"));
ASSERT_OK(sd_event_loop(e));
ASSERT_EQ(value, 7 * 'f' * 's');
ASSERT_EQ(pthread_join(th, NULL), 0);
}
static int fake_pressure_wrapper(sd_event *e, sd_event_source **ret, sd_event_handler_t callback, void *userdata) {
return sd_event_add_memory_pressure(e, ret, callback, userdata);
}
TEST(fake_memory_pressure) {
test_fake_pressure("memory", fake_pressure_wrapper);
}
static int fake_cpu_pressure_wrapper(sd_event *e, sd_event_source **ret, sd_event_handler_t callback, void *userdata) {
return sd_event_add_cpu_pressure(e, ret, callback, userdata);
}
TEST(fake_cpu_pressure) {
test_fake_pressure("cpu", fake_cpu_pressure_wrapper);
}
static int fake_io_pressure_wrapper(sd_event *e, sd_event_source **ret, sd_event_handler_t callback, void *userdata) {
return sd_event_add_io_pressure(e, ret, callback, userdata);
}
TEST(fake_io_pressure) {
test_fake_pressure("io", fake_io_pressure_wrapper);
}
/* Shared infrastructure for real pressure tests */
struct real_pressure_context {
sd_event_source *pid;
};
static int real_pressure_child_callback(sd_event_source *s, const siginfo_t *si, void *userdata) {
ASSERT_NOT_NULL(s);
ASSERT_NOT_NULL(si);
log_notice("child dead");
ASSERT_EQ(si->si_signo, SIGCHLD);
ASSERT_EQ(si->si_status, SIGKILL);
ASSERT_EQ(si->si_code, CLD_KILLED);
ASSERT_OK(sd_event_exit(sd_event_source_get_event(s), 31));
return 0;
}
/* Memory pressure real test */
static int real_memory_pressure_callback(sd_event_source *s, void *userdata) {
struct real_pressure_context *c = ASSERT_PTR(userdata);
const char *d;
ASSERT_NOT_NULL(s);
ASSERT_OK(sd_event_source_get_description(s, &d));
log_notice("real memory pressure event: %s", d);
sd_event_trim_memory();
ASSERT_NOT_NULL(c->pid);
ASSERT_OK(sd_event_source_send_child_signal(c->pid, SIGKILL, NULL, 0));
c->pid = NULL;
return 0;
}
#define MMAP_SIZE (10 * 1024 * 1024)
_noreturn_ static void real_pressure_eat_memory(int pipe_fd) {
size_t ate = 0;
/* Allocates and touches 10M at a time, until it runs out of memory */
char x;
ASSERT_EQ(read(pipe_fd, &x, 1), 1); /* Wait for the GO! */
for (;;) {
void *p;
p = mmap(NULL, MMAP_SIZE, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
ASSERT_TRUE(p != MAP_FAILED);
log_info("Eating another %s.", FORMAT_BYTES(MMAP_SIZE));
memset(p, random_u32() & 0xFF, MMAP_SIZE);
ate += MMAP_SIZE;
log_info("Ate %s in total.", FORMAT_BYTES(ate));
usleep_safe(50 * USEC_PER_MSEC);
}
}
TEST(real_memory_pressure) {
_cleanup_(sd_bus_message_unrefp) sd_bus_message *m = NULL, *reply = NULL;
_cleanup_(sd_bus_error_free) sd_bus_error error = SD_BUS_ERROR_NULL;
_cleanup_(sd_event_source_unrefp) sd_event_source *es = NULL, *cs = NULL;
_cleanup_(bus_wait_for_jobs_freep) BusWaitForJobs *w = NULL;
_cleanup_(sd_bus_flush_close_unrefp) sd_bus *bus = NULL;
_cleanup_close_pair_ int pipe_fd[2] = EBADF_PAIR;
_cleanup_(sd_event_unrefp) sd_event *e = NULL;
_cleanup_free_ char *scope = NULL;
const char *object;
int r;
if (getuid() == 0)
r = sd_bus_open_system(&bus);
else
r = sd_bus_open_user(&bus);
if (r < 0)
return (void) log_tests_skipped_errno(r, "can't connect to bus");
ASSERT_OK(bus_wait_for_jobs_new(bus, &w));
ASSERT_OK(bus_message_new_method_call(bus, &m, bus_systemd_mgr, "StartTransientUnit"));
ASSERT_OK(asprintf(&scope, "test-%" PRIu64 ".scope", random_u64()));
ASSERT_OK(sd_bus_message_append(m, "ss", scope, "fail"));
ASSERT_OK(sd_bus_message_open_container(m, 'a', "(sv)"));
ASSERT_OK(sd_bus_message_append(m, "(sv)", "PIDs", "au", 1, 0));
ASSERT_OK(sd_bus_message_append(m, "(sv)", "MemoryAccounting", "b", true));
ASSERT_OK(sd_bus_message_close_container(m));
ASSERT_OK(sd_bus_message_append(m, "a(sa(sv))", 0));
r = sd_bus_call(bus, m, 0, &error, &reply);
if (r < 0)
return (void) log_tests_skipped_errno(r, "can't issue transient unit call");
ASSERT_OK(sd_bus_message_read(reply, "o", &object));
ASSERT_OK(bus_wait_for_jobs_one(w, object, /* flags= */ BUS_WAIT_JOBS_LOG_ERROR, /* extra_args= */ NULL));
ASSERT_OK(sd_event_default(&e));
ASSERT_OK_ERRNO(pipe2(pipe_fd, O_CLOEXEC));
_cleanup_(pidref_done) PidRef pidref = PIDREF_NULL;
r = pidref_safe_fork("(eat-memory)", FORK_RESET_SIGNALS|FORK_DEATHSIG_SIGTERM, &pidref);
ASSERT_OK(r);
if (r == 0) {
real_pressure_eat_memory(pipe_fd[0]);
_exit(EXIT_SUCCESS);
}
ASSERT_OK(event_add_child_pidref(e, &cs, &pidref, WEXITED, real_pressure_child_callback, NULL));
ASSERT_OK(sd_event_source_set_child_process_own(cs, true));
ASSERT_OK_ERRNO(unsetenv("MEMORY_PRESSURE_WATCH"));
ASSERT_OK_ERRNO(unsetenv("MEMORY_PRESSURE_WRITE"));
struct real_pressure_context context = {
.pid = cs,
};
r = sd_event_add_memory_pressure(e, &es, real_memory_pressure_callback, &context);
if (r < 0)
return (void) log_tests_skipped_errno(r, "can't allocate memory pressure fd");
ASSERT_OK(sd_event_source_set_description(es, "real pressure event source"));
ASSERT_OK_ZERO(sd_event_source_set_memory_pressure_type(es, "some"));
ASSERT_OK_POSITIVE(sd_event_source_set_memory_pressure_type(es, "full"));
ASSERT_OK_ZERO(sd_event_source_set_memory_pressure_type(es, "full"));
ASSERT_OK_POSITIVE(sd_event_source_set_memory_pressure_type(es, "some"));
ASSERT_OK_ZERO(sd_event_source_set_memory_pressure_type(es, "some"));
/* Unprivileged writes require a window of at least 2s, otherwise the kernel will refuse the write. */
ASSERT_OK_POSITIVE(sd_event_source_set_memory_pressure_period(es, 70 * USEC_PER_MSEC, 2 * USEC_PER_SEC));
ASSERT_OK_ZERO(sd_event_source_set_memory_pressure_period(es, 70 * USEC_PER_MSEC, 2 * USEC_PER_SEC));
ASSERT_OK(sd_event_source_set_enabled(es, SD_EVENT_ONESHOT));
_cleanup_free_ char *uo = NULL;
ASSERT_NOT_NULL(uo = unit_dbus_path_from_name(scope));
uint64_t mcurrent = UINT64_MAX;
ASSERT_OK(sd_bus_get_property_trivial(bus, "org.freedesktop.systemd1", uo, "org.freedesktop.systemd1.Scope", "MemoryCurrent", &error, 't', &mcurrent));
printf("current: %" PRIu64 "\n", mcurrent);
if (mcurrent == UINT64_MAX)
return (void) log_tests_skipped("memory accounting not available");
m = sd_bus_message_unref(m);
ASSERT_OK(bus_message_new_method_call(bus, &m, bus_systemd_mgr, "SetUnitProperties"));
ASSERT_OK(sd_bus_message_append(m, "sb", scope, true));
ASSERT_OK(sd_bus_message_open_container(m, 'a', "(sv)"));
ASSERT_OK(sd_bus_message_append(m, "(sv)", "MemoryHigh", "t", mcurrent + (15 * 1024 * 1024)));
ASSERT_OK(sd_bus_message_append(m, "(sv)", "MemoryMax", "t", mcurrent + (50 * 1024 * 1024)));
ASSERT_OK(sd_bus_message_close_container(m));
ASSERT_OK(sd_bus_call(bus, m, 0, NULL, NULL));
/* Generate some memory allocations via mempool */
#define NN (1024)
Hashmap **h = new(Hashmap*, NN);
for (int i = 0; i < NN; i++)
h[i] = hashmap_new(NULL);
for (int i = 0; i < NN; i++)
hashmap_free(h[i]);
free(h);
/* Now start eating memory */
ASSERT_EQ(write(pipe_fd[1], &(const char) { 'x' }, 1), 1);
ASSERT_OK(sd_event_loop(e));
int ex = 0;
ASSERT_OK(sd_event_get_exit_code(e, &ex));
ASSERT_EQ(ex, 31);
}
/* CPU pressure real test */
static int real_cpu_pressure_callback(sd_event_source *s, void *userdata) {
struct real_pressure_context *c = ASSERT_PTR(userdata);
const char *d;
ASSERT_NOT_NULL(s);
ASSERT_OK(sd_event_source_get_description(s, &d));
log_notice("real cpu pressure event: %s", d);
ASSERT_NOT_NULL(c->pid);
ASSERT_OK(sd_event_source_send_child_signal(c->pid, SIGKILL, NULL, 0));
c->pid = NULL;
return 0;
}
_noreturn_ static void real_pressure_eat_cpu(int pipe_fd) {
char x;
ASSERT_EQ(read(pipe_fd, &x, 1), 1); /* Wait for the GO! */
/* Busy-loop to generate CPU pressure */
for (;;)
__asm__ volatile("" ::: "memory"); /* Prevent optimization */
}
TEST(real_cpu_pressure) {
_cleanup_(sd_bus_message_unrefp) sd_bus_message *m = NULL, *reply = NULL;
_cleanup_(sd_bus_error_free) sd_bus_error error = SD_BUS_ERROR_NULL;
_cleanup_(sd_event_source_unrefp) sd_event_source *es = NULL, *cs = NULL;
_cleanup_(bus_wait_for_jobs_freep) BusWaitForJobs *w = NULL;
_cleanup_(sd_bus_flush_close_unrefp) sd_bus *bus = NULL;
_cleanup_close_pair_ int pipe_fd[2] = EBADF_PAIR;
_cleanup_(sd_event_unrefp) sd_event *e = NULL;
_cleanup_free_ char *scope = NULL;
const char *object;
int r;
if (getuid() == 0)
r = sd_bus_open_system(&bus);
else
r = sd_bus_open_user(&bus);
if (r < 0)
return (void) log_tests_skipped_errno(r, "can't connect to bus");
ASSERT_OK(bus_wait_for_jobs_new(bus, &w));
ASSERT_OK(bus_message_new_method_call(bus, &m, bus_systemd_mgr, "StartTransientUnit"));
ASSERT_OK(asprintf(&scope, "test-%" PRIu64 ".scope", random_u64()));
ASSERT_OK(sd_bus_message_append(m, "ss", scope, "fail"));
ASSERT_OK(sd_bus_message_open_container(m, 'a', "(sv)"));
ASSERT_OK(sd_bus_message_append(m, "(sv)", "PIDs", "au", 1, 0));
ASSERT_OK(sd_bus_message_append(m, "(sv)", "CPUAccounting", "b", true));
ASSERT_OK(sd_bus_message_close_container(m));
ASSERT_OK(sd_bus_message_append(m, "a(sa(sv))", 0));
r = sd_bus_call(bus, m, 0, &error, &reply);
if (r < 0)
return (void) log_tests_skipped_errno(r, "can't issue transient unit call");
ASSERT_OK(sd_bus_message_read(reply, "o", &object));
ASSERT_OK(bus_wait_for_jobs_one(w, object, /* flags= */ BUS_WAIT_JOBS_LOG_ERROR, /* extra_args= */ NULL));
ASSERT_OK(sd_event_default(&e));
ASSERT_OK_ERRNO(pipe2(pipe_fd, O_CLOEXEC));
_cleanup_(pidref_done) PidRef pidref = PIDREF_NULL;
r = pidref_safe_fork("(eat-cpu)", FORK_RESET_SIGNALS|FORK_DEATHSIG_SIGTERM, &pidref);
ASSERT_OK(r);
if (r == 0) {
real_pressure_eat_cpu(pipe_fd[0]);
_exit(EXIT_SUCCESS);
}
ASSERT_OK(event_add_child_pidref(e, &cs, &pidref, WEXITED, real_pressure_child_callback, NULL));
ASSERT_OK(sd_event_source_set_child_process_own(cs, true));
ASSERT_OK_ERRNO(unsetenv("CPU_PRESSURE_WATCH"));
ASSERT_OK_ERRNO(unsetenv("CPU_PRESSURE_WRITE"));
struct real_pressure_context context = {
.pid = cs,
};
r = sd_event_add_cpu_pressure(e, &es, real_cpu_pressure_callback, &context);
if (r < 0)
return (void) log_tests_skipped_errno(r, "can't allocate cpu pressure fd");
ASSERT_OK(sd_event_source_set_description(es, "real pressure event source"));
ASSERT_OK_ZERO(sd_event_source_set_cpu_pressure_type(es, "some"));
/* Unprivileged writes require a window of at least 2s, otherwise the kernel will refuse the write. */
ASSERT_OK_POSITIVE(sd_event_source_set_cpu_pressure_period(es, 70 * USEC_PER_MSEC, 2 * USEC_PER_SEC));
ASSERT_OK_ZERO(sd_event_source_set_cpu_pressure_period(es, 70 * USEC_PER_MSEC, 2 * USEC_PER_SEC));
ASSERT_OK(sd_event_source_set_enabled(es, SD_EVENT_ONESHOT));
m = sd_bus_message_unref(m);
ASSERT_OK(bus_message_new_method_call(bus, &m, bus_systemd_mgr, "SetUnitProperties"));
ASSERT_OK(sd_bus_message_append(m, "sb", scope, true));
ASSERT_OK(sd_bus_message_open_container(m, 'a', "(sv)"));
ASSERT_OK(sd_bus_message_append(m, "(sv)", "CPUQuotaPerSecUSec", "t", (uint64_t) 1000)); /* 0.1% CPU */
ASSERT_OK(sd_bus_message_close_container(m));
ASSERT_OK(sd_bus_call(bus, m, 0, NULL, NULL));
/* Now start eating CPU */
ASSERT_EQ(write(pipe_fd[1], &(const char) { 'x' }, 1), 1);
ASSERT_OK(sd_event_loop(e));
int ex = 0;
ASSERT_OK(sd_event_get_exit_code(e, &ex));
ASSERT_EQ(ex, 31);
}
/* IO pressure real test */
static int real_io_pressure_callback(sd_event_source *s, void *userdata) {
struct real_pressure_context *c = ASSERT_PTR(userdata);
const char *d;
ASSERT_NOT_NULL(s);
ASSERT_OK(sd_event_source_get_description(s, &d));
log_notice("real io pressure event: %s", d);
ASSERT_NOT_NULL(c->pid);
ASSERT_OK(sd_event_source_send_child_signal(c->pid, SIGKILL, NULL, 0));
c->pid = NULL;
return 0;
}
_noreturn_ static void real_pressure_eat_io(int pipe_fd) {
char x;
ASSERT_EQ(read(pipe_fd, &x, 1), 1); /* Wait for the GO! */
/* Write and fsync in a loop to generate IO pressure */
for (;;) {
_cleanup_close_ int fd = -EBADF;
fd = open("/var/tmp/.io-pressure-test", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0600);
if (fd < 0)
continue;
char buf[4096];
memset(buf, 'x', sizeof(buf));
for (int i = 0; i < 256; i++)
if (write(fd, buf, sizeof(buf)) < 0)
break;
(void) fsync(fd);
}
}
TEST(real_io_pressure) {
_cleanup_(sd_bus_message_unrefp) sd_bus_message *m = NULL, *reply = NULL;
_cleanup_(sd_bus_error_free) sd_bus_error error = SD_BUS_ERROR_NULL;
_cleanup_(sd_event_source_unrefp) sd_event_source *es = NULL, *cs = NULL;
_cleanup_(bus_wait_for_jobs_freep) BusWaitForJobs *w = NULL;
_cleanup_(sd_bus_flush_close_unrefp) sd_bus *bus = NULL;
_cleanup_close_pair_ int pipe_fd[2] = EBADF_PAIR;
_cleanup_(sd_event_unrefp) sd_event *e = NULL;
_cleanup_free_ char *scope = NULL;
const char *object;
int r;
if (getuid() == 0)
r = sd_bus_open_system(&bus);
else
r = sd_bus_open_user(&bus);
if (r < 0)
return (void) log_tests_skipped_errno(r, "can't connect to bus");
ASSERT_OK(bus_wait_for_jobs_new(bus, &w));
ASSERT_OK(bus_message_new_method_call(bus, &m, bus_systemd_mgr, "StartTransientUnit"));
ASSERT_OK(asprintf(&scope, "test-%" PRIu64 ".scope", random_u64()));
ASSERT_OK(sd_bus_message_append(m, "ss", scope, "fail"));
ASSERT_OK(sd_bus_message_open_container(m, 'a', "(sv)"));
ASSERT_OK(sd_bus_message_append(m, "(sv)", "PIDs", "au", 1, 0));
ASSERT_OK(sd_bus_message_append(m, "(sv)", "IOAccounting", "b", true));
ASSERT_OK(sd_bus_message_close_container(m));
ASSERT_OK(sd_bus_message_append(m, "a(sa(sv))", 0));
r = sd_bus_call(bus, m, 0, &error, &reply);
if (r < 0)
return (void) log_tests_skipped_errno(r, "can't issue transient unit call");
ASSERT_OK(sd_bus_message_read(reply, "o", &object));
ASSERT_OK(bus_wait_for_jobs_one(w, object, /* flags= */ BUS_WAIT_JOBS_LOG_ERROR, /* extra_args= */ NULL));
ASSERT_OK(sd_event_default(&e));
ASSERT_OK_ERRNO(pipe2(pipe_fd, O_CLOEXEC));
_cleanup_(pidref_done) PidRef pidref = PIDREF_NULL;
r = pidref_safe_fork("(eat-io)", FORK_RESET_SIGNALS|FORK_DEATHSIG_SIGTERM, &pidref);
ASSERT_OK(r);
if (r == 0) {
real_pressure_eat_io(pipe_fd[0]);
_exit(EXIT_SUCCESS);
}
ASSERT_OK(event_add_child_pidref(e, &cs, &pidref, WEXITED, real_pressure_child_callback, NULL));
ASSERT_OK(sd_event_source_set_child_process_own(cs, true));
ASSERT_OK_ERRNO(unsetenv("IO_PRESSURE_WATCH"));
ASSERT_OK_ERRNO(unsetenv("IO_PRESSURE_WRITE"));
struct real_pressure_context context = {
.pid = cs,
};
r = sd_event_add_io_pressure(e, &es, real_io_pressure_callback, &context);
if (r < 0)
return (void) log_tests_skipped_errno(r, "can't allocate io pressure fd");
ASSERT_OK(sd_event_source_set_description(es, "real pressure event source"));
ASSERT_OK_ZERO(sd_event_source_set_io_pressure_type(es, "some"));
/* Unprivileged writes require a window of at least 2s, otherwise the kernel will refuse the write. */
ASSERT_OK_POSITIVE(sd_event_source_set_io_pressure_period(es, 70 * USEC_PER_MSEC, 2 * USEC_PER_SEC));
ASSERT_OK_ZERO(sd_event_source_set_io_pressure_period(es, 70 * USEC_PER_MSEC, 2 * USEC_PER_SEC));
ASSERT_OK(sd_event_source_set_enabled(es, SD_EVENT_ONESHOT));
m = sd_bus_message_unref(m);
ASSERT_OK(bus_message_new_method_call(bus, &m, bus_systemd_mgr, "SetUnitProperties"));
ASSERT_OK(sd_bus_message_append(m, "sb", scope, true));
ASSERT_OK(sd_bus_message_open_container(m, 'a', "(sv)"));
ASSERT_OK(sd_bus_message_open_container(m, 'r', "sv"));
ASSERT_OK(sd_bus_message_append(m, "s", "IOWriteBandwidthMax"));
ASSERT_OK(sd_bus_message_open_container(m, 'v', "a(st)"));
ASSERT_OK(sd_bus_message_append(m, "a(st)", 1, "/var/tmp", (uint64_t) 1024*1024)); /* 1M/s */
ASSERT_OK(sd_bus_message_close_container(m));
ASSERT_OK(sd_bus_message_close_container(m));
ASSERT_OK(sd_bus_message_close_container(m));
ASSERT_OK(sd_bus_call(bus, m, 0, NULL, NULL));
/* Now start eating IO */
ASSERT_EQ(write(pipe_fd[1], &(const char) { 'x' }, 1), 1);
ASSERT_OK(sd_event_loop(e));
int ex = 0;
ASSERT_OK(sd_event_get_exit_code(e, &ex));
ASSERT_EQ(ex, 31);
}
static int outro(void) {
(void) unlink("/var/tmp/.io-pressure-test");
hashmap_trim_pools();
return 0;
}
DEFINE_TEST_MAIN_FULL(LOG_DEBUG, NULL, outro);
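The exit condition in `fake_pressure_callback()` can be read as a small piece of arithmetic: the counter starts at 7 and is multiplied by the first character of each event source description (`'f'` for "fifo event source", `'s'` for "socket event source"), and the loop exits once both factors are present. The sketch below isolates that step. `fake_pressure_step()` is a hypothetical name used only for illustration, and it assumes each source fires exactly once.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: applies one callback invocation to the counter and
 * reports whether the exit condition of the test has been reached. */
static bool fake_pressure_step(int *value, const char *description) {
        assert(value && description);

        *value *= description[0];

        /* Both sources have fired once when the product holds both factors */
        return *value == 7 * 'f' * 's';
}
```

Because multiplication is commutative, the check does not depend on which of the two sources fires first.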


@ -168,7 +168,7 @@ static int dev_if_packed_info(sd_device *dev, char *ifs_str, size_t len) {
         desc = (struct usb_interface_descriptor *) (buf + pos);
         if (desc->bLength < 3)
                 break;
-        if (desc->bLength > size - sizeof(struct usb_interface_descriptor))
+        if (desc->bLength > (size_t) size - pos)
                 return log_device_debug_errno(dev, SYNTHETIC_ERRNO(EIO),
                                               "Corrupt data read from \"%s\"", filename);
         pos += desc->bLength;
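The changed condition compares the descriptor length against the bytes remaining after the current position rather than against a fixed offset from the total size. The following is a simplified sketch of that idea, not the actual udev code: `count_descriptors()` is a hypothetical function, and the first byte of each record stands in for `bLength`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: walks length-prefixed records in buf[0..size).
 * Returns the number of complete records, or -1 if a record claims to
 * extend past the end of the data that was read. */
static int count_descriptors(const uint8_t *buf, size_t size) {
        size_t pos = 0;
        int n = 0;

        assert(buf || size == 0);

        while (pos < size) {
                uint8_t len = buf[pos]; /* plays the role of bLength */

                if (len < 3) /* too short to be valid, stop parsing */
                        break;

                /* Compare against what is left in the buffer, not against the total size */
                if (len > size - pos)
                        return -1;

                pos += len;
                n++;
        }

        return n;
}
```

Since `pos < size` holds inside the loop, `size - pos` cannot underflow, so the comparison stays meaningful for every record.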


@ -91,7 +91,7 @@ foreach dirname : [
         'TEST-74-AUX-UTILS',
         'TEST-75-RESOLVED',
         'TEST-78-SIGQUEUE',
-        'TEST-79-MEMPRESS',
+        'TEST-79-PRESSURE',
         'TEST-80-NOTIFYACCESS',
         'TEST-81-GENERATORS',
         'TEST-82-SOFTREBOOT',


@ -17,6 +17,13 @@ EOF
 systemctl reset-failed systemd-journald.service
 for c in NONE XZ LZ4 ZSTD; do
+    # compression_to_string() returns "uncompressed" for COMPRESSION_NONE
+    if [[ "${c}" == NONE ]]; then
+        log_name="uncompressed"
+    else
+        log_name="${c,,}"
+    fi
     cat >/run/systemd/system/systemd-journald.service.d/compress.conf <<EOF
 [Service]
 Environment=SYSTEMD_JOURNAL_COMPRESS=${c}
@ -28,14 +35,20 @@ EOF
     ID="$(systemd-id128 new)"
     systemd-cat -t "$ID" bash -c "for ((i=0;i<100;i++)); do echo -n hoge with ${c}; done; echo"
     journalctl --sync
-    timeout 10 bash -c "until SYSTEMD_LOG_LEVEL=debug journalctl --verify --quiet --file /var/log/journal/$MACHINE_ID/system.journal 2>&1 | grep -F 'compress=${c}' >/dev/null; do sleep .5; done"
+    timeout 10 bash -c "until SYSTEMD_LOG_LEVEL=debug journalctl --verify --quiet --file /var/log/journal/$MACHINE_ID/system.journal 2>&1 | grep -F 'compress=${log_name}' >/dev/null; do sleep .5; done"
     # $SYSTEMD_JOURNAL_COMPRESS= also works for journal-remote
     if [[ -x /usr/lib/systemd/systemd-journal-remote ]]; then
         for cc in NONE XZ LZ4 ZSTD; do
+            if [[ "${cc}" == NONE ]]; then
+                cc_log_name="uncompressed"
+            else
+                cc_log_name="${cc,,}"
+            fi
             rm -f /tmp/foo.journal
             SYSTEMD_JOURNAL_COMPRESS="${cc}" /usr/lib/systemd/systemd-journal-remote --split-mode=none -o /tmp/foo.journal --getter="journalctl -b -o export -t $ID"
-            SYSTEMD_LOG_LEVEL=debug journalctl --verify --quiet --file /tmp/foo.journal 2>&1 | grep -F "compress=${cc}" >/dev/null
+            SYSTEMD_LOG_LEVEL=debug journalctl --verify --quiet --file /tmp/foo.journal 2>&1 | grep -F "compress=${cc_log_name}" >/dev/null
             journalctl -t "$ID" -o cat --file /tmp/foo.journal | grep -F "hoge with ${c}" >/dev/null
         done
     fi
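The name mapping introduced in this test (the setting `NONE` is logged as `uncompressed`, every other setting is logged in lowercase) can be sketched in C as follows. This only mirrors the shell logic of the test: `expected_log_name()` is a hypothetical helper and is not the `compression_to_string()` function the comment refers to.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes the name the test expects to find in the log
 * for a given SYSTEMD_JOURNAL_COMPRESS setting. Returns 'out', or NULL if
 * the buffer is too small. */
static const char *expected_log_name(const char *setting, char *out, size_t out_size) {
        assert(setting && out);

        /* "NONE" is the one setting whose log name is not just its lowercase form */
        const char *src = strcmp(setting, "NONE") == 0 ? "uncompressed" : setting;
        size_t n = strlen(src);

        if (n + 1 > out_size)
                return NULL;

        for (size_t i = 0; i < n; i++)
                out[i] = (char) tolower((unsigned char) src[i]);
        out[n] = '\0';

        return out;
}
```

With this mapping, `ZSTD` becomes `zstd` and `NONE` becomes `uncompressed`, matching the `compress=` strings the test greps for.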


@ -1,64 +0,0 @@
#!/usr/bin/env bash
# SPDX-License-Identifier: LGPL-2.1-or-later
set -ex
set -o pipefail
# We don't just test whether the file exists, but try to read from it, since if
# CONFIG_PSI_DEFAULT_DISABLED is set in the kernel the file will exist and can
# be opened, but any read()s will fail with EOPNOTSUPP, which we want to
# detect.
if ! cat /proc/pressure/memory >/dev/null ; then
echo "kernel too old, has no PSI." >&2
echo OK >/testok
exit 0
fi
CGROUP=/sys/fs/cgroup/"$(systemctl show TEST-79-MEMPRESS.service -P ControlGroup)"
test -d "$CGROUP"
if ! test -f "$CGROUP"/memory.pressure ; then
echo "No memory accounting/PSI delegated via cgroup, can't test." >&2
echo OK >/testok
exit 0
fi
UNIT="test-mempress-$RANDOM.service"
SCRIPT="/tmp/mempress-$RANDOM.sh"
cat >"$SCRIPT" <<'EOF'
#!/usr/bin/env bash
set -ex
export
id
test -n "$MEMORY_PRESSURE_WATCH"
test "$MEMORY_PRESSURE_WATCH" != /dev/null
test -w "$MEMORY_PRESSURE_WATCH"
ls -al "$MEMORY_PRESSURE_WATCH"
EXPECTED="$(echo -n -e "some 123000 2000000\x00" | base64)"
test "$EXPECTED" = "$MEMORY_PRESSURE_WRITE"
EOF
chmod +x "$SCRIPT"
systemd-run \
-u "$UNIT" \
-p Type=exec \
-p ProtectControlGroups=1 \
-p DynamicUser=1 \
-p MemoryPressureWatch=on \
-p MemoryPressureThresholdSec=123ms \
-p BindPaths=$SCRIPT \
`# Make sanitizers happy when DynamicUser=1 pulls in instrumented systemd NSS modules` \
-p EnvironmentFile=-/usr/lib/systemd/systemd-asan-env \
--wait "$SCRIPT"
rm "$SCRIPT"
touch /testok
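The `EXPECTED` value checked in the script is the base64 form of a kernel PSI trigger string: the stall type, the stall threshold in microseconds and the window in microseconds, followed by a NUL byte. With `MemoryPressureThresholdSec=123ms` the threshold is 123000 µs, and the script expects a window of 2000000 µs. A minimal sketch of building that string is shown below. `format_psi_trigger()` is a hypothetical helper for illustration, not the function systemd uses.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: formats "<type> <threshold_usec> <window_usec>" into 'out'.
 * Returns the number of bytes to write including the trailing NUL, or -1 if
 * the buffer is too small. */
static int format_psi_trigger(char *out, size_t out_size, const char *type,
                              uint64_t threshold_usec, uint64_t window_usec) {
        assert(out && type);

        int n = snprintf(out, out_size, "%s %" PRIu64 " %" PRIu64,
                         type, threshold_usec, window_usec);
        if (n < 0 || (size_t) n >= out_size)
                return -1;

        return n + 1; /* the script's expected value includes the NUL byte */
}
```

Calling it with `"some"`, `123 * 1000` and `2 * 1000 * 1000` produces `some 123000 2000000`, the payload that the script base64-encodes together with its trailing NUL.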

test/units/TEST-79-PRESSURE.sh (new executable file, 170 lines)

@ -0,0 +1,170 @@
#!/usr/bin/env bash
# SPDX-License-Identifier: LGPL-2.1-or-later
set -ex
set -o pipefail
# We don't just test whether the file exists, but try to read from it, since if
# CONFIG_PSI_DEFAULT_DISABLED is set in the kernel the file will exist and can
# be opened, but any read()s will fail with EOPNOTSUPP, which we want to
# detect.
if ! cat /proc/pressure/memory >/dev/null ; then
echo "kernel too old, has no PSI." >&2
echo OK >/testok
exit 0
fi
CGROUP=/sys/fs/cgroup/"$(systemctl show TEST-79-PRESSURE.service -P ControlGroup)"
test -d "$CGROUP"
if ! test -f "$CGROUP"/memory.pressure ; then
echo "No memory accounting/PSI delegated via cgroup, can't test." >&2
echo OK >/testok
exit 0
fi
UNIT="test-mempress-$RANDOM.service"
SCRIPT="/tmp/mempress-$RANDOM.sh"
cat >"$SCRIPT" <<'EOF'
#!/usr/bin/env bash
set -ex
export
id
test -n "$MEMORY_PRESSURE_WATCH"
test "$MEMORY_PRESSURE_WATCH" != /dev/null
test -w "$MEMORY_PRESSURE_WATCH"
ls -al "$MEMORY_PRESSURE_WATCH"
EXPECTED="$(echo -n -e "some 123000 2000000\x00" | base64)"
test "$EXPECTED" = "$MEMORY_PRESSURE_WRITE"
EOF
chmod +x "$SCRIPT"
systemd-run \
-u "$UNIT" \
-p Type=exec \
-p ProtectControlGroups=1 \
-p DynamicUser=1 \
-p MemoryPressureWatch=on \
-p MemoryPressureThresholdSec=123ms \
-p BindPaths="$SCRIPT" \
`# Make sanitizers happy when DynamicUser=1 pulls in instrumented systemd NSS modules` \
-p EnvironmentFile=-/usr/lib/systemd/systemd-asan-env \
--wait "$SCRIPT"
rm "$SCRIPT"
# Now test CPU pressure
if ! cat /proc/pressure/cpu >/dev/null ; then
echo "kernel has no CPU PSI support." >&2
echo OK >/testok
exit 0
fi
if ! test -f "$CGROUP"/cpu.pressure ; then
echo "No CPU accounting/PSI delegated via cgroup, can't test." >&2
echo OK >/testok
exit 0
fi
UNIT="test-cpupress-$RANDOM.service"
SCRIPT="/tmp/cpupress-$RANDOM.sh"
cat >"$SCRIPT" <<'EOF'
#!/usr/bin/env bash
set -ex
export
id
test -n "$CPU_PRESSURE_WATCH"
test "$CPU_PRESSURE_WATCH" != /dev/null
test -w "$CPU_PRESSURE_WATCH"
ls -al "$CPU_PRESSURE_WATCH"
EXPECTED="$(echo -n -e "some 123000 2000000\x00" | base64)"
test "$EXPECTED" = "$CPU_PRESSURE_WRITE"
EOF
chmod +x "$SCRIPT"
systemd-run \
-u "$UNIT" \
-p Type=exec \
-p ProtectControlGroups=1 \
-p DynamicUser=1 \
-p CPUPressureWatch=on \
-p CPUPressureThresholdSec=123ms \
-p BindPaths="$SCRIPT" \
`# Make sanitizers happy when DynamicUser=1 pulls in instrumented systemd NSS modules` \
-p EnvironmentFile=-/usr/lib/systemd/systemd-asan-env \
--wait "$SCRIPT"
rm "$SCRIPT"
# Now test IO pressure
if ! cat /proc/pressure/io >/dev/null ; then
echo "kernel has no IO PSI support." >&2
echo OK >/testok
exit 0
fi
if ! test -f "$CGROUP"/io.pressure ; then
echo "No IO accounting/PSI delegated via cgroup, can't test." >&2
echo OK >/testok
exit 0
fi
UNIT="test-iopress-$RANDOM.service"
SCRIPT="/tmp/iopress-$RANDOM.sh"
cat >"$SCRIPT" <<'EOF'
#!/usr/bin/env bash
set -ex
export
id
test -n "$IO_PRESSURE_WATCH"
test "$IO_PRESSURE_WATCH" != /dev/null
test -w "$IO_PRESSURE_WATCH"
ls -al "$IO_PRESSURE_WATCH"
EXPECTED="$(echo -n -e "some 123000 2000000\x00" | base64)"
test "$EXPECTED" = "$IO_PRESSURE_WRITE"
EOF
chmod +x "$SCRIPT"
systemd-run \
-u "$UNIT" \
-p Type=exec \
-p ProtectControlGroups=1 \
-p DynamicUser=1 \
-p IOPressureWatch=on \
-p IOPressureThresholdSec=123ms \
-p BindPaths="$SCRIPT" \
`# Make sanitizers happy when DynamicUser=1 pulls in instrumented systemd NSS modules` \
-p EnvironmentFile=-/usr/lib/systemd/systemd-asan-env \
--wait "$SCRIPT"
rm "$SCRIPT"
touch /testok
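
The three inner scripts above all compare `$MEMORY_PRESSURE_WRITE`, `$CPU_PRESSURE_WRITE` or `$IO_PRESSURE_WRITE` against the same base64 string. The sketch below shows how that expected value appears to be built: the threshold and window, both in microseconds, are formatted as `some <threshold> <window>`, terminated with a NUL byte, then base64-encoded. The helper name `expected_pressure_write` is hypothetical and is not part of the systemd test suite. The 2s window is only inferred from the `2000000` in the test's expected string.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an illustration, not part of the test suite): build the
# base64 payload that the inner test scripts compare against $*_PRESSURE_WRITE.
#   $1: threshold in microseconds (e.g. 123ms -> 123000)
#   $2: window in microseconds    (e.g. 2s    -> 2000000)
expected_pressure_write() {
    # "some <threshold> <window>" followed by a NUL byte, then base64-encoded.
    printf 'some %s %s\0' "$1" "$2" | base64
}

# Usage: the value the tests expect for a 123ms threshold over a 2s window.
expected_pressure_write "$((123 * 1000))" "$((2 * 1000 * 1000))"
```

Decoding the output with `base64 -d` should give back the `some 123000 2000000` line plus the trailing NUL.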