mirror of https://github.com/systemd/systemd synced 2026-04-23 23:44:50 +02:00

Compare commits


20 Commits

Author SHA1 Message Date
Xiaotian Wu
0172289348 basic: update the Arch tuples for LoongArch 2022-03-26 00:29:38 +09:00
Yu Watanabe
dd2396f20d
Merge pull request #22861 from poettering/journald-sigterm
journald: don't let SIGTERM starve indefinitely
2022-03-26 00:27:47 +09:00
Daan De Meyer
e5531be27b
Merge pull request #22857 from poettering/journal-file-flags
journal: add flags param to journal_file_open(), replacing bools
2022-03-25 10:16:12 +01:00
Lennart Poettering
47f04c2a69 journal-file: if we are going down, don't use event loop to schedule post
The event loop is already shutting down, hence there's no point in using it
anymore; it's not going to run any further iterations.
2022-03-25 10:03:00 +01:00
Lennart Poettering
19252b2548 journald: make sure SIGTERM handling doesn't get starved out
Fixes: #22642
2022-03-25 10:03:00 +01:00
Lennart Poettering
e540c5a621 journal: don't talk about -1 in context of unsigned values 2022-03-25 09:59:09 +01:00
Lennart Poettering
49615dbd81 journal-file: merge compress/seal bool args into a single flags param
Just some modernization/refactoring.

No change in behaviour; just do things the way we do them these days: use a
flags param instead of a list of bools.
2022-03-25 09:59:09 +01:00
Lennart Poettering
6fb57abcfa
Merge pull request #22717 from yuwata/udev-lock-block-device-by-main-process
udev: do not skip events when device is already locked
2022-03-25 09:43:12 +01:00
Lennart Poettering
a35420d85d journal-remote: constify a few parameters 2022-03-25 09:21:38 +01:00
Lennart Poettering
88a19c7e04
Merge pull request #22859 from poettering/hardware-rename
machine-info: rename VENDOR=/MODEL= → HARDWARE_VENDOR=/HARDWARE_MODEL=
2022-03-25 09:12:14 +01:00
Frantisek Sumsal
41d1aaea64 test: use flock when calling mkfs.btrfs
As stated in https://github.com/systemd/systemd/issues/21819#issuecomment-1064377645,
`mkfs.btrfs` doesn't hold the lock on the device for its whole
duration, causing unexpected races and test failures. Let's
wrap the `mkfs.btrfs` calls in an flock wrapper to mitigate this.

Hopefully fixes: #21819
2022-03-25 10:28:07 +09:00
Frantisek Sumsal
ed1cbdc347 Revert "test: temporary workaround for #21819"
This reverts commit 95e35511bbdb7810c00c2e4a6cbda5b187192f74.
2022-03-25 10:28:07 +09:00
Yu Watanabe
82a5de9fd2 udev: assume block device is not locked when a new event is queued
Then, hopefully, previously requeued events are processed earlier.
2022-03-25 10:28:07 +09:00
Yu Watanabe
7b7959fba5 udev: split worker_lock_block_device() into two
This also makes the return value initialized when these functions return 0,
to follow our coding style.

Just a preparation for later commits.
2022-03-25 10:28:07 +09:00
Yu Watanabe
5d354e525a udev: requeue event when the corresponding block device is locked by another process
Previously, if a block device was locked by another process, the
corresponding worker skipped processing the event and did not
broadcast the uevent to libudev listeners. This caused several issues:

- While a device is locked by a process, if a user triggers an event
  with `udevadm trigger --settle`, it never returns.

- When there is a delay between close and unlock in a process, the
  synthesized events triggered by inotify may not be processed. This can
  happen easily when wrapping mkfs with flock, and causes severe issues,
  e.g. new devlinks are not created, or old devlinks are not removed.

This commit makes events be requeued with a tiny delay when the corresponding
block device is locked by another process. This way, the triggered
uevent may be delayed but is always processed by udevd, which solves the
above issues. Also, it is not necessary to watch a block device
unconditionally when it is already locked, hence that logic is dropped.
2022-03-25 10:28:03 +09:00
Yu Watanabe
0c3d8182c9 udev: store action in struct Event 2022-03-25 10:25:27 +09:00
Yu Watanabe
c17ab900cb udev: introduce device_broadcast() helper function 2022-03-25 10:25:26 +09:00
Yu Watanabe
c9473aaa5b udev: drop unnecessary clone of received sd-device object
The sd-device object received through sd-device-monitor is sealed, so
the corresponding udev database or uevent file will not be read.
2022-03-25 10:25:26 +09:00
Lennart Poettering
38639aa28f hostnamed: properly reset hw model/vendor props before re-reading them
Follow-up for 4fc7e4f374bf4401330e90e267227267abf1dcac
2022-03-24 21:29:13 +01:00
Lennart Poettering
0924ea2b26 machine-info: rename VENDOR=/MODEL= → HARDWARE_VENDOR=/HARDWARE_MODEL=
Let's be more precise here. Otherwise people might think this describes
the software system or similar. We already expose this via hostnamed as
HardwareVendor/HardwareModel, hence use the exact same wording.

(Note that the relevant props on the dmi device are just VENDOR/MODEL,
but that's OK given that DMI really is about hardware anyway,
unconditionally, hence no chance of confusion there.)

Follow-up for 4fc7e4f374bf4401330e90e267227267abf1dcac
2022-03-24 21:29:13 +01:00
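After this rename, overrides in /etc/machine-info use the new key names (the values below are illustrative, not defaults):

```ini
# /etc/machine-info (illustrative values)
HARDWARE_VENDOR="Example Vendor Inc."
HARDWARE_MODEL="Example Laptop 9000"
```

These override whatever hostnamed would otherwise glean from DMI or the hwdb.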
28 changed files with 546 additions and 275 deletions

NEWS
View File

@ -184,8 +184,8 @@ CHANGES WITH 251 in spe:
'portablectl attach --extension=' now also accepts directory paths.
* VENDOR= and MODEL= can be set in /etc/machine-info to override the
values gleaned from the hwdb.
* HARDWARE_VENDOR= and HARDWARE_MODEL= can be set in /etc/machine-info
to override the values gleaned from the hwdb.
* A ID_CHASSIS property can be set in the hwdb (for the DMI device
/sys/class/dmi/id) to override the chassis that is reported by

View File

@ -130,17 +130,19 @@
</varlistentry>
<varlistentry>
<term><varname>VENDOR=</varname></term>
<term><varname>HARDWARE_VENDOR=</varname></term>
<listitem><para>Specifies the hardware vendor. If unspecified, the hardware vendor set in DMI
or hwdb will be used.</para></listitem>
<listitem><para>Specifies the hardware vendor. If unspecified, the hardware vendor set in DMI or
<citerefentry><refentrytitle>hwdb</refentrytitle><manvolnum>7</manvolnum></citerefentry> will be
used.</para></listitem>
</varlistentry>
<varlistentry>
<term><varname>MODEL=</varname></term>
<term><varname>HARDWARE_MODEL=</varname></term>
<listitem><para>Specifies the hardware model. If unspecified, the hardware model set in DMI or
hwdb will be used.</para></listitem>
<citerefentry><refentrytitle>hwdb</refentrytitle><manvolnum>7</manvolnum></citerefentry> will be
used.</para></listitem>
</varlistentry>
</variablelist>
</refsect1>

View File

@ -199,9 +199,16 @@ int uname_architecture(void);
# define LIB_ARCH_TUPLE "sh4a-linux-gnu"
# endif
#elif defined(__loongarch64)
# pragma message "Please update the Arch tuple of loongarch64 after psABI is stable"
# define native_architecture() ARCHITECTURE_LOONGARCH64
# define LIB_ARCH_TUPLE "loongarch64-linux-gnu"
# define native_architecture() ARCHITECTURE_LOONGARCH64
# if defined(__loongarch_double_float)
# define LIB_ARCH_TUPLE "loongarch64-linux-gnuf64"
# elif defined(__loongarch_single_float)
# define LIB_ARCH_TUPLE "loongarch64-linux-gnuf32"
# elif defined(__loongarch_soft_float)
# define LIB_ARCH_TUPLE "loongarch64-linux-gnusf"
# else
# error "Unrecognized loongarch architecture variant"
# endif
#elif defined(__m68k__)
# define native_architecture() ARCHITECTURE_M68K
# define LIB_ARCH_TUPLE "m68k-linux-gnu"

View File

@ -53,8 +53,8 @@ typedef enum {
PROP_CHASSIS,
PROP_DEPLOYMENT,
PROP_LOCATION,
PROP_VENDOR,
PROP_MODEL,
PROP_HARDWARE_VENDOR,
PROP_HARDWARE_MODEL,
/* Read from /etc/os-release (or /usr/lib/os-release) */
PROP_OS_PRETTY_NAME,
@ -128,7 +128,9 @@ static void context_read_machine_info(Context *c) {
(UINT64_C(1) << PROP_ICON_NAME) |
(UINT64_C(1) << PROP_CHASSIS) |
(UINT64_C(1) << PROP_DEPLOYMENT) |
(UINT64_C(1) << PROP_LOCATION));
(UINT64_C(1) << PROP_LOCATION) |
(UINT64_C(1) << PROP_HARDWARE_VENDOR) |
(UINT64_C(1) << PROP_HARDWARE_MODEL));
r = parse_env_file(NULL, "/etc/machine-info",
"PRETTY_HOSTNAME", &c->data[PROP_PRETTY_HOSTNAME],
@ -136,8 +138,8 @@ static void context_read_machine_info(Context *c) {
"CHASSIS", &c->data[PROP_CHASSIS],
"DEPLOYMENT", &c->data[PROP_DEPLOYMENT],
"LOCATION", &c->data[PROP_LOCATION],
"VENDOR", &c->data[PROP_VENDOR],
"MODEL", &c->data[PROP_MODEL]);
"HARDWARE_VENDOR", &c->data[PROP_HARDWARE_VENDOR],
"HARDWARE_MODEL", &c->data[PROP_HARDWARE_MODEL]);
if (r < 0 && r != -ENOENT)
log_warning_errno(r, "Failed to read /etc/machine-info, ignoring: %m");
@ -563,7 +565,7 @@ static int property_get_hardware_property(
assert(reply);
assert(c);
assert(IN_SET(prop, PROP_VENDOR, PROP_MODEL));
assert(IN_SET(prop, PROP_HARDWARE_VENDOR, PROP_HARDWARE_MODEL));
assert(getter);
context_read_machine_info(c);
@ -583,7 +585,7 @@ static int property_get_hardware_vendor(
void *userdata,
sd_bus_error *error) {
return property_get_hardware_property(reply, userdata, PROP_VENDOR, get_hardware_vendor);
return property_get_hardware_property(reply, userdata, PROP_HARDWARE_VENDOR, get_hardware_vendor);
}
static int property_get_hardware_model(
@ -595,7 +597,7 @@ static int property_get_hardware_model(
void *userdata,
sd_bus_error *error) {
return property_get_hardware_property(reply, userdata, PROP_MODEL, get_hardware_model);
return property_get_hardware_property(reply, userdata, PROP_HARDWARE_MODEL, get_hardware_model);
}
static int property_get_hostname(
@ -1179,9 +1181,9 @@ static int method_describe(sd_bus_message *m, void *userdata, sd_bus_error *erro
assert_se(uname(&u) >= 0);
if (isempty(c->data[PROP_VENDOR]))
if (isempty(c->data[PROP_HARDWARE_VENDOR]))
(void) get_hardware_vendor(&vendor);
if (isempty(c->data[PROP_MODEL]))
if (isempty(c->data[PROP_HARDWARE_MODEL]))
(void) get_hardware_model(&model);
if (privileged) {
@ -1206,8 +1208,8 @@ static int method_describe(sd_bus_message *m, void *userdata, sd_bus_error *erro
JSON_BUILD_PAIR("OperatingSystemPrettyName", JSON_BUILD_STRING(c->data[PROP_OS_PRETTY_NAME])),
JSON_BUILD_PAIR("OperatingSystemCPEName", JSON_BUILD_STRING(c->data[PROP_OS_CPE_NAME])),
JSON_BUILD_PAIR("OperatingSystemHomeURL", JSON_BUILD_STRING(c->data[PROP_OS_HOME_URL])),
JSON_BUILD_PAIR("HardwareVendor", JSON_BUILD_STRING(vendor ?: c->data[PROP_VENDOR])),
JSON_BUILD_PAIR("HardwareModel", JSON_BUILD_STRING(model ?: c->data[PROP_MODEL])),
JSON_BUILD_PAIR("HardwareVendor", JSON_BUILD_STRING(vendor ?: c->data[PROP_HARDWARE_VENDOR])),
JSON_BUILD_PAIR("HardwareModel", JSON_BUILD_STRING(model ?: c->data[PROP_HARDWARE_MODEL])),
JSON_BUILD_PAIR("HardwareSerial", JSON_BUILD_STRING(serial)),
JSON_BUILD_PAIR_CONDITION(!sd_id128_is_null(product_uuid), "ProductUUID", JSON_BUILD_ID128(product_uuid)),
JSON_BUILD_PAIR_CONDITION(sd_id128_is_null(product_uuid), "ProductUUID", JSON_BUILD_NULL)));

View File

@ -43,7 +43,7 @@ int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
/* In */
r = journal_remote_server_init(&s, name, JOURNAL_WRITE_SPLIT_NONE, false, false);
r = journal_remote_server_init(&s, name, JOURNAL_WRITE_SPLIT_NONE, 0);
if (r < 0) {
assert_se(IN_SET(r, -ENOMEM, -EMFILE, -ENFILE));
return r;

View File

@ -223,9 +223,7 @@ static int process_http_upload(
finished = true;
for (;;) {
r = process_source(source,
journal_remote_server_global->compress,
journal_remote_server_global->seal);
r = process_source(source, journal_remote_server_global->file_flags);
if (r == -EAGAIN)
break;
if (r < 0) {
@ -599,7 +597,12 @@ static int create_remoteserver(
int r, n, fd;
r = journal_remote_server_init(s, arg_output, arg_split_mode, arg_compress, arg_seal);
r = journal_remote_server_init(
s,
arg_output,
arg_split_mode,
(arg_compress ? JOURNAL_COMPRESS : 0) |
(arg_seal ? JOURNAL_SEAL : 0));
if (r < 0)
return r;

View File

@ -47,7 +47,7 @@ RemoteSource* source_new(int fd, bool passive_fd, char *name, Writer *writer) {
return source;
}
int process_source(RemoteSource *source, bool compress, bool seal) {
int process_source(RemoteSource *source, JournalFileFlags file_flags) {
int r;
assert(source);
@ -72,7 +72,7 @@ int process_source(RemoteSource *source, bool compress, bool seal) {
&source->importer.iovw,
&source->importer.ts,
&source->importer.boot_id,
compress, seal);
file_flags);
if (r == -EBADMSG) {
log_warning_errno(r, "Entry is invalid, ignoring.");
r = 0;

View File

@ -17,4 +17,4 @@ typedef struct RemoteSource {
RemoteSource* source_new(int fd, bool passive_fd, char *name, Writer *writer);
void source_free(RemoteSource *source);
int process_source(RemoteSource *source, bool compress, bool seal);
int process_source(RemoteSource *source, JournalFileFlags file_flags);

View File

@ -3,8 +3,10 @@
#include "alloc-util.h"
#include "journal-remote.h"
static int do_rotate(ManagedJournalFile **f, MMapCache *m, bool compress, bool seal) {
int r = managed_journal_file_rotate(f, m, compress, UINT64_MAX, seal, NULL);
static int do_rotate(ManagedJournalFile **f, MMapCache *m, JournalFileFlags file_flags) {
int r;
r = managed_journal_file_rotate(f, m, file_flags, UINT64_MAX, NULL);
if (r < 0) {
if (*f)
log_error_errno(r, "Failed to rotate %s: %m", (*f)->file->path);
@ -57,11 +59,10 @@ static Writer* writer_free(Writer *w) {
DEFINE_TRIVIAL_REF_UNREF_FUNC(Writer, writer, writer_free);
int writer_write(Writer *w,
struct iovec_wrapper *iovw,
dual_timestamp *ts,
sd_id128_t *boot_id,
bool compress,
bool seal) {
const struct iovec_wrapper *iovw,
const dual_timestamp *ts,
const sd_id128_t *boot_id,
JournalFileFlags file_flags) {
int r;
assert(w);
@ -71,7 +72,7 @@ int writer_write(Writer *w,
if (journal_file_rotate_suggested(w->journal->file, 0, LOG_DEBUG)) {
log_info("%s: Journal header limits reached or header out-of-date, rotating",
w->journal->file->path);
r = do_rotate(&w->journal, w->mmap, compress, seal);
r = do_rotate(&w->journal, w->mmap, file_flags);
if (r < 0)
return r;
}
@ -87,7 +88,7 @@ int writer_write(Writer *w,
return r;
log_debug_errno(r, "%s: Write failed, rotating: %m", w->journal->file->path);
r = do_rotate(&w->journal, w->mmap, compress, seal);
r = do_rotate(&w->journal, w->mmap, file_flags);
if (r < 0)
return r;
else

View File

@ -26,11 +26,10 @@ Writer* writer_unref(Writer *w);
DEFINE_TRIVIAL_CLEANUP_FUNC(Writer*, writer_unref);
int writer_write(Writer *s,
struct iovec_wrapper *iovw,
dual_timestamp *ts,
sd_id128_t *boot_id,
bool compress,
bool seal);
const struct iovec_wrapper *iovw,
const dual_timestamp *ts,
const sd_id128_t *boot_id,
JournalFileFlags file_flags);
typedef enum JournalWriteSplitMode {
JOURNAL_WRITE_SPLIT_NONE,

View File

@ -61,12 +61,17 @@ static int open_output(RemoteServer *s, Writer *w, const char* host) {
assert_not_reached();
}
r = managed_journal_file_open_reliably(filename,
O_RDWR|O_CREAT, 0640,
s->compress, UINT64_MAX, s->seal,
&w->metrics,
w->mmap, NULL,
NULL, &w->journal);
r = managed_journal_file_open_reliably(
filename,
O_RDWR|O_CREAT,
s->file_flags,
0640,
UINT64_MAX,
&w->metrics,
w->mmap,
NULL,
NULL,
&w->journal);
if (r < 0)
return log_error_errno(r, "Failed to open output journal %s: %m", filename);
@ -302,8 +307,7 @@ int journal_remote_server_init(
RemoteServer *s,
const char *output,
JournalWriteSplitMode split_mode,
bool compress,
bool seal) {
JournalFileFlags file_flags) {
int r;
@ -313,8 +317,7 @@ int journal_remote_server_init(
journal_remote_server_global = s;
s->split_mode = split_mode;
s->compress = compress;
s->seal = seal;
s->file_flags = file_flags;
if (output)
s->output = output;
@ -391,7 +394,7 @@ int journal_remote_handle_raw_source(
source = s->sources[fd];
assert(source->importer.fd == fd);
r = process_source(source, s->compress, s->seal);
r = process_source(source, s->file_flags);
if (journal_importer_eof(&source->importer)) {
size_t remaining;

View File

@ -38,8 +38,7 @@ struct RemoteServer {
const char *output; /* either the output file or directory */
JournalWriteSplitMode split_mode;
bool compress;
bool seal;
JournalFileFlags file_flags;
bool check_trust;
};
extern RemoteServer *journal_remote_server_global;
@ -48,8 +47,7 @@ int journal_remote_server_init(
RemoteServer *s,
const char *output,
JournalWriteSplitMode split_mode,
bool compress,
bool seal);
JournalFileFlags file_flags);
int journal_remote_get_writer(RemoteServer *s, const char *host, Writer **writer);

View File

@ -260,26 +260,46 @@ static int open_journal(
Server *s,
bool reliably,
const char *fname,
int flags,
int open_flags,
bool seal,
JournalMetrics *metrics,
ManagedJournalFile **ret) {
_cleanup_(managed_journal_file_closep) ManagedJournalFile *f = NULL;
JournalFileFlags file_flags;
int r;
assert(s);
assert(fname);
assert(ret);
file_flags = (s->compress.enabled ? JOURNAL_COMPRESS : 0) | (seal ? JOURNAL_SEAL : 0);
if (reliably)
r = managed_journal_file_open_reliably(fname, flags, 0640, s->compress.enabled,
s->compress.threshold_bytes, seal, metrics, s->mmap,
s->deferred_closes, NULL, &f);
r = managed_journal_file_open_reliably(
fname,
open_flags,
file_flags,
0640,
s->compress.threshold_bytes,
metrics,
s->mmap,
s->deferred_closes,
NULL,
&f);
else
r = managed_journal_file_open(-1, fname, flags, 0640, s->compress.enabled,
s->compress.threshold_bytes, seal, metrics, s->mmap,
s->deferred_closes, NULL, &f);
r = managed_journal_file_open(
-1,
fname,
open_flags,
file_flags,
0640,
s->compress.threshold_bytes,
metrics,
s->mmap,
s->deferred_closes,
NULL,
&f);
if (r < 0)
return r;
@ -457,13 +477,19 @@ static int do_rotate(
bool seal,
uint32_t uid) {
JournalFileFlags file_flags;
int r;
assert(s);
if (!*f)
return -EINVAL;
r = managed_journal_file_rotate(f, s->mmap, s->compress.enabled, s->compress.threshold_bytes, seal, s->deferred_closes);
file_flags =
(s->compress.enabled ? JOURNAL_COMPRESS : 0)|
(seal ? JOURNAL_SEAL : 0);
r = managed_journal_file_rotate(f, s->mmap, file_flags, s->compress.threshold_bytes, s->deferred_closes);
if (r < 0) {
if (*f)
return log_error_errno(r, "Failed to rotate %s: %m", (*f)->file->path);
@ -574,18 +600,19 @@ static int vacuum_offline_user_journals(Server *s) {
server_vacuum_deferred_closes(s);
/* Open the file briefly, so that we can archive it */
r = managed_journal_file_open(fd,
full,
O_RDWR,
0640,
s->compress.enabled,
s->compress.threshold_bytes,
s->seal,
&s->system_storage.metrics,
s->mmap,
s->deferred_closes,
NULL,
&f);
r = managed_journal_file_open(
fd,
full,
O_RDWR,
(s->compress.enabled ? JOURNAL_COMPRESS : 0) |
(s->seal ? JOURNAL_SEAL : 0),
0640,
s->compress.threshold_bytes,
&s->system_storage.metrics,
s->mmap,
s->deferred_closes,
NULL,
&f);
if (r < 0) {
log_warning_errno(r, "Failed to read journal file %s for rotation, trying to move it out of the way: %m", full);
@ -1447,12 +1474,82 @@ static int dispatch_sigusr2(sd_event_source *es, const struct signalfd_siginfo *
}
static int dispatch_sigterm(sd_event_source *es, const struct signalfd_siginfo *si, void *userdata) {
_cleanup_(sd_event_source_disable_unrefp) sd_event_source *news = NULL;
Server *s = userdata;
int r;
assert(s);
log_received_signal(LOG_INFO, si);
(void) sd_event_source_set_enabled(es, false); /* Make sure this handler is called at most once */
/* So on one hand we want to ensure that SIGTERMs are definitely handled in appropriate, bounded
* time. On the other hand we want that everything pending is first comprehensively processed and
* written to disk. These goals are incompatible, hence we try to find a middle ground: we'll process
* SIGTERM with high priority, but from the handler (this one right here) we'll install two new event
* sources: one low priority idle one that will issue the exit once everything else is processed (and
* which is hopefully the regular, clean codepath); and one high priority timer that acts as safety
* net: if our idle handler isn't run within 10s, we'll exit anyway.
*
* TLDR: we'll exit either when everything is processed, or after 10s max, depending on what happens
* first.
*
* Note that exiting before the idle event is hit doesn't typically mean that we lose any data, as
* messages will remain queued in the sockets they came in from, and thus can be processed when we
* start up next unless we are going down for the final system shutdown, in which case everything
* is lost. */
r = sd_event_add_defer(s->event, &news, NULL, NULL); /* NULL handler means → exit when triggered */
if (r < 0) {
log_error_errno(r, "Failed to allocate exit idle event handler: %m");
goto fail;
}
(void) sd_event_source_set_description(news, "exit-idle");
/* Run everything relevant before this. */
r = sd_event_source_set_priority(news, SD_EVENT_PRIORITY_NORMAL+20);
if (r < 0) {
log_error_errno(r, "Failed to adjust priority of exit idle event handler: %m");
goto fail;
}
/* Give up ownership, so that this event source is freed automatically when the event loop is freed. */
r = sd_event_source_set_floating(news, true);
if (r < 0) {
log_error_errno(r, "Failed to make exit idle event handler floating: %m");
goto fail;
}
news = sd_event_source_unref(news);
r = sd_event_add_time_relative(s->event, &news, CLOCK_MONOTONIC, 10 * USEC_PER_SEC, 0, NULL, NULL);
if (r < 0) {
log_error_errno(r, "Failed to allocate exit timeout event handler: %m");
goto fail;
}
(void) sd_event_source_set_description(news, "exit-timeout");
r = sd_event_source_set_priority(news, SD_EVENT_PRIORITY_IMPORTANT-20); /* This is a safety net, with highest priority */
if (r < 0) {
log_error_errno(r, "Failed to adjust priority of exit timeout event handler: %m");
goto fail;
}
r = sd_event_source_set_floating(news, true);
if (r < 0) {
log_error_errno(r, "Failed to make exit timeout event handler floating: %m");
goto fail;
}
news = sd_event_source_unref(news);
log_debug("Exit event sources are now pending.");
return 0;
fail:
sd_event_exit(s->event, 0);
return 0;
}
@ -1504,8 +1601,8 @@ static int setup_signals(Server *s) {
if (r < 0)
return r;
/* Let's process SIGTERM late, so that we flush all queued messages to disk before we exit */
r = sd_event_source_set_priority(s->sigterm_event_source, SD_EVENT_PRIORITY_NORMAL+20);
/* Let's process SIGTERM early, so that we definitely react to it */
r = sd_event_source_set_priority(s->sigterm_event_source, SD_EVENT_PRIORITY_IMPORTANT-10);
if (r < 0)
return r;
@ -1515,7 +1612,7 @@ static int setup_signals(Server *s) {
if (r < 0)
return r;
r = sd_event_source_set_priority(s->sigint_event_source, SD_EVENT_PRIORITY_NORMAL+20);
r = sd_event_source_set_priority(s->sigint_event_source, SD_EVENT_PRIORITY_IMPORTANT-10);
if (r < 0)
return r;

View File

@ -393,11 +393,10 @@ ManagedJournalFile* managed_journal_file_close(ManagedJournalFile *f) {
int managed_journal_file_open(
int fd,
const char *fname,
int flags,
int open_flags,
JournalFileFlags file_flags,
mode_t mode,
bool compress,
uint64_t compress_threshold_bytes,
bool seal,
JournalMetrics *metrics,
MMapCache *mmap_cache,
Set *deferred_closes,
@ -412,7 +411,7 @@ int managed_journal_file_open(
if (!f)
return -ENOMEM;
r = journal_file_open(fd, fname, flags, mode, compress, compress_threshold_bytes, seal, metrics,
r = journal_file_open(fd, fname, open_flags, file_flags, mode, compress_threshold_bytes, metrics,
mmap_cache, template ? template->file : NULL, &f->file);
if (r < 0)
return r;
@ -444,9 +443,8 @@ ManagedJournalFile* managed_journal_file_initiate_close(ManagedJournalFile *f, S
int managed_journal_file_rotate(
ManagedJournalFile **f,
MMapCache *mmap_cache,
bool compress,
JournalFileFlags file_flags,
uint64_t compress_threshold_bytes,
bool seal,
Set *deferred_closes) {
_cleanup_free_ char *path = NULL;
@ -463,11 +461,10 @@ int managed_journal_file_rotate(
r = managed_journal_file_open(
-1,
path,
(*f)->file->flags,
(*f)->file->open_flags,
file_flags,
(*f)->file->mode,
compress,
compress_threshold_bytes,
seal,
NULL, /* metrics */
mmap_cache,
deferred_closes,
@ -482,11 +479,10 @@ int managed_journal_file_rotate(
int managed_journal_file_open_reliably(
const char *fname,
int flags,
int open_flags,
JournalFileFlags file_flags,
mode_t mode,
bool compress,
uint64_t compress_threshold_bytes,
bool seal,
JournalMetrics *metrics,
MMapCache *mmap_cache,
Set *deferred_closes,
@ -495,7 +491,7 @@ int managed_journal_file_open_reliably(
int r;
r = managed_journal_file_open(-1, fname, flags, mode, compress, compress_threshold_bytes, seal, metrics,
r = managed_journal_file_open(-1, fname, open_flags, file_flags, mode, compress_threshold_bytes, metrics,
mmap_cache, deferred_closes, template, ret);
if (!IN_SET(r,
-EBADMSG, /* Corrupted */
@ -509,10 +505,10 @@ int managed_journal_file_open_reliably(
-ETXTBSY)) /* File is from the future */
return r;
if ((flags & O_ACCMODE) == O_RDONLY)
if ((open_flags & O_ACCMODE) == O_RDONLY)
return r;
if (!(flags & O_CREAT))
if (!(open_flags & O_CREAT))
return r;
if (!endswith(fname, ".journal"))
@ -525,6 +521,6 @@ int managed_journal_file_open_reliably(
if (r < 0)
return r;
return managed_journal_file_open(-1, fname, flags, mode, compress, compress_threshold_bytes, seal, metrics,
return managed_journal_file_open(-1, fname, open_flags, file_flags, mode, compress_threshold_bytes, metrics,
mmap_cache, deferred_closes, template, ret);
}

View File

@ -10,11 +10,10 @@ typedef struct {
int managed_journal_file_open(
int fd,
const char *fname,
int flags,
int open_flags,
JournalFileFlags file_flags,
mode_t mode,
bool compress,
uint64_t compress_threshold_bytes,
bool seal,
JournalMetrics *metrics,
MMapCache *mmap_cache,
Set *deferred_closes,
@ -28,11 +27,10 @@ DEFINE_TRIVIAL_CLEANUP_FUNC(ManagedJournalFile*, managed_journal_file_close);
int managed_journal_file_open_reliably(
const char *fname,
int flags,
int open_flags,
JournalFileFlags file_flags,
mode_t mode,
bool compress,
uint64_t compress_threshold_bytes,
bool seal,
JournalMetrics *metrics,
MMapCache *mmap_cache,
Set *deferred_closes,
@ -40,4 +38,4 @@ int managed_journal_file_open_reliably(
ManagedJournalFile **ret);
ManagedJournalFile* managed_journal_file_initiate_close(ManagedJournalFile *f, Set *deferred_closes);
int managed_journal_file_rotate(ManagedJournalFile **f, MMapCache *mmap_cache, bool compress, uint64_t compress_threshold_bytes, bool seal, Set *deferred_closes);
int managed_journal_file_rotate(ManagedJournalFile **f, MMapCache *mmap_cache, JournalFileFlags file_flags, uint64_t compress_threshold_bytes, Set *deferred_closes);

View File

@ -29,7 +29,7 @@ int main(int argc, char *argv[]) {
fn = path_join(dn, "test.journal");
r = managed_journal_file_open(-1, fn, O_CREAT|O_RDWR, 0644, false, 0, false, NULL, m, NULL, NULL, &new_journal);
r = managed_journal_file_open(-1, fn, O_CREAT|O_RDWR, 0, 0644, 0, NULL, m, NULL, NULL, &new_journal);
assert_se(r >= 0);
if (argc > 1)

View File

@ -40,7 +40,7 @@ static ManagedJournalFile *test_open(const char *name) {
m = mmap_cache_new();
assert_se(m != NULL);
assert_ret(managed_journal_file_open(-1, name, O_RDWR|O_CREAT, 0644, true, UINT64_MAX, false, NULL, m, NULL, NULL, &f));
assert_ret(managed_journal_file_open(-1, name, O_RDWR|O_CREAT, JOURNAL_COMPRESS, 0644, UINT64_MAX, NULL, m, NULL, NULL, &f));
return f;
}
@ -218,8 +218,8 @@ TEST(sequence_numbers) {
mkdtemp_chdir_chattr(t);
assert_se(managed_journal_file_open(-1, "one.journal", O_RDWR|O_CREAT, 0644,
true, UINT64_MAX, false, NULL, m, NULL, NULL, &one) == 0);
assert_se(managed_journal_file_open(-1, "one.journal", O_RDWR|O_CREAT, JOURNAL_COMPRESS, 0644,
UINT64_MAX, NULL, m, NULL, NULL, &one) == 0);
append_number(one, 1, &seqnum);
printf("seqnum=%"PRIu64"\n", seqnum);
@ -235,8 +235,8 @@ TEST(sequence_numbers) {
memcpy(&seqnum_id, &one->file->header->seqnum_id, sizeof(sd_id128_t));
assert_se(managed_journal_file_open(-1, "two.journal", O_RDWR|O_CREAT, 0644,
true, UINT64_MAX, false, NULL, m, NULL, one, &two) == 0);
assert_se(managed_journal_file_open(-1, "two.journal", O_RDWR|O_CREAT, JOURNAL_COMPRESS, 0644,
UINT64_MAX, NULL, m, NULL, one, &two) == 0);
assert_se(two->file->header->state == STATE_ONLINE);
assert_se(!sd_id128_equal(two->file->header->file_id, one->file->header->file_id));
@ -266,8 +266,8 @@ TEST(sequence_numbers) {
/* restart server */
seqnum = 0;
assert_se(managed_journal_file_open(-1, "two.journal", O_RDWR, 0,
true, UINT64_MAX, false, NULL, m, NULL, NULL, &two) == 0);
assert_se(managed_journal_file_open(-1, "two.journal", O_RDWR, JOURNAL_COMPRESS, 0,
UINT64_MAX, NULL, m, NULL, NULL, &two) == 0);
assert_se(sd_id128_equal(two->file->header->seqnum_id, seqnum_id));

View File

@ -77,9 +77,9 @@ static void run_test(void) {
assert_se(chdir(t) >= 0);
(void) chattr_path(t, FS_NOCOW_FL, FS_NOCOW_FL, NULL);
assert_se(managed_journal_file_open(-1, "one.journal", O_RDWR|O_CREAT, 0666, true, UINT64_MAX, false, NULL, m, NULL, NULL, &one) == 0);
assert_se(managed_journal_file_open(-1, "two.journal", O_RDWR|O_CREAT, 0666, true, UINT64_MAX, false, NULL, m, NULL, NULL, &two) == 0);
assert_se(managed_journal_file_open(-1, "three.journal", O_RDWR|O_CREAT, 0666, true, UINT64_MAX, false, NULL, m, NULL, NULL, &three) == 0);
assert_se(managed_journal_file_open(-1, "one.journal", O_RDWR|O_CREAT, JOURNAL_COMPRESS, 0666, UINT64_MAX, NULL, m, NULL, NULL, &one) == 0);
assert_se(managed_journal_file_open(-1, "two.journal", O_RDWR|O_CREAT, JOURNAL_COMPRESS, 0666, UINT64_MAX, NULL, m, NULL, NULL, &two) == 0);
assert_se(managed_journal_file_open(-1, "three.journal", O_RDWR|O_CREAT, JOURNAL_COMPRESS, 0666, UINT64_MAX, NULL, m, NULL, NULL, &three) == 0);
for (i = 0; i < N_ENTRIES; i++) {
char *p, *q;

View File

@ -46,7 +46,7 @@ static int raw_verify(const char *fn, const char *verification_key) {
m = mmap_cache_new();
assert_se(m != NULL);
r = journal_file_open(-1, fn, O_RDONLY, 0666, true, UINT64_MAX, !!verification_key, NULL, m, NULL, &f);
r = journal_file_open(-1, fn, O_RDONLY, JOURNAL_COMPRESS|(verification_key ? JOURNAL_SEAL : 0), 0666, UINT64_MAX, NULL, m, NULL, &f);
if (r < 0)
return r;
@ -82,7 +82,7 @@ int main(int argc, char *argv[]) {
log_info("Generating...");
assert_se(managed_journal_file_open(-1, "test.journal", O_RDWR|O_CREAT, 0666, true, UINT64_MAX, !!verification_key, NULL, m, NULL, NULL, &df) == 0);
assert_se(managed_journal_file_open(-1, "test.journal", O_RDWR|O_CREAT, JOURNAL_COMPRESS|(verification_key ? JOURNAL_SEAL : 0), 0666, UINT64_MAX, NULL, m, NULL, NULL, &df) == 0);
for (n = 0; n < N_ENTRIES; n++) {
struct iovec iovec;
@ -104,7 +104,7 @@ int main(int argc, char *argv[]) {
log_info("Verifying...");
assert_se(journal_file_open(-1, "test.journal", O_RDONLY, 0666, true, UINT64_MAX, !!verification_key, NULL, m, NULL, &f) == 0);
assert_se(journal_file_open(-1, "test.journal", O_RDONLY, JOURNAL_COMPRESS|(verification_key ? JOURNAL_SEAL: 0), 0666, UINT64_MAX, NULL, m, NULL, &f) == 0);
/* journal_file_print_header(f); */
journal_file_dump(f);

View File

@ -39,7 +39,7 @@ TEST(non_empty) {
mkdtemp_chdir_chattr(t);
assert_se(managed_journal_file_open(-1, "test.journal", O_RDWR|O_CREAT, 0666, true, UINT64_MAX, true, NULL, m, NULL, NULL, &f) == 0);
assert_se(managed_journal_file_open(-1, "test.journal", O_RDWR|O_CREAT, JOURNAL_COMPRESS|JOURNAL_SEAL, 0666, UINT64_MAX, NULL, m, NULL, NULL, &f) == 0);
assert_se(dual_timestamp_get(&ts));
assert_se(sd_id128_randomize(&fake_boot_id) == 0);
@ -100,8 +100,8 @@ TEST(non_empty) {
assert_se(journal_file_move_to_entry_by_seqnum(f->file, 10, DIRECTION_DOWN, &o, NULL) == 0);
managed_journal_file_rotate(&f, m, true, UINT64_MAX, true, NULL);
managed_journal_file_rotate(&f, m, true, UINT64_MAX, true, NULL);
managed_journal_file_rotate(&f, m, JOURNAL_SEAL|JOURNAL_COMPRESS, UINT64_MAX, NULL);
managed_journal_file_rotate(&f, m, JOURNAL_SEAL|JOURNAL_COMPRESS, UINT64_MAX, NULL);
(void) managed_journal_file_close(f);
@ -128,10 +128,10 @@ TEST(empty) {
mkdtemp_chdir_chattr(t);
assert_se(managed_journal_file_open(-1, "test.journal", O_RDWR|O_CREAT, 0666, false, UINT64_MAX, false, NULL, m, NULL, NULL, &f1) == 0);
assert_se(managed_journal_file_open(-1, "test-compress.journal", O_RDWR|O_CREAT, 0666, true, UINT64_MAX, false, NULL, m, NULL, NULL, &f2) == 0);
assert_se(managed_journal_file_open(-1, "test-seal.journal", O_RDWR|O_CREAT, 0666, false, UINT64_MAX, true, NULL, m, NULL, NULL, &f3) == 0);
assert_se(managed_journal_file_open(-1, "test-seal-compress.journal", O_RDWR|O_CREAT, 0666, true, UINT64_MAX, true, NULL, m, NULL, NULL, &f4) == 0);
assert_se(managed_journal_file_open(-1, "test.journal", O_RDWR|O_CREAT, 0, 0666, UINT64_MAX, NULL, m, NULL, NULL, &f1) == 0);
assert_se(managed_journal_file_open(-1, "test-compress.journal", O_RDWR|O_CREAT, JOURNAL_COMPRESS, 0666, UINT64_MAX, NULL, m, NULL, NULL, &f2) == 0);
assert_se(managed_journal_file_open(-1, "test-seal.journal", O_RDWR|O_CREAT, JOURNAL_SEAL, 0666, UINT64_MAX, NULL, m, NULL, NULL, &f3) == 0);
assert_se(managed_journal_file_open(-1, "test-seal-compress.journal", O_RDWR|O_CREAT, JOURNAL_COMPRESS|JOURNAL_SEAL, 0666, UINT64_MAX, NULL, m, NULL, NULL, &f4) == 0);
journal_file_print_header(f1->file);
puts("");
@ -178,7 +178,7 @@ static bool check_compressed(uint64_t compress_threshold, uint64_t data_size) {
mkdtemp_chdir_chattr(t);
assert_se(managed_journal_file_open(-1, "test.journal", O_RDWR|O_CREAT, 0666, true, compress_threshold, true, NULL, m, NULL, NULL, &f) == 0);
assert_se(managed_journal_file_open(-1, "test.journal", O_RDWR|O_CREAT, JOURNAL_COMPRESS|JOURNAL_SEAL, 0666, compress_threshold, NULL, m, NULL, NULL, &f) == 0);
dual_timestamp_get(&ts);

View File

@ -1931,11 +1931,18 @@ static int post_change_thunk(sd_event_source *timer, uint64_t usec, void *userda
}
static void schedule_post_change(JournalFile *f) {
sd_event *e;
int r;
assert(f);
assert(f->post_change_timer);
assert_se(e = sd_event_source_get_event(f->post_change_timer));
/* If we are already going down, post the change immediately. */
if (IN_SET(sd_event_get_state(e), SD_EVENT_EXITING, SD_EVENT_FINISHED))
goto fail;
r = sd_event_source_get_enabled(f->post_change_timer, NULL);
if (r < 0) {
log_debug_errno(r, "Failed to get ftruncate timer state: %m");
@ -3314,11 +3321,10 @@ static int journal_file_warn_btrfs(JournalFile *f) {
int journal_file_open(
int fd,
const char *fname,
int flags,
int open_flags,
JournalFileFlags file_flags,
mode_t mode,
bool compress,
uint64_t compress_threshold_bytes,
bool seal,
JournalMetrics *metrics,
MMapCache *mmap_cache,
JournalFile *template,
@ -3333,13 +3339,13 @@ int journal_file_open(
assert(fd >= 0 || fname);
assert(mmap_cache);
if (!IN_SET((flags & O_ACCMODE), O_RDONLY, O_RDWR))
if (!IN_SET((open_flags & O_ACCMODE), O_RDONLY, O_RDWR))
return -EINVAL;
if ((flags & O_ACCMODE) == O_RDONLY && FLAGS_SET(flags, O_CREAT))
if ((open_flags & O_ACCMODE) == O_RDONLY && FLAGS_SET(open_flags, O_CREAT))
return -EINVAL;
if (fname && (flags & O_CREAT) && !endswith(fname, ".journal"))
if (fname && (open_flags & O_CREAT) && !endswith(fname, ".journal"))
return -EINVAL;
f = new(JournalFile, 1);
@ -3350,21 +3356,21 @@ int journal_file_open(
.fd = fd,
.mode = mode,
.flags = flags,
.writable = (flags & O_ACCMODE) != O_RDONLY,
.open_flags = open_flags,
.writable = (open_flags & O_ACCMODE) != O_RDONLY,
#if HAVE_ZSTD
.compress_zstd = compress,
.compress_zstd = FLAGS_SET(file_flags, JOURNAL_COMPRESS),
#elif HAVE_LZ4
.compress_lz4 = compress,
.compress_lz4 = FLAGS_SET(file_flags, JOURNAL_COMPRESS),
#elif HAVE_XZ
.compress_xz = compress,
.compress_xz = FLAGS_SET(file_flags, JOURNAL_COMPRESS),
#endif
.compress_threshold_bytes = compress_threshold_bytes == UINT64_MAX ?
DEFAULT_COMPRESS_THRESHOLD :
MAX(MIN_COMPRESS_THRESHOLD, compress_threshold_bytes),
#if HAVE_GCRYPT
.seal = seal,
.seal = FLAGS_SET(file_flags, JOURNAL_SEAL),
#endif
};
@ -3424,7 +3430,7 @@ int journal_file_open(
* or so, we likely fail quickly rather than block for long. For regular files O_NONBLOCK has no effect, hence
* it doesn't hurt in that case. */
f->fd = openat_report_new(AT_FDCWD, f->path, f->flags|O_CLOEXEC|O_NONBLOCK, f->mode, &newly_created);
f->fd = openat_report_new(AT_FDCWD, f->path, f->open_flags|O_CLOEXEC|O_NONBLOCK, f->mode, &newly_created);
if (f->fd < 0) {
r = f->fd;
goto fail;
@ -3451,7 +3457,7 @@ int journal_file_open(
newly_created = f->last_stat.st_size == 0 && f->writable;
}
f->cache_fd = mmap_cache_add_fd(mmap_cache, f->fd, prot_from_flags(flags));
f->cache_fd = mmap_cache_add_fd(mmap_cache, f->fd, prot_from_flags(open_flags));
if (!f->cache_fd) {
r = -ENOMEM;
goto fail;

View File

@ -18,7 +18,7 @@
#include "time-util.h"
typedef struct JournalMetrics {
/* For all these: -1 means "pick automatically", and 0 means "no limit enforced" */
/* For all these: UINT64_MAX means "pick automatically", and 0 means "no limit enforced" */
uint64_t max_size; /* how large journal files grow at max */
uint64_t min_size; /* how large journal files grow at least */
uint64_t max_use; /* how much disk space to use in total at max, keep_free permitting */
@ -62,7 +62,7 @@ typedef struct JournalFile {
mode_t mode;
int flags;
int open_flags;
bool writable:1;
bool compress_xz:1;
bool compress_lz4:1;
@ -126,14 +126,18 @@ typedef struct JournalFile {
#endif
} JournalFile;
typedef enum JournalFileFlags {
JOURNAL_COMPRESS = 1 << 0,
JOURNAL_SEAL = 1 << 1,
} JournalFileFlags;
int journal_file_open(
int fd,
const char *fname,
int flags,
int open_flags,
JournalFileFlags file_flags,
mode_t mode,
bool compress,
uint64_t compress_threshold_bytes,
bool seal,
JournalMetrics *metrics,
MMapCache *mmap_cache,
JournalFile *template,

View File

@ -1337,7 +1337,7 @@ static int add_any_file(
goto finish;
}
r = journal_file_open(fd, path, O_RDONLY, 0, false, 0, false, NULL, j->mmap, NULL, &f);
r = journal_file_open(fd, path, O_RDONLY, 0, 0, 0, NULL, j->mmap, NULL, &f);
if (r < 0) {
log_debug_errno(r, "Failed to open journal file %s: %m", path);
goto finish;

View File

@ -63,6 +63,20 @@ static const BaseFilesystem table[] = {
"usr/lib64\0", "ld-linux-x86-64.so.2" },
# define KNOW_LIB64_DIRS 1
#elif defined(__ia64__)
#elif defined(__loongarch64)
# define KNOW_LIB64_DIRS 1
# if defined(__loongarch_double_float)
{ "lib64", 0, "usr/lib/"LIB_ARCH_TUPLE"\0"
"usr/lib64\0", "ld-linux-loongarch-lp64d.so.1" },
# elif defined(__loongarch_single_float)
{ "lib64", 0, "usr/lib/"LIB_ARCH_TUPLE"\0"
"usr/lib64\0", "ld-linux-loongarch-lp64f.so.1" },
# elif defined(__loongarch_soft_float)
{ "lib64", 0, "usr/lib/"LIB_ARCH_TUPLE"\0"
"usr/lib64\0", "ld-linux-loongarch-lp64s.so.1" },
# else
# error "Unknown LoongArch ABI"
# endif
#elif defined(__m68k__)
/* No link needed. */
# define KNOW_LIB64_DIRS 1

View File

@ -70,6 +70,8 @@
#include "version.h"
#define WORKER_NUM_MAX 2048U
#define EVENT_RETRY_INTERVAL_USEC (200 * USEC_PER_MSEC)
#define EVENT_RETRY_TIMEOUT_USEC (3 * USEC_PER_MINUTE)
static bool arg_debug = false;
static int arg_daemonize = false;
@ -124,10 +126,12 @@ typedef struct Event {
EventState state;
sd_device *dev;
sd_device *dev_kernel; /* clone of originally received device */
sd_device_action_t action;
uint64_t seqnum;
uint64_t blocker_seqnum;
usec_t retry_again_next_usec;
usec_t retry_again_timeout_usec;
sd_event_source *timeout_warning_event;
sd_event_source *timeout_event;
@ -152,8 +156,13 @@ typedef struct Worker {
} Worker;
/* passed from worker to main process */
typedef struct WorkerMessage {
} WorkerMessage;
typedef enum EventResult {
EVENT_RESULT_SUCCESS,
EVENT_RESULT_FAILED,
EVENT_RESULT_TRY_AGAIN, /* when the block device is locked by another process. */
_EVENT_RESULT_MAX,
_EVENT_RESULT_INVALID = -EINVAL,
} EventResult;
static Event *event_free(Event *event) {
if (!event)
@ -163,7 +172,6 @@ static Event *event_free(Event *event) {
LIST_REMOVE(event, event->manager->events, event);
sd_device_unref(event->dev);
sd_device_unref(event->dev_kernel);
sd_event_source_disable_unref(event->timeout_warning_event);
sd_event_source_disable_unref(event->timeout_event);
@ -346,10 +354,73 @@ static int on_kill_workers_event(sd_event_source *s, uint64_t usec, void *userda
return 1;
}
static int worker_send_message(int fd) {
WorkerMessage message = {};
static void device_broadcast(sd_device_monitor *monitor, sd_device *dev) {
int r;
return loop_write(fd, &message, sizeof(message), false);
assert(dev);
/* On exit, manager->monitor is already NULL. */
if (!monitor)
return;
r = device_monitor_send_device(monitor, NULL, dev);
if (r < 0)
log_device_warning_errno(dev, r,
"Failed to broadcast event to libudev listeners, ignoring: %m");
}
static int worker_send_result(Manager *manager, EventResult result) {
assert(manager);
assert(manager->worker_watch[WRITE_END] >= 0);
return loop_write(manager->worker_watch[WRITE_END], &result, sizeof(result), false);
}
static int device_get_block_device(sd_device *dev, const char **ret) {
const char *val;
int r;
assert(dev);
assert(ret);
if (device_for_action(dev, SD_DEVICE_REMOVE))
goto irrelevant;
r = sd_device_get_subsystem(dev, &val);
if (r < 0)
return log_device_debug_errno(dev, r, "Failed to get subsystem: %m");
if (!streq(val, "block"))
goto irrelevant;
r = sd_device_get_sysname(dev, &val);
if (r < 0)
return log_device_debug_errno(dev, r, "Failed to get sysname: %m");
if (STARTSWITH_SET(val, "dm-", "md", "drbd"))
goto irrelevant;
r = sd_device_get_devtype(dev, &val);
if (r < 0 && r != -ENOENT)
return log_device_debug_errno(dev, r, "Failed to get devtype: %m");
if (r >= 0 && streq(val, "partition")) {
r = sd_device_get_parent(dev, &dev);
if (r < 0)
return log_device_debug_errno(dev, r, "Failed to get parent device: %m");
}
r = sd_device_get_devname(dev, &val);
if (r == -ENOENT)
goto irrelevant;
if (r < 0)
return log_device_debug_errno(dev, r, "Failed to get devname: %m");
*ret = val;
return 1;
irrelevant:
*ret = NULL;
return 0;
}
static int worker_lock_block_device(sd_device *dev, int *ret_fd) {
@ -365,44 +436,21 @@ static int worker_lock_block_device(sd_device *dev, int *ret_fd) {
* event handling; in the case udev acquired the lock, the external process can block until udev has
* finished its event handling. */
if (device_for_action(dev, SD_DEVICE_REMOVE))
return 0;
r = sd_device_get_subsystem(dev, &val);
r = device_get_block_device(dev, &val);
if (r < 0)
return log_device_debug_errno(dev, r, "Failed to get subsystem: %m");
if (!streq(val, "block"))
return 0;
r = sd_device_get_sysname(dev, &val);
if (r < 0)
return log_device_debug_errno(dev, r, "Failed to get sysname: %m");
if (STARTSWITH_SET(val, "dm-", "md", "drbd"))
return 0;
r = sd_device_get_devtype(dev, &val);
if (r < 0 && r != -ENOENT)
return log_device_debug_errno(dev, r, "Failed to get devtype: %m");
if (r >= 0 && streq(val, "partition")) {
r = sd_device_get_parent(dev, &dev);
if (r < 0)
return log_device_debug_errno(dev, r, "Failed to get parent device: %m");
}
r = sd_device_get_devname(dev, &val);
if (r == -ENOENT)
return 0;
if (r < 0)
return log_device_debug_errno(dev, r, "Failed to get devname: %m");
return r;
if (r == 0)
goto nolock;
fd = open(val, O_RDONLY|O_CLOEXEC|O_NOFOLLOW|O_NONBLOCK);
if (fd < 0) {
bool ignore = ERRNO_IS_DEVICE_ABSENT(errno);
log_device_debug_errno(dev, errno, "Failed to open '%s'%s: %m", val, ignore ? ", ignoring" : "");
return ignore ? 0 : -errno;
if (!ignore)
return -errno;
goto nolock;
}
if (flock(fd, LOCK_SH|LOCK_NB) < 0)
@ -410,6 +458,10 @@ static int worker_lock_block_device(sd_device *dev, int *ret_fd) {
*ret_fd = TAKE_FD(fd);
return 1;
nolock:
*ret_fd = -1;
return 0;
}
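The contract implemented above — take a non-blocking shared lock, report -EAGAIN when another process holds the device exclusively — can be exercised with flock(1). A minimal sketch, using a scratch regular file in place of a block device:

```shell
#!/usr/bin/env bash
set -eu

tmp="$(mktemp)"

# Hold an exclusive lock for a moment, as mkfs or an imaging tool would
# hold it on a block device (see systemd.io/BLOCK_DEVICE_LOCKING).
flock -x "$tmp" sleep 2 &
sleep 0.5

# LOCK_SH|LOCK_NB: a non-blocking shared lock fails while the exclusive
# lock is held -- the case worker_lock_block_device() reports as -EAGAIN.
if flock -sn "$tmp" true; then
    echo "got shared lock"
else
    echo "locked by another process, try again later"
fi

wait  # exclusive holder is gone, the shared lock now succeeds
flock -sn "$tmp" true && echo "got shared lock"

rm -f "$tmp"
```

flock(1) with `-n` exits non-zero when the lock cannot be acquired immediately, which stands in for the `flock(fd, LOCK_SH|LOCK_NB)` failure path in the C code.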
static int worker_mark_block_device_read_only(sd_device *dev) {
@ -476,44 +528,12 @@ static int worker_process_device(Manager *manager, sd_device *dev) {
if (!udev_event)
return -ENOMEM;
/* If this is a block device and the device is locked currently via the BSD advisory locks,
* someone else is using it exclusively. We don't run our udev rules now to not interfere.
* Instead of processing the event, we requeue the event and will try again after a delay.
*
* The user-facing side of this: https://systemd.io/BLOCK_DEVICE_LOCKING */
r = worker_lock_block_device(dev, &fd_lock);
if (r == -EAGAIN) {
/* So this is a block device and the device is locked currently via the BSD advisory locks —
* someone else is exclusively using it. This means we don't run our udev rules now, to not
* interfere. However we want to know when the device is unlocked again, and retrigger the
* device again then, so that the rules are run eventually. For that we use IN_CLOSE_WRITE
* inotify watches (which isn't exactly the same as waiting for the BSD locks to release, but
* not totally off, as long as unlock+close() is done together, as it usually is).
*
* (The user-facing side of this: https://systemd.io/BLOCK_DEVICE_LOCKING)
*
* There's a bit of a chicken and egg problem here for this however: inotify watching is
* supposed to be enabled via an option set via udev rules (OPTIONS+="watch"). If we skip the
* udev rules here however (as we just said we do), we would thus never see that specific
* udev rule, and thus never turn on inotify watching. But in order to catch up eventually
* and run them we need the inotify watching: hence a classic chicken and egg problem.
*
* Our way out here: if we see the block device locked, unconditionally watch the device via
* inotify, regardless of any explicit request via OPTIONS+="watch". Thus, a device that is
* currently locked via the BSD file locks will be treated as if we ran a single udev rule
* only for it: the one that turns on inotify watching for it. If we eventually see the
* inotify IN_CLOSE_WRITE event, and then run the rules after all and we then realize that
* this wasn't actually requested (i.e. no OPTIONS+="watch" set) we'll simply turn off the
* watching again (see below). Effectively this means: inotify watching is now enabled either
* a) when the udev rules say so, or b) while the device is locked.
*
* Worst case scenario hence: in the (unlikely) case someone locked the device and we clash
* with that we might do inotify watching for a brief moment for a device where we actually
* weren't supposed to. But that shouldn't be too bad, in particular as BSD locks being taken
* on a block device is kinda an indication that the inotify logic is desired too, to some
* degree they go hand-in-hand after all. */
log_device_debug(dev, "Block device is currently locked, installing watch to wait until the lock is released.");
(void) udev_watch_begin(manager->inotify_fd, dev);
/* Now the watch is installed, let's lock the device again, maybe in the meantime things changed */
r = worker_lock_block_device(dev, &fd_lock);
}
if (r < 0)
return r;
@ -546,27 +566,29 @@ static int worker_process_device(Manager *manager, sd_device *dev) {
static int worker_device_monitor_handler(sd_device_monitor *monitor, sd_device *dev, void *userdata) {
Manager *manager = userdata;
EventResult result;
int r;
assert(dev);
assert(manager);
r = worker_process_device(manager, dev);
if (r == -EAGAIN)
/* if we couldn't acquire the flock(), then proceed quietly */
log_device_debug_errno(dev, r, "Device currently locked, not processing.");
else {
if (r < 0)
log_device_warning_errno(dev, r, "Failed to process device, ignoring: %m");
if (r == -EAGAIN) {
/* if we couldn't acquire the flock(), then requeue the event */
result = EVENT_RESULT_TRY_AGAIN;
log_device_debug_errno(dev, r, "Block device is currently locked, requeueing the event.");
} else if (r < 0) {
result = EVENT_RESULT_FAILED;
log_device_warning_errno(dev, r, "Failed to process device, ignoring: %m");
} else
result = EVENT_RESULT_SUCCESS;
if (result != EVENT_RESULT_TRY_AGAIN)
/* send processed event back to libudev listeners */
r = device_monitor_send_device(monitor, NULL, dev);
if (r < 0)
log_device_warning_errno(dev, r, "Failed to send device, ignoring: %m");
}
device_broadcast(monitor, dev);
/* send udevd the result of the event execution */
r = worker_send_message(manager->worker_watch[WRITE_END]);
r = worker_send_result(manager, result);
if (r < 0)
log_device_warning_errno(dev, r, "Failed to send signal to main daemon, ignoring: %m");
@ -782,6 +804,17 @@ static int event_is_blocked(Event *event) {
assert(event->manager);
assert(event->blocker_seqnum <= event->seqnum);
if (event->retry_again_next_usec > 0) {
usec_t now_usec;
r = sd_event_now(event->manager->event, clock_boottime_or_monotonic(), &now_usec);
if (r < 0)
return r;
if (event->retry_again_next_usec <= now_usec)
return true;
}
if (event->blocker_seqnum == event->seqnum)
/* we have checked previously and no blocker found */
return false;
@ -953,16 +986,12 @@ static int event_queue_start(Manager *manager) {
r = event_is_blocked(event);
if (r > 0)
continue;
if (r < 0) {
sd_device_action_t a = _SD_DEVICE_ACTION_INVALID;
(void) sd_device_get_action(event->dev, &a);
if (r < 0)
log_device_warning_errno(event->dev, r,
"Failed to check dependencies for event (SEQNUM=%"PRIu64", ACTION=%s), "
"assuming there is no blocking event, ignoring: %m",
event->seqnum,
strna(device_action_to_string(a)));
}
strna(device_action_to_string(event->action)));
r = event_run(event);
if (r <= 0) /* 0 means there are no idle workers. Let's escape from the loop. */
@ -972,10 +1001,81 @@ static int event_queue_start(Manager *manager) {
return 0;
}
static int event_requeue(Event *event) {
usec_t now_usec;
int r;
assert(event);
assert(event->manager);
assert(event->manager->event);
event->timeout_warning_event = sd_event_source_disable_unref(event->timeout_warning_event);
event->timeout_event = sd_event_source_disable_unref(event->timeout_event);
/* Add a short delay to avoid a busy loop. */
r = sd_event_now(event->manager->event, clock_boottime_or_monotonic(), &now_usec);
if (r < 0)
return log_device_warning_errno(event->dev, r,
"Failed to get current time, "
"skipping event (SEQNUM=%"PRIu64", ACTION=%s): %m",
event->seqnum, strna(device_action_to_string(event->action)));
if (event->retry_again_timeout_usec > 0 && event->retry_again_timeout_usec <= now_usec)
return log_device_warning_errno(event->dev, SYNTHETIC_ERRNO(ETIMEDOUT),
"The underlying block device has been locked by a process for more than %s, "
"skipping event (SEQNUM=%"PRIu64", ACTION=%s).",
FORMAT_TIMESPAN(EVENT_RETRY_TIMEOUT_USEC, USEC_PER_MINUTE),
event->seqnum, strna(device_action_to_string(event->action)));
event->retry_again_next_usec = usec_add(now_usec, EVENT_RETRY_INTERVAL_USEC);
if (event->retry_again_timeout_usec == 0)
event->retry_again_timeout_usec = usec_add(now_usec, EVENT_RETRY_TIMEOUT_USEC);
if (event->worker && event->worker->event == event)
event->worker->event = NULL;
event->worker = NULL;
event->state = EVENT_QUEUED;
return 0;
}
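The requeue logic above retries on a fixed interval and gives up once an overall deadline passes. The same shape, sketched with flock(1) on a scratch file (intervals shortened from the upstream 200 ms / 3 min constants):

```shell
#!/usr/bin/env bash
set -eu

tmp="$(mktemp)"

# Another process holds the device (here: a scratch file) for ~1.5s.
flock -x "$tmp" sleep 1.5 &

interval=0.5                  # stand-in for EVENT_RETRY_INTERVAL_USEC
deadline=$(( SECONDS + 10 ))  # stand-in for EVENT_RETRY_TIMEOUT_USEC
tries=0

# Retry every $interval until the lock is free or the deadline passes.
until flock -sn "$tmp" true; do
    if (( SECONDS >= deadline )); then
        echo "timed out, skipping event"
        exit 1
    fi
    (( ++tries ))
    sleep "$interval"
done

echo "acquired after $tries retries"
wait
rm -f "$tmp"
```

Note the pre-increment `(( ++tries ))`: under `set -e`, `(( tries++ ))` would abort the script on the first iteration because the expression evaluates to 0.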
static int event_queue_assume_block_device_unlocked(Manager *manager, sd_device *dev) {
const char *devname;
int r;
/* When a new event for a block device is queued or we get an inotify event, assume that the
* device is not locked anymore. The assumption may not be true, but that should not cause any
* issues, as in that case events will be requeued soon. */
r = device_get_block_device(dev, &devname);
if (r <= 0)
return r;
LIST_FOREACH(event, event, manager->events) {
const char *event_devname;
if (event->state != EVENT_QUEUED)
continue;
if (event->retry_again_next_usec == 0)
continue;
if (device_get_block_device(event->dev, &event_devname) <= 0)
continue;
if (!streq(devname, event_devname))
continue;
event->retry_again_next_usec = 0;
}
return 0;
}
static int event_queue_insert(Manager *manager, sd_device *dev) {
_cleanup_(sd_device_unrefp) sd_device *clone = NULL;
Event *event;
sd_device_action_t action;
uint64_t seqnum;
Event *event;
int r;
assert(manager);
@ -989,12 +1089,7 @@ static int event_queue_insert(Manager *manager, sd_device *dev) {
if (r < 0)
return r;
/* Save original device to restore the state on failures. */
r = device_shallow_clone(dev, &clone);
if (r < 0)
return r;
r = device_copy_properties(clone, dev);
r = sd_device_get_action(dev, &action);
if (r < 0)
return r;
@ -1005,8 +1100,8 @@ static int event_queue_insert(Manager *manager, sd_device *dev) {
*event = (Event) {
.manager = manager,
.dev = sd_device_ref(dev),
.dev_kernel = TAKE_PTR(clone),
.seqnum = seqnum,
.action = action,
.state = EVENT_QUEUED,
};
@ -1039,6 +1134,8 @@ static int on_uevent(sd_device_monitor *monitor, sd_device *dev, void *userdata)
return 1;
}
(void) event_queue_assume_block_device_unlocked(manager, dev);
/* we have fresh events, try to schedule them */
event_queue_start(manager);
@ -1051,11 +1148,8 @@ static int on_worker(sd_event_source *s, int fd, uint32_t revents, void *userdat
assert(manager);
for (;;) {
WorkerMessage msg;
struct iovec iovec = {
.iov_base = &msg,
.iov_len = sizeof(msg),
};
EventResult result;
struct iovec iovec = IOVEC_MAKE(&result, sizeof(result));
CMSG_BUFFER_TYPE(CMSG_SPACE(sizeof(struct ucred))) control;
struct msghdr msghdr = {
.msg_iov = &iovec,
@ -1078,7 +1172,7 @@ static int on_worker(sd_event_source *s, int fd, uint32_t revents, void *userdat
cmsg_close_all(&msghdr);
if (size != sizeof(WorkerMessage)) {
if (size != sizeof(EventResult)) {
log_warning("Ignoring worker message with invalid size %zi bytes", size);
continue;
}
@ -1103,6 +1197,11 @@ static int on_worker(sd_event_source *s, int fd, uint32_t revents, void *userdat
worker->state = WORKER_IDLE;
/* worker returned */
if (result == EVENT_RESULT_TRY_AGAIN &&
event_requeue(worker->event) < 0)
device_broadcast(manager->monitor, worker->event->dev);
/* When event_requeue() succeeds, worker->event is NULL, and event_free() handles NULL gracefully. */
event_free(worker->event);
}
@ -1365,8 +1464,10 @@ static int on_inotify(sd_event_source *s, int fd, uint32_t revents, void *userda
continue;
log_device_debug(dev, "Inotify event: %x for %s", e->mask, devnode);
if (e->mask & IN_CLOSE_WRITE)
if (e->mask & IN_CLOSE_WRITE) {
(void) event_queue_assume_block_device_unlocked(manager, dev);
(void) synthesize_change(dev);
}
/* Do not handle IN_IGNORED here. It should be handled by worker in 'remove' uevent;
* udev_event_execute_rules() -> event_execute_rules_on_remove() -> udev_watch_end(). */
@ -1439,13 +1540,8 @@ static int on_sigchld(sd_event_source *s, const struct signalfd_siginfo *si, voi
device_delete_db(worker->event->dev);
device_tag_index(worker->event->dev, NULL, false);
if (manager->monitor) {
/* Forward kernel event unchanged */
r = device_monitor_send_device(manager->monitor, NULL, worker->event->dev_kernel);
if (r < 0)
log_device_warning_errno(worker->event->dev_kernel, r,
"Failed to broadcast failed event to libudev listeners, ignoring: %m");
}
/* Forward kernel event to libudev listeners */
device_broadcast(manager->monitor, worker->event->dev);
}
worker_free(worker);
@ -1469,8 +1565,15 @@ static int on_post(sd_event_source *s, void *userdata) {
assert(manager);
if (!LIST_IS_EMPTY(manager->events))
if (!LIST_IS_EMPTY(manager->events)) {
/* Try to process pending events if idle workers exist. Why is this necessary?
* When a worker finished an event and became idle, even if there was a pending event,
* the corresponding device might have been locked and the processing of the event
* delayed for a while, preventing the worker from processing the event immediately.
* Now, the device may be unlocked. Let's try again! */
event_queue_start(manager);
return 1;
}
/* There are no queued events. Let's remove /run/udev/queue and clean up the idle processes. */

View File

@ -119,6 +119,7 @@ test_run() {
# Execute each currently defined function starting with "testcase_"
for testcase in "${TESTCASES[@]}"; do
_image_cleanup
echo "------ $testcase: BEGIN ------"
# Note for my future frustrated self: `fun && xxx` (as well as ||, if, while,
# until, etc.) _DISABLES_ the `set -e` behavior in _ALL_ nested function
@ -129,14 +130,8 @@ test_run() {
# So, be careful when adding clean up snippets in the testcase_*() functions -
# if the `test_run_one()` function isn't the last command, you have to propagate
# the exit code correctly (e.g. `test_run_one() || return $?`, see below).
# FIXME: temporary workaround for intermittent fails in certain tests
# See: https://github.com/systemd/systemd/issues/21819
for ((_i = 0; _i < 3; _i++)); do
_image_cleanup
ec=0
"$testcase" "$test_id" && break || ec=$?
done
ec=0
"$testcase" "$test_id" || ec=$?
case $ec in
0)
passed+=("$testcase")
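The `set -e` caveat from the note above is easy to demonstrate: a failure inside a function aborts the script when the function is called bare, but not when the call is part of an `&&`/`||` list. A standalone sketch, unrelated to the test harness itself:

```shell
#!/usr/bin/env bash

# Called bare, the `false` aborts the -e shell: neither echo runs.
bash -ec 'f() { false; echo survived; }; f; echo after' \
    || echo "aborted with rc=$?"

# As part of `&&`, `set -e` is suppressed inside f: the `false` is
# ignored, f succeeds, and both lines are printed.
bash -ec 'f() { false; echo survived; }; f && echo chained'
```

The first invocation prints "aborted with rc=1"; the second prints "survived" and "chained" — exactly why cleanup snippets after `"$testcase" && …` keep running past failures.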

View File

@ -141,6 +141,7 @@ BASICTOOLS=(
echo
env
false
flock
getconf
getent
getfacl

View File

@ -148,6 +148,42 @@ helper_wait_for_pvscan() {
return 1
}
# Generate an `flock` command line for a device list
#
# This is useful mainly for mkfs.btrfs, which doesn't hold the lock on each
# device for the entire duration of mkfs.btrfs, causing weird races between udev
# and mkfs.btrfs. This function creates an array of chained flock calls to take
# the lock of all involved devices, which can be then used in combination with
# mkfs.btrfs to mitigate the issue.
#
# For example, calling:
# helper_generate_flock_cmdline my_array /dev/loop1 /dev/loop2 /dev/loop3
#
# will result in "${my_array[@]}" containing:
# flock -x /dev/loop1 flock -x /dev/loop2 flock -x /dev/loop3
#
# Note: the array will be CLEARED before the first assignment
#
# Arguments:
# $1 - NAME of an array in which the commands/arguments will be stored
# $2-$n - path to devices
helper_generate_flock_cmdline() {
# Create a name reference to the array passed as the first argument
# (requires bash 4.3+)
local -n cmd_array="${1:?}"
shift
if [[ $# -eq 0 ]]; then
echo >&2 "Missing argument(s): device path(s)"
return 1
fi
cmd_array=()
for dev in "$@"; do
cmd_array+=("flock" "-x" "$dev")
done
}
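The helper can be exercised standalone: the nameref fills the caller's array with one `flock -x <dev>` triple per device, so expanding the array in front of `mkfs.btrfs` runs it with every lock held (helper body copied from above; the `/dev/loop*` paths are placeholders):

```shell
#!/usr/bin/env bash
set -eu

helper_generate_flock_cmdline() {
    # Create a name reference to the array passed as the first argument
    # (requires bash 4.3+)
    local -n cmd_array="${1:?}"
    shift

    if [[ $# -eq 0 ]]; then
        echo >&2 "Missing argument(s): device path(s)"
        return 1
    fi

    cmd_array=()
    for dev in "$@"; do
        cmd_array+=("flock" "-x" "$dev")
    done
}

declare -a flock_cmd=()
helper_generate_flock_cmdline flock_cmd /dev/loop1 /dev/loop2

# The expansion chains the flock calls in front of the final command:
echo "${flock_cmd[@]} mkfs.btrfs ..."
# flock -x /dev/loop1 flock -x /dev/loop2 mkfs.btrfs ...
```

Because each `flock` invocation execs the rest of the line as its command, the locks nest and are all held while the final command runs.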
testcase_megasas2_basic() {
lsblk -S
[[ "$(lsblk --scsi --noheadings | wc -l)" -ge 128 ]]
@ -388,6 +424,7 @@ testcase_lvm_basic() {
testcase_btrfs_basic() {
local dev_stub i label mpoint uuid
local flock_cmd=()
local devices=(
/dev/disk/by-id/ata-foobar_deadbeefbtrfs{0..3}
)
@ -397,7 +434,8 @@ testcase_btrfs_basic() {
echo "Single device: default settings"
uuid="deadbeef-dead-dead-beef-000000000000"
label="btrfs_root"
mkfs.btrfs -L "$label" -U "$uuid" "${devices[0]}"
helper_generate_flock_cmdline flock_cmd "${devices[0]}"
"${flock_cmd[@]}" mkfs.btrfs -L "$label" -U "$uuid" "${devices[0]}"
udevadm settle
btrfs filesystem show
test -e "/dev/disk/by-uuid/$uuid"
@ -416,7 +454,9 @@ name="diskpart3", size=85M
name="diskpart4", size=85M
EOF
udevadm settle
mkfs.btrfs -d single -m raid1 -L "$label" -U "$uuid" /dev/disk/by-partlabel/diskpart{1..4}
# We need to flock only the device itself, not its partitions
helper_generate_flock_cmdline flock_cmd "${devices[0]}"
"${flock_cmd[@]}" mkfs.btrfs -d single -m raid1 -L "$label" -U "$uuid" /dev/disk/by-partlabel/diskpart{1..4}
udevadm settle
btrfs filesystem show
test -e "/dev/disk/by-uuid/$uuid"
@ -427,7 +467,8 @@ EOF
echo "Multiple devices: using disks, data: raid10, metadata: raid10, mixed mode"
uuid="deadbeef-dead-dead-beef-000000000002"
label="btrfs_mdisk"
mkfs.btrfs -M -d raid10 -m raid10 -L "$label" -U "$uuid" "${devices[@]}"
helper_generate_flock_cmdline flock_cmd "${devices[@]}"
"${flock_cmd[@]}" mkfs.btrfs -M -d raid10 -m raid10 -L "$label" -U "$uuid" "${devices[@]}"
udevadm settle
btrfs filesystem show
test -e "/dev/disk/by-uuid/$uuid"
@ -464,7 +505,8 @@ EOF
# Check if we have all necessary DM devices
ls -l /dev/mapper/encbtrfs{0..3}
# Create a multi-device btrfs filesystem on the LUKS devices
mkfs.btrfs -M -d raid1 -m raid1 -L "$label" -U "$uuid" /dev/mapper/encbtrfs{0..3}
helper_generate_flock_cmdline flock_cmd /dev/mapper/encbtrfs{0..3}
"${flock_cmd[@]}" mkfs.btrfs -M -d raid1 -m raid1 -L "$label" -U "$uuid" /dev/mapper/encbtrfs{0..3}
udevadm settle
btrfs filesystem show
test -e "/dev/disk/by-uuid/$uuid"