Compare commits

5 Commits

Author SHA1 Message Date
Lennart Poettering 437f48a471 tree-wide: fix how we set $TZ
According to tzset(3), timezone names that refer to timezone files should be
prefixed with ":". Let's do so, to avoid any ambiguity and follow the documented behaviour.
2019-11-13 12:30:22 +01:00
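The prefixing described in this commit message can be sketched in plain C. This is a minimal illustration, not the systemd code: the helper name `set_tz_from_zone_name` and the fixed 256-byte buffer are assumptions made for the example, whereas the diff itself uses systemd's `strjoina()` macro.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper (not part of systemd): set $TZ to ":" + name, so the
 * C library treats the value as a reference to a timezone file instead of
 * trying to parse it as a POSIX TZ rule string.
 * Returns 0 on success, -1 on failure. */
static int set_tz_from_zone_name(const char *name) {
        char buf[256];
        int n;

        assert(name);

        if (name[0] == '\0')
                return -1;      /* an empty zone name is rejected in this sketch */

        n = snprintf(buf, sizeof buf, ":%s", name);
        if (n < 0 || (size_t) n >= sizeof buf)
                return -1;      /* name does not fit the fixed buffer */

        if (setenv("TZ", buf, 1) != 0)
                return -1;

        tzset();                /* make the C library re-read $TZ */
        return 0;
}
```

Calling `set_tz_from_zone_name("Europe/Berlin")` leaves `TZ=:Europe/Berlin` in the environment, which matches the form the updated test in the diff sets explicitly.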
Zbigniew Jędrzejewski-Szmek d5fc5b2f8d nspawn: do not emit any warning when $UNIFIED_CGROUP_HIERARCHY is used
Initially I thought this is a good idea, but when reviewing a different PR
(https://github.com/systemd/systemd/pull/13862#discussion_r340604313) I changed
my mind about this. At some point we probably should start warning about the
old option name, and yet later remove it. But it'll make it easier for people
to transition to the new option name if there's a period of support for both
names without any fuss. There's nothing particularly wrong about the old name,
and there is no support cost.

Fixes #13919 (by avoiding the issue completely).
2019-11-13 12:21:18 +01:00
Zbigniew Jędrzejewski-Szmek 7b631898ef
Merge pull request #13961 from mwilck/udev-no-exit-timeout
udevd: wait for workers to finish when exiting
2019-11-13 08:56:49 +01:00
Martin Wilck 7b6596d748 udevd: fix crash when workers time out after exit signal is caught
If udevd receives an exit signal, it releases its reference on the udev
monitor in manager_exit(). If at this time a worker is hanging, and if
the event timeout for this worker expires before udevd exits, udevd
crashes in on_sigchld()->udev_monitor_send_device(), because the monitor
has already been freed.

Fix this by releasing the main process's monitor ref later, in
manager_free().
2019-11-12 16:43:42 +01:00
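The lifetime problem described above can be illustrated with a toy reference-counted object. This is a sketch under stated assumptions: `Monitor`, `monitor_new()`, `monitor_unref()` and `on_worker_done()` are hypothetical stand-ins, not the sd-device or udevd API. Only the idea is taken from the commit: the reference is dropped in `manager_free()` rather than in `manager_exit()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy reference-counted object standing in for the device monitor. */
typedef struct Monitor {
        unsigned n_ref;
} Monitor;

static Monitor *monitor_new(void) {
        Monitor *m = calloc(1, sizeof *m);
        if (m)
                m->n_ref = 1;
        return m;
}

/* Drops one reference and frees the object when the count reaches zero.
 * Always returns NULL, so the caller can overwrite its pointer in one step. */
static Monitor *monitor_unref(Monitor *m) {
        if (m && --m->n_ref == 0)
                free(m);
        return NULL;
}

typedef struct Manager {
        Monitor *monitor;
        bool exit;
} Manager;

/* Exit was requested. The monitor reference is deliberately kept, because
 * a late worker event may still need to send through it. */
static void manager_exit(Manager *mgr) {
        assert(mgr);
        mgr->exit = true;
}

/* Stand-in for the SIGCHLD path: reports whether the monitor is still usable. */
static bool on_worker_done(const Manager *mgr) {
        assert(mgr);
        return mgr->monitor != NULL;
}

/* Final teardown: only here is the monitor reference released. */
static void manager_free(Manager *mgr) {
        if (!mgr)
                return;
        mgr->monitor = monitor_unref(mgr->monitor);
        free(mgr);
}
```

With this ordering, a worker that finishes between `manager_exit()` and `manager_free()` still finds a valid monitor pointer.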
Martin Wilck bfde9421af udevd: wait for workers to finish when exiting
On some systems with lots of devices, device probing for certain drivers can
take a very long time. If systemd-udevd detects a timeout and kills the worker
running modprobe using SIGKILL, some devices will not be probed, or end up in
an unusable state. The --event-timeout option can be used to modify the maximum
time spent in a uevent handler. But if systemd-udevd exits, it uses a
different timeout, hard-coded to 30s, and exits when this timeout expires,
causing all workers to be KILLed by systemd afterwards. In practice, this may
lead to workers being killed after significantly less time than specified with
the event-timeout. This is particularly significant during initrd processing:
systemd-udevd will be stopped by systemd when initrd-switch-root.target is
about to be isolated, which usually happens quickly after finding and mounting
the root FS.

If systemd-udevd is started by PID 1 (i.e. basically always), systemd will
kill both udevd and the workers after expiry of TimeoutStopSec. This is
actually better than the built-in udevd timeout, because it's more transparent
and configurable for users. This way users can avoid the mentioned boot problem
by simply increasing TimeoutStopSec= in systemd-udevd.service.

If udevd is not started by systemd (standalone), this is still an
improvement. udevd will kill hanging workers when the event timeout is
reached, which is configurable via the udev.event_timeout= kernel
command line parameter. Before this patch, udevd would simply exit with
workers still running, which would then become orphaned processes.

With the timeout removed, the sd_event_now() assertion in manager_exit() can be
dropped.
2019-11-12 12:20:20 +01:00
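The general idea of waiting for workers instead of giving up after a fixed delay can be sketched with plain fork() and waitpid(). This swaps in a simpler technique than the one udevd uses: udevd reaps workers from an sd-event SIGCHLD handler, while this sketch blocks in a loop. The function names `spawn_workers` and `wait_for_all_workers` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork up to n children that exit immediately.
 * Returns the number of children actually started. */
static int spawn_workers(int n) {
        int started = 0;

        for (int i = 0; i < n; i++) {
                pid_t pid = fork();
                if (pid < 0)
                        break;          /* fork failed, stop spawning */
                if (pid == 0)
                        _exit(0);       /* child: pretend the work is done */
                started++;
        }
        return started;
}

/* Block until every child has been reaped; there is no fixed exit timeout.
 * Returns the number of children reaped, or -1 on an unexpected error. */
static int wait_for_all_workers(void) {
        int reaped = 0;

        for (;;) {
                pid_t pid = waitpid(-1, NULL, 0);
                if (pid > 0) {
                        reaped++;
                        continue;
                }
                if (errno == EINTR)
                        continue;       /* interrupted by a signal, retry */
                if (errno == ECHILD)
                        return reaped;  /* no children left */
                return -1;
        }
}
```

In this sketch, bounding the wait is left to whoever supervises the process. That mirrors the commit message's argument that TimeoutStopSec= in the unit file is a more transparent place for the limit than a hard-coded value.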
9 changed files with 37 additions and 40 deletions

11
NEWS
View File

@@ -2,6 +2,17 @@ systemd System and Service Manager
 CHANGES WITH 244 in spe:
+* systemd-udevd: removed the 30s timeout for killing stale workers on
+  exit. systemd-udevd now waits for workers to finish. The hard-coded
+  exit timeout of 30s was too short for some large installations, where
+  driver initialization could be prematurely interrupted during initrd
+  processing if the root file system had been mounted and init was
+  preparing to switch root. If udevd is run without systemd and workers
+  are hanging while udevd receives an exit signal, udevd will now exit
+  when udev.event_timeout is reached for the last hanging worker. With
+  systemd, the exit timeout can additionally be configured using
+  TimeoutStopSec= in systemd-udevd.service.
 * Support for the cpuset cgroups v2 controller has been added.
   Processes may be restricted to specific CPUs using the new
   AllowedCPUs= setting, and to specific memory NUMA nodes using the new

View File

@@ -136,9 +136,8 @@
 evaluated relative to the UNIX time epoch 1st Jan, 1970,
 00:00.</para>
-<para>Examples for valid timestamps and their normalized form
-(assuming the current time was 2012-11-23 18:15:22 and the timezone
-was UTC+8, for example TZ=Asia/Shanghai):</para>
+<para>Examples for valid timestamps and their normalized form (assuming the current time was 2012-11-23
+18:15:22 and the timezone was UTC+8, for example <literal>TZ=:Asia/Shanghai</literal>):</para>
 <programlisting> Fri 2012-11-23 11:12:13 → Fri 2012-11-23 11:12:13
 2012-11-23 11:12:13 → Fri 2012-11-23 11:12:13

View File

@@ -832,8 +832,12 @@ int parse_timestamp(const char *t, usec_t *usec) {
 }
 if (r == 0) {
         bool with_tz = true;
+        char *colon_tz;
-        if (setenv("TZ", tz, 1) != 0) {
+        /* tzset(3) says $TZ should be prefixed with ":" if we reference timezone files */
+        colon_tz = strjoina(":", tz);
+        if (setenv("TZ", colon_tz, 1) != 0) {
                 shared->return_value = negative_errno();
                 _exit(EXIT_FAILURE);
         }

View File

@@ -437,14 +437,9 @@ static int detect_unified_cgroup_hierarchy_from_environment(void) {
 e = getenv(var);
 if (!e) {
-        static bool warned = false;
+        /* $UNIFIED_CGROUP_HIERARCHY has been renamed to $SYSTEMD_NSPAWN_UNIFIED_HIERARCHY. */
         var = "UNIFIED_CGROUP_HIERARCHY";
         e = getenv(var);
-        if (e && !warned) {
-                log_info("$UNIFIED_CGROUP_HIERARCHY has been renamed to $SYSTEMD_NSPAWN_UNIFIED_HIERARCHY.");
-                warned = true;
-        }
 }
 if (!isempty(e)) {

View File

@@ -1352,7 +1352,12 @@ int calendar_spec_next_usec(const CalendarSpec *spec, usec_t usec, usec_t *ret_n
         return r;
 }
 if (r == 0) {
-        if (setenv("TZ", spec->timezone, 1) != 0) {
+        char *colon_tz;
+        /* tzset(3) says $TZ should be prefixed with ":" if we reference timezone files */
+        colon_tz = strjoina(":", spec->timezone);
+        if (setenv("TZ", colon_tz, 1) != 0) {
                 shared->return_value = negative_errno();
                 _exit(EXIT_FAILURE);
         }

View File

@@ -43,9 +43,12 @@ static void test_next(const char *input, const char *new_tz, usec_t after, usec_
 if (old_tz)
         old_tz = strdupa(old_tz);
-if (new_tz)
-        assert_se(setenv("TZ", new_tz, 1) >= 0);
-else
+if (new_tz) {
+        char *colon_tz;
+        colon_tz = strjoina(":", new_tz);
+        assert_se(setenv("TZ", colon_tz, 1) >= 0);
+} else
         assert_se(unsetenv("TZ") >= 0);
 tzset();

View File

@@ -475,7 +475,7 @@ static void test_in_utc_timezone(void) {
 assert_se(timezone == 0);
 assert_se(daylight == 0);
-assert_se(setenv("TZ", "Europe/Berlin", 1) >= 0);
+assert_se(setenv("TZ", ":Europe/Berlin", 1) >= 0);
 assert_se(!in_utc_timezone());
 assert_se(streq(tzname[0], "CET"));
 assert_se(streq(tzname[1], "CEST"));

View File

@@ -46,7 +46,7 @@ typedef struct StatusInfo {
 } StatusInfo;
 static void print_status_info(const StatusInfo *i) {
-        const char *old_tz = NULL, *tz;
+        const char *old_tz = NULL, *tz, *tz_colon;
         bool have_time = false;
         char a[LINE_MAX];
         struct tm tm;
@@ -62,7 +62,8 @@ static void print_status_info(const StatusInfo *i) {
         old_tz = strdupa(tz);
         /* Set the new $TZ */
-        if (setenv("TZ", isempty(i->timezone) ? "UTC" : i->timezone, true) < 0)
+        tz_colon = strjoina(":", isempty(i->timezone) ? "UTC" : i->timezone);
+        if (setenv("TZ", tz_colon, true) < 0)
                 log_warning_errno(errno, "Failed to set TZ environment variable, ignoring: %m");
         else
                 tzset();

View File

@@ -293,6 +293,8 @@ static void manager_free(Manager *manager) {
         if (!manager)
                 return;
+        manager->monitor = sd_device_monitor_unref(manager->monitor);
         udev_builtin_exit();
         if (manager->pid == getpid_cached())
@@ -774,21 +776,7 @@ set_delaying_seqnum:
         return true;
 }
-static int on_exit_timeout(sd_event_source *s, uint64_t usec, void *userdata) {
-        Manager *manager = userdata;
-        assert(manager);
-        log_error("Giving up waiting for workers to finish.");
-        sd_event_exit(manager->event, -ETIMEDOUT);
-        return 1;
-}
 static void manager_exit(Manager *manager) {
-        uint64_t usec;
-        int r;
         assert(manager);
         manager->exit = true;
@@ -803,18 +791,9 @@ static void manager_exit(Manager *manager) {
         manager->inotify_event = sd_event_source_unref(manager->inotify_event);
         manager->fd_inotify = safe_close(manager->fd_inotify);
-        manager->monitor = sd_device_monitor_unref(manager->monitor);
         /* discard queued events and kill workers */
         event_queue_cleanup(manager, EVENT_QUEUED);
         manager_kill_workers(manager);
-        assert_se(sd_event_now(manager->event, CLOCK_MONOTONIC, &usec) >= 0);
-        r = sd_event_add_time(manager->event, NULL, CLOCK_MONOTONIC,
-                              usec + 30 * USEC_PER_SEC, USEC_PER_SEC, on_exit_timeout, manager);
-        if (r < 0)
-                return;
 }
 /* reload requested, HUP signal received, rules changed, builtin changed */