Compare commits

10 Commits

Author SHA1 Message Date
Yu Watanabe 6536b13cb8
Merge ff7ff2d008 into 5261c521e3 2024-11-08 13:30:15 +00:00
Yu Watanabe 5261c521e3 mount-util: make path_get_mount_info() work with arbitrary inode
Follow-up for d49d95df0a.
Replaces 9a032ec55a.
Fixes #35075.
2024-11-08 13:25:17 +01:00
Franck Bui 514d9e1665 test: install integration-test-setup.sh in testdata/
integration-test-setup.sh is an auxiliary script that tests rely on at
runtime. As such, install the script in testdata/.

Follow-up for af153e36ae.
2024-11-08 12:37:40 +01:00
Lennart Poettering b480a4c15e update TODO 2024-11-08 10:10:11 +01:00
Lennart Poettering af3baf174a fs-util: add comment about XO_NOCOW 2024-11-08 09:21:25 +01:00
Ryan Wilson d8091e1281 Fix PrivatePIDs=yes integration test for kernels with no /proc/scsi 2024-11-08 13:38:35 +09:00
Yu Watanabe ff7ff2d008 test: add test case for mDNS transaction 2024-10-07 15:26:37 +09:00
Yu Watanabe 5cc7af539c resolve: also log sender port on receive 2024-10-07 15:26:37 +09:00
Yu Watanabe f0cabbe292 resolve/mdns: source port of mDNS replies must be 5353
RFC 6762 section 6:
The source UDP port in all Multicast DNS responses MUST be 5353 (the well-known port
assigned to mDNS). Multicast DNS implementations MUST silently ignore any Multicast DNS
responses they receive where the source UDP port is not 5353.

Prompted by #33806.
2024-10-07 15:26:37 +09:00
Yu Watanabe 3093ace2ff Revert "systemd.dnssd does not handle local requests (#32991)"
This reverts commit a2ae7ed7d0.

The commit causes issue #33806.
Reopening issue #32990.
Fixes #33806.
2024-10-07 15:26:37 +09:00
11 changed files with 235 additions and 83 deletions

25
TODO
@ -129,6 +129,10 @@ Deprecations and removals:
Features:
* format-table: introduce new cell type for strings with ansi sequences in
them. display them in regular output mode (via strip_tab_ansi()), but
suppress them in json mode.
* machined: when registering a machine, also take a relative cgroup path,
relative to the machine's unit. This is useful when registering unpriv
machines, as they might sit down the cgroup tree, below a cgroup delegation
@ -217,12 +221,8 @@ Features:
services where mount propagation from the root fs is off, and still have
confext/sysext propagated in.
* support F_DUPFD_QUERY for comparing fds in same_fd (requires kernel 6.10)
* generic interface for varlink for setting log level and stuff that all our daemons can implement
* use pty ioctl to get peer wherever possible (TIOCGPTPEER)
* maybe teach repart.d/ dropins a new setting MakeMountNodes= or so, which is
just like MakeDirectories=, but uses an access mode of 0000 and sets the +i
chattr bit. This is useful as protection against early uses of /var/ or /tmp/
@ -253,8 +253,6 @@ Features:
* initrd: when transitioning from initrd to host, validate that
/lib/modules/`uname -r` exists, refuse otherwise
* tmpfiles: add "owning" flag for lines that limits effect of --purge
* signed bpf loading: to address need for signature verification for bpf
programs when they are loaded, and given the bpf folks don't think this is
realistic in kernel space, maybe add small daemon that facilitates this
@ -458,9 +456,6 @@ Features:
* introduce mntid_t, and make it 64bit, as apparently the kernel switched to
64bit mount ids
* use udev rule networkd ownership property to take ownership of network
interfaces nspawn creates
* mountfsd/nsresourced
- userdb: maybe allow callers to map one uid to their own uid
- bpflsm: allow writes if resulting UID on disk would be userns' owner UID
@ -647,6 +642,7 @@ Features:
- openpt_allocate_in_namespace()
- unit_attach_pid_to_cgroup_via_bus()
- cg_attach() requires new kernel feature
- journald's process cache
* ddi must be listed as block device fstype
@ -1470,9 +1466,6 @@ Features:
* in sd-id128: also parse UUIDs in RFC4122 URN syntax (i.e. chop off urn:uuid: prefix)
* DynamicUser= + StateDirectory= → use uid mapping mounts, too, in order to
make dirs appear under right UID.
* systemd-sysext: optionally, run it in initrd already, before transitioning
into host, to open up possibility for services shipped like that.
@ -1644,14 +1637,6 @@ Features:
* maybe add kernel cmdline params: to force random seed crediting
* introduce a new per-process uuid, similar to the boot id, the machine id, the
invocation id, that is derived from process creds, specifically a hashed
combination of AT_RANDOM + getpid() + the starttime from
/proc/self/status. Then add these ids implicitly when logging. Deriving this
uuid from these three things has the benefit that it can be derived easily
from /proc/$PID/ in a stable, and unique way that changes on both fork() and
exec().
* let's not GC a unit while its ratelimits are still pending
* when killing due to service watchdog timeout maybe detect whether target

@ -1131,6 +1131,8 @@ int xopenat_full(int dir_fd, const char *path, int open_flags, XOpenFlags xopen_
* If O_CREAT is used with XO_LABEL, any created file will be immediately relabelled.
*
* If the path is specified NULL or empty, behaves like fd_reopen().
*
* If XO_NOCOW is specified, turns on the NOCOW btrfs flag on the file, if available.
*/
if (isempty(path)) {

@ -963,9 +963,9 @@ int manager_recv(Manager *m, int fd, DnsProtocol protocol, DnsPacket **ret) {
p->ifindex = manager_find_ifindex(m, p->family, &p->destination);
}
log_debug("Received %s UDP packet of size %zu, ifindex=%i, ttl=%u, fragsize=%zu, sender=%s, destination=%s",
log_debug("Received %s UDP packet of size %zu, ifindex=%i, ttl=%u, fragsize=%zu, sender=%s, sender_port=%u, destination=%s",
dns_protocol_to_string(protocol), p->size, p->ifindex, p->ttl, p->fragsize,
IN_ADDR_TO_STRING(p->family, &p->sender),
IN_ADDR_TO_STRING(p->family, &p->sender), p->sender_port,
IN_ADDR_TO_STRING(p->family, &p->destination));
*ret = TAKE_PTR(p);

@ -385,10 +385,7 @@ static int on_mdns_packet(sd_event_source *s, int fd, uint32_t revents, void *us
if (r <= 0)
return r;
/* Refuse traffic from the local host, to avoid query loops. However, allow legacy mDNS
* unicast queries through anyway (we never send those ourselves, hence no risk).
* i.e. check for the source port nr. */
if (p->sender_port == MDNS_PORT && manager_packet_from_local_address(m, p))
if (manager_packet_from_local_address(m, p))
return 0;
scope = manager_find_scope(m, p);
@ -400,6 +397,15 @@ static int on_mdns_packet(sd_event_source *s, int fd, uint32_t revents, void *us
if (dns_packet_validate_reply(p) > 0) {
DnsResourceRecord *rr;
/* RFC 6762 section 6:
* The source UDP port in all Multicast DNS responses MUST be 5353 (the well-known port
* assigned to mDNS). Multicast DNS implementations MUST silently ignore any Multicast DNS
* responses they receive where the source UDP port is not 5353. */
if (p->sender_port != MDNS_PORT) {
log_debug("Received mDNS reply packet from port %u (not %i), ignoring.", p->sender_port, MDNS_PORT);
return 0;
}
log_debug("Got mDNS reply packet");
/*
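
The RFC 6762 section 6 rule enforced by the hunk above can be sketched in isolation. This is a minimal illustration under stated assumptions, not resolved's code: `struct packet_info` and `mdns_reply_acceptable()` are hypothetical names, and only `MDNS_PORT` (5353) and the `sender_port` field mirror the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MDNS_PORT 5353u

/* Hypothetical, simplified stand-in for packet metadata (not systemd's DnsPacket). */
struct packet_info {
        uint16_t sender_port;   /* UDP source port, host byte order */
        bool is_reply;          /* true if the DNS header marks this as a response */
};

/* Returns true if the packet may be processed further, false if it must be
 * silently ignored. Only responses are constrained: queries from other ports
 * (legacy unicast queries) are not rejected by this check. */
static bool mdns_reply_acceptable(const struct packet_info *p) {
        assert(p);

        if (!p->is_reply)
                return true;

        return p->sender_port == MDNS_PORT;
}
```

In this sketch a response from port 5353 is accepted, a response from any other port is dropped, and a query from an ephemeral port passes through to whatever handling queries normally get.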

@ -1808,40 +1808,50 @@ char* umount_and_unlink_and_free(char *p) {
return mfree(p);
}
static int path_get_mount_info(
static int path_get_mount_info_at(
int dir_fd,
const char *path,
char **ret_fstype,
char **ret_options) {
_cleanup_(mnt_free_tablep) struct libmnt_table *table = NULL;
_cleanup_free_ char *fstype = NULL, *options = NULL;
struct libmnt_fs *fs;
int r;
_cleanup_(mnt_free_iterp) struct libmnt_iter *iter = NULL;
int r, mnt_id;
assert(path);
assert(dir_fd >= 0 || dir_fd == AT_FDCWD);
table = mnt_new_table();
if (!table)
return -ENOMEM;
r = mnt_table_parse_mtab(table, /* filename = */ NULL);
r = path_get_mnt_id_at(dir_fd, path, &mnt_id);
if (r < 0)
return r;
return log_debug_errno(r, "Failed to get mount ID: %m");
fs = mnt_table_find_mountpoint(table, path, MNT_ITER_FORWARD);
if (!fs)
return -EINVAL;
r = libmount_parse("/proc/self/mountinfo", NULL, &table, &iter);
if (r < 0)
return log_debug_errno(r, "Failed to parse /proc/self/mountinfo: %m");
for (;;) {
struct libmnt_fs *fs;
r = mnt_table_next_fs(table, iter, &fs);
if (r == 1)
break; /* EOF */
if (r < 0)
return log_debug_errno(r, "Failed to get next entry from /proc/self/mountinfo: %m");
if (mnt_fs_get_id(fs) != mnt_id)
continue;
_cleanup_free_ char *fstype = NULL, *options = NULL;
if (ret_fstype) {
fstype = strdup(strempty(mnt_fs_get_fstype(fs)));
if (!fstype)
return -ENOMEM;
return log_oom_debug();
}
if (ret_options) {
options = strdup(strempty(mnt_fs_get_options(fs)));
if (!options)
return -ENOMEM;
return log_oom_debug();
}
if (ret_fstype)
@ -1852,19 +1862,27 @@ static int path_get_mount_info(
return 0;
}
int path_is_network_fs_harder(const char *path) {
return log_debug_errno(SYNTHETIC_ERRNO(ESTALE), "Cannot find mount ID %i from /proc/self/mountinfo.", mnt_id);
}
int path_is_network_fs_harder_at(int dir_fd, const char *path) {
_cleanup_close_ int fd = -EBADF;
int r;
assert(dir_fd >= 0 || dir_fd == AT_FDCWD);
fd = xopenat(dir_fd, path, O_PATH | O_CLOEXEC | O_NOFOLLOW);
if (fd < 0)
return fd;
r = fd_is_network_fs(fd);
if (r != 0)
return r;
_cleanup_free_ char *fstype = NULL, *options = NULL;
int r, ret;
assert(path);
ret = path_is_network_fs(path);
if (ret > 0)
return true;
r = path_get_mount_info(path, &fstype, &options);
r = path_get_mount_info_at(fd, /* path = */ NULL, &fstype, &options);
if (r < 0)
return RET_GATHER(ret, r);
return r;
if (fstype_is_network(fstype))
return true;

@ -181,4 +181,7 @@ int mount_credentials_fs(const char *path, size_t size, bool ro);
int make_fsmount(int error_log_level, const char *what, const char *type, unsigned long flags, const char *options, int userns_fd);
int path_is_network_fs_harder(const char *path);
int path_is_network_fs_harder_at(int dir_fd, const char *path);
static inline int path_is_network_fs_harder(const char *path) {
return path_is_network_fs_harder_at(AT_FDCWD, path);
}
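
The pattern declared above, an `_at` variant taking a directory fd plus an inline wrapper that passes `AT_FDCWD`, can be sketched with plain libc calls. This is an illustration only: `sketch_is_network_fs_at()` is a hypothetical name, `openat()` and `fstatfs()` stand in for `xopenat()` and `fd_is_network_fs()`, and the three magic numbers are an assumed, incomplete list rather than systemd's actual table.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/vfs.h>
#include <unistd.h>

/* Assumed magic numbers for illustration; the real list is longer. */
#define SKETCH_NFS_MAGIC  UINT32_C(0x6969)
#define SKETCH_SMB_MAGIC  UINT32_C(0x517B)
#define SKETCH_CIFS_MAGIC UINT32_C(0xFF534D42)

/* Returns 1 if path (relative to dir_fd) is on one of the listed network
 * filesystems, 0 if not, or a negative errno value on failure. */
static int sketch_is_network_fs_at(int dir_fd, const char *path) {
        struct statfs s;
        int fd, r;

        assert(dir_fd >= 0 || dir_fd == AT_FDCWD);
        assert(path);

        /* O_PATH pins the inode without requiring read permission. */
        fd = openat(dir_fd, path, O_PATH | O_CLOEXEC | O_NOFOLLOW);
        if (fd < 0)
                return -errno;

        r = fstatfs(fd, &s) < 0 ? -errno : 0;
        close(fd);
        if (r < 0)
                return r;

        switch ((uint32_t) s.f_type) {
        case SKETCH_NFS_MAGIC:
        case SKETCH_SMB_MAGIC:
        case SKETCH_CIFS_MAGIC:
                return 1;
        default:
                return 0;
        }
}

/* Path-only convenience wrapper, mirroring the inline wrapper in the header. */
static inline int sketch_is_network_fs(const char *path) {
        return sketch_is_network_fs_at(AT_FDCWD, path);
}
```

In the diff, a statfs-style check of this kind runs first, and the mount-table lookup by mount ID is only consulted when it reports "not a network filesystem".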

@ -538,9 +538,53 @@ TEST(bind_mount_submounts) {
}
TEST(path_is_network_fs_harder) {
ASSERT_OK_ZERO(path_is_network_fs_harder("/dev"));
ASSERT_OK_ZERO(path_is_network_fs_harder("/sys"));
ASSERT_OK_ZERO(path_is_network_fs_harder("/run"));
_cleanup_close_ int dir_fd = -EBADF;
int r;
ASSERT_OK(dir_fd = open("/", O_PATH | O_CLOEXEC));
FOREACH_STRING(s,
"/", "/dev/", "/proc/", "/run/", "/sys/", "/tmp/", "/usr/", "/var/tmp/",
"", ".", "../../../", "/this/path/should/not/exist/for/test-mount-util/") {
r = path_is_network_fs_harder(s);
log_debug("path_is_network_fs_harder(%s) → %i: %s", s, r, r < 0 ? STRERROR(r) : yes_no(r));
const char *q = path_startswith(s, "/") ?: s;
r = path_is_network_fs_harder_at(dir_fd, q);
log_debug("path_is_network_fs_harder_at(root, %s) → %i: %s", q, r, r < 0 ? STRERROR(r) : yes_no(r));
}
if (geteuid() != 0 || have_effective_cap(CAP_SYS_ADMIN) <= 0) {
(void) log_tests_skipped("not running privileged");
return;
}
_cleanup_(rm_rf_physical_and_freep) char *t = NULL;
assert_se(mkdtemp_malloc("/tmp/test-mount-util.path_is_network_fs_harder.XXXXXXX", &t) >= 0);
r = safe_fork("(make_mount-point)",
FORK_RESET_SIGNALS |
FORK_CLOSE_ALL_FDS |
FORK_DEATHSIG_SIGTERM |
FORK_WAIT |
FORK_REOPEN_LOG |
FORK_LOG |
FORK_NEW_MOUNTNS |
FORK_MOUNTNS_SLAVE,
NULL);
ASSERT_OK(r);
if (r == 0) {
ASSERT_OK(mount_nofollow_verbose(LOG_INFO, "tmpfs", t, "tmpfs", 0, NULL));
ASSERT_OK_ZERO(path_is_network_fs_harder(t));
ASSERT_OK_ERRNO(umount(t));
ASSERT_OK(mount_nofollow_verbose(LOG_INFO, "tmpfs", t, "tmpfs", 0, "x-systemd-growfs,x-systemd-automount"));
ASSERT_OK_ZERO(path_is_network_fs_harder(t));
ASSERT_OK_ERRNO(umount(t));
_exit(EXIT_SUCCESS);
}
}
DEFINE_TEST_MAIN(LOG_DEBUG);

@ -142,11 +142,13 @@ endif
############################################################
if install_tests
foreach script : ['integration-test-setup.sh', 'run-unit-tests.py']
install_data(script,
install_data('run-unit-tests.py',
install_mode : 'rwxr-xr-x',
install_dir : testsdir)
endforeach
install_data('integration-test-setup.sh',
install_mode : 'rwxr-xr-x',
install_dir : testdata_dir)
endif
############################################################

@ -7,9 +7,9 @@ Before=getty-pre.target
[Service]
ExecStartPre=rm -f /failed /testok
ExecStartPre=/usr/lib/systemd/tests/integration-test-setup.sh setup
ExecStartPre=/usr/lib/systemd/tests/testdata/integration-test-setup.sh setup
ExecStart=@command@
ExecStopPost=/usr/lib/systemd/tests/integration-test-setup.sh finalize
ExecStopPost=/usr/lib/systemd/tests/testdata/integration-test-setup.sh finalize
Type=oneshot
MemoryAccounting=@memory-accounting@
StateDirectory=%N

@ -132,10 +132,12 @@ testcase_unpriv() {
return 0
fi
# The kernel has a restriction for unprivileged user namespaces where they cannot mount a less restrictive
# instance of /proc/. So if /proc/ is masked (e.g. /proc/kmsg is over-mounted with tmpfs as systemd-nspawn does),
# then mounting a new /proc/ will fail and we will still see the host's /proc/. Thus, to allow tests to run in
# a VM or nspawn, we mount a new proc on a temporary directory with no masking to bypass this kernel restriction.
# IMPORTANT: For /proc/ to be remounted in pid namespace within an unprivileged user namespace, there needs to
# be at least 1 unmasked procfs mount in ANY directory. Otherwise, if /proc/ is masked (e.g. /proc/scsi is
# over-mounted with tmpfs), then mounting a new /proc/ will fail.
#
# Thus, to guarantee PrivatePIDs=yes tests for unprivileged users pass, we mount a new procfs on a temporary
# directory with no masking. This will guarantee an unprivileged user can mount a new /proc/ successfully.
mkdir -p /tmp/TEST-07-PID1-private-pids-proc
mount -t proc proc /tmp/TEST-07-PID1-private-pids-proc
@ -146,7 +148,16 @@ testcase_unpriv() {
umount /tmp/TEST-07-PID1-private-pids-proc
rm -rf /tmp/TEST-07-PID1-private-pids-proc
# Now verify the behavior with masking - units should fail as PrivatePIDs=yes has no graceful fallback.
# Now we will mask /proc/ by mounting tmpfs over /proc/scsi. This will guarantee that mounting /proc/ will fail
# for unprivileged users when using PrivatePIDs=yes. Now units should fail as PrivatePIDs=yes has no graceful
# fallback.
#
# Note some kernels do not have /proc/scsi so we verify the directory exists prior to running the test.
if [ ! -d /proc/scsi ]; then
echo "/proc/scsi does not exist, skipping unprivileged PrivatePIDs=yes test with masked /proc/"
return 0
fi
if [[ "$HAS_EXISTING_SCSI_MOUNT" == "no" ]]; then
mount -t tmpfs tmpfs /proc/scsi
fi

@ -996,6 +996,87 @@ testcase_12_resolvectl2() {
restart_resolved
}
testcase_mdns() {
# For issue #32990 and #33806
# Cleanup
# shellcheck disable=SC2317
cleanup() {
rm -f /run/systemd/resolved.conf.d/enable-mdns.conf
rm -rf /run/systemd/dnssd
ip link del veth99 || :
ip netns del ns99 || :
}
trap cleanup RETURN
mkdir -p /run/systemd/resolved.conf.d
cat >/run/systemd/resolved.conf.d/enable-mdns.conf <<EOF
[Resolve]
MulticastDNS=yes
EOF
mkdir -p /run/systemd/dnssd
cat >/run/systemd/dnssd/ssh.dnssd <<EOF
[Service]
Name=%H
Type=_ssh._tcp
Port=22
TxtText=hogehogehoge
Priority=42
Weight=13
EOF
ip netns add ns99
ip link add veth99 type veth peer name veth-peer
ip link set veth-peer netns ns99
ip link set veth99 up
ip netns exec ns99 ip link set veth-peer up
ip link set veth99 multicast on
ip address add 192.168.0.12/24 dev veth99
ip netns exec ns99 ip address add 192.168.0.10/24 dev veth-peer
assert_in '192.168.0.12/24' "$(ip address show dev veth99)"
assert_in '192.168.0.10/24' "$(ip netns exec ns99 ip address show dev veth-peer)"
# make sure networkd is not running.
systemctl stop systemd-networkd.socket
systemctl stop systemd-networkd.service
# restart resolved and enable mdns on interface veth99
restart_resolved
resolvectl mdns veth99 yes
resolvectl domain veth99 local
assert_in 'Global: yes' "$(resolvectl mdns)"
assert_in 'yes' "$(resolvectl mdns veth99)"
assert_in 'local' "$(resolvectl domain veth99)"
run ip netns exec ns99 dig -p 5353 "ns1.local" @192.168.0.12
grep -qE "ns1\.local\.\s+[0-9]+\s+IN\s+A\s+192\.168\.0\.12" "$RUN_OUT"
run ip netns exec ns99 dig -p 5353 -t SRV "ns1._ssh._tcp.local" @192.168.0.12
grep -qE "ns1\._ssh\._tcp\.local\.\s+[0-9]+\s+IN\s+SRV\s+42\s+13\s+22\s+ns1\.local\." "$RUN_OUT"
run ip netns exec ns99 dig -p 5353 -t TXT "ns1._ssh._tcp.local" @192.168.0.12
grep -qE "ns1\._ssh\._tcp\.local\.\s+[0-9]+\s+IN\s+TXT\s+\"hogehogehoge\"" "$RUN_OUT"
run resolvectl query "ns1.local" || :
grep -qE "ns1.local: " "$RUN_OUT"
grep -qE ".*192\.168\.0\.12\s+-- link: veth99" "$RUN_OUT"
run resolvectl query -t SRV "ns1._ssh._tcp.local" || :
grep -qE "ns1\._ssh\._tcp\.local IN SRV 42 13 22 ns1\.local\s+-- link: veth99" "$RUN_OUT"
run resolvectl query -t TXT "ns1._ssh._tcp.local" || :
grep -qE "ns1\._ssh\._tcp\.local IN TXT \"hogehogehoge\"\s+-- link: veth99" "$RUN_OUT"
run resolvectl service "ns1._ssh._tcp.local" || :
grep -qE "ns1\._ssh\._tcp\.local: ns1\.local:22 \[priority=42, weight=13\]" "$RUN_OUT"
# refuse queries from a local address. See issue #32990 and the comment:
# https://github.com/systemd/systemd/pull/34141#discussion_r1736318656
(! dig -p 5353 "ns1.local" @192.168.0.12)
}
# PRE-SETUP
systemctl unmask systemd-resolved.service
systemctl enable --now systemd-resolved.service