The background of my question is a set of test cases for lxkns, my Go package for discovering Linux kernel namespaces, where I create a new child user namespace as well as a new child PID namespace inside a test container. I then need to remount /proc; otherwise I would see the wrong process information and could not look up the correct process-related information, such as the namespaces of the test process inside the new child user+PID namespaces (without resorting to guerrilla tactics).
The test harness/test setup is essentially the following, and it fails without --privileged (to cut through to the real meat, I'm simplifying by granting all capabilities and switching off seccomp and AppArmor):
docker run -it --rm --name closedboxx --cap-add ALL --security-opt seccomp=unconfined --security-opt apparmor=unconfined busybox unshare -Umpfr mount -t proc /proc proc
mount: permission denied (are you root?)
Of course, the path of least resistance, as well as of least beauty, is to use --privileged, which gets the job done, and this is only a throw-away test container anyway (maybe there is beauty in the sheer lack of it).
Recently, I became aware of Docker's --security-opt systempaths=unconfined, which (afaik) translates into an empty readonlyPaths in the resulting OCI/runc container spec. The following docker run command succeeds as needed; it just returns silently in this example, so the mount was carried out correctly:
docker run -it --rm --name closedboxx --cap-add ALL --security-opt seccomp=unconfined --security-opt apparmor=unconfined --security-opt systempaths=unconfined busybox unshare -Umpfr mount -t proc /proc proc
In the failing setup, running without --privileged and without --security-opt systempaths=unconfined, the mounts inside the child user and PID namespaces inside the container look as follows:
docker run -it --rm --name closedboxx --cap-add ALL --security-opt seccomp=unconfined --security-opt apparmor=unconfined busybox unshare -Umpfr cat /proc/1/mountinfo
693 678 0:46 / / rw,relatime - overlay overlay rw,lowerdir=/var/lib/docker/overlay2/l/AOY3ZSL2FQEO77CCDBKDOPEK7M:/var/lib/docker/overlay2/l/VNX7PING7ZLTIPXRDFSBMIOKKU,upperdir=/var/lib/docker/overlay2/60e8ad10362e49b621d2f3d603845ee24bda62d6d77de96a37ea0001c8454546/diff,workdir=/var/lib/docker/overlay2/60e8ad10362e49b621d2f3d603845ee24bda62d6d77de96a37ea0001c8454546/work,xino=off
694 693 0:50 / /proc rw,nosuid,nodev,noexec,relatime - proc proc rw
695 694 0:50 /bus /proc/bus ro,relatime - proc proc rw
696 694 0:50 /fs /proc/fs ro,relatime - proc proc rw
697 694 0:50 /irq /proc/irq ro,relatime - proc proc rw
698 694 0:50 /sys /proc/sys ro,relatime - proc proc rw
699 694 0:50 /sysrq-trigger /proc/sysrq-trigger ro,relatime - proc proc rw
700 694 0:51 /null /proc/kcore rw,nosuid - tmpfs tmpfs rw,size=65536k,mode=755
701 694 0:51 /null /proc/keys rw,nosuid - tmpfs tmpfs rw,size=65536k,mode=755
702 694 0:51 /null /proc/latency_stats rw,nosuid - tmpfs tmpfs rw,size=65536k,mode=755
703 694 0:51 /null /proc/timer_list rw,nosuid - tmpfs tmpfs rw,size=65536k,mode=755
704 694 0:51 /null /proc/sched_debug rw,nosuid - tmpfs tmpfs rw,size=65536k,mode=755
705 694 0:56 / /proc/scsi ro,relatime - tmpfs tmpfs ro
706 693 0:51 / /dev rw,nosuid - tmpfs tmpfs rw,size=65536k,mode=755
707 706 0:52 / /dev/pts rw,nosuid,noexec,relatime - devpts devpts rw,gid=5,mode=620,ptmxmode=666
708 706 0:49 / /dev/mqueue rw,nosuid,nodev,noexec,relatime - mqueue mqueue rw
709 706 0:55 / /dev/shm rw,nosuid,nodev,noexec,relatime - tmpfs shm rw,size=65536k
710 706 0:52 /0 /dev/console rw,nosuid,noexec,relatime - devpts devpts rw,gid=5,mode=620,ptmxmode=666
711 693 0:53 / /sys ro,nosuid,nodev,noexec,relatime - sysfs sysfs ro
712 711 0:54 / /sys/fs/cgroup rw,nosuid,nodev,noexec,relatime - tmpfs tmpfs rw,mode=755
713 712 0:28 /docker/eebfacfdc6e0e34c4e62d9f162bdd7c04b232ba2d1f5327eaf7e00011d0235c0 /sys/fs/cgroup/systemd ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,xattr,name=systemd
714 712 0:31 /docker/eebfacfdc6e0e34c4e62d9f162bdd7c04b232ba2d1f5327eaf7e00011d0235c0 /sys/fs/cgroup/cpuset ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,cpuset
715 712 0:32 /docker/eebfacfdc6e0e34c4e62d9f162bdd7c04b232ba2d1f5327eaf7e00011d0235c0 /sys/fs/cgroup/net_cls,net_prio ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,net_cls,net_prio
716 712 0:33 /docker/eebfacfdc6e0e34c4e62d9f162bdd7c04b232ba2d1f5327eaf7e00011d0235c0 /sys/fs/cgroup/memory ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,memory
717 712 0:34 /docker/eebfacfdc6e0e34c4e62d9f162bdd7c04b232ba2d1f5327eaf7e00011d0235c0 /sys/fs/cgroup/perf_event ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,perf_event
718 712 0:35 /docker/eebfacfdc6e0e34c4e62d9f162bdd7c04b232ba2d1f5327eaf7e00011d0235c0 /sys/fs/cgroup/devices ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,devices
719 712 0:36 /docker/eebfacfdc6e0e34c4e62d9f162bdd7c04b232ba2d1f5327eaf7e00011d0235c0 /sys/fs/cgroup/blkio ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,blkio
720 712 0:37 /docker/eebfacfdc6e0e34c4e62d9f162bdd7c04b232ba2d1f5327eaf7e00011d0235c0 /sys/fs/cgroup/pids ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,pids
721 712 0:38 / /sys/fs/cgroup/rdma ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,rdma
722 712 0:39 /docker/eebfacfdc6e0e34c4e62d9f162bdd7c04b232ba2d1f5327eaf7e00011d0235c0 /sys/fs/cgroup/freezer ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,freezer
723 712 0:40 /docker/eebfacfdc6e0e34c4e62d9f162bdd7c04b232ba2d1f5327eaf7e00011d0235c0 /sys/fs/cgroup/cpu,cpuacct ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,cpu,cpuacct
724 711 0:57 / /sys/firmware ro,relatime - tmpfs tmpfs ro
725 693 8:2 /var/lib/docker/containers/eebfacfdc6e0e34c4e62d9f162bdd7c04b232ba2d1f5327eaf7e00011d0235c0/resolv.conf /etc/resolv.conf rw,relatime - ext4 /dev/sda2 rw,stripe=256
944 693 8:2 /var/lib/docker/containers/eebfacfdc6e0e34c4e62d9f162bdd7c04b232ba2d1f5327eaf7e00011d0235c0/hostname /etc/hostname rw,relatime - ext4 /dev/sda2 rw,stripe=256
1352 693 8:2 /var/lib/docker/containers/eebfacfdc6e0e34c4e62d9f162bdd7c04b232ba2d1f5327eaf7e00011d0235c0/hosts /etc/hosts rw,relatime - ext4 /dev/sda2 rw,stripe=256
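Since the interesting entries in the listing above are the mounts stacked on top of /proc, a small Go sketch for picking them out of a mountinfo dump may help when comparing setups. It assumes the field layout documented in proc(5): mount ID, parent ID, major:minor, root, mount point, mount options, optional fields, a "-" separator, then filesystem type, source and super options. The names mount, parseMountinfoLine and procSubmounts are hypothetical, and octal escapes in paths are not decoded.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// mount holds the few mountinfo fields of interest here.
type mount struct {
	Root, MountPoint, Options, FSType string
}

// parseMountinfoLine extracts root, mount point, per-mount options and
// filesystem type from a single mountinfo line. It reports false for lines
// that do not have the expected shape.
func parseMountinfoLine(line string) (mount, bool) {
	fields := strings.Fields(line)
	sep := -1
	// Optional fields start at index 6 and end at the "-" separator.
	for i := 6; i < len(fields); i++ {
		if fields[i] == "-" {
			sep = i
			break
		}
	}
	if sep < 0 || sep+1 >= len(fields) {
		return mount{}, false
	}
	return mount{
		Root:       fields[3],
		MountPoint: fields[4],
		Options:    fields[5],
		FSType:     fields[sep+1],
	}, true
}

// procSubmounts returns all mounts located below /proc, that is, everything
// mounted on top of parts of the procfs instance.
func procSubmounts(mountinfo string) []mount {
	var result []mount
	for _, line := range strings.Split(mountinfo, "\n") {
		m, ok := parseMountinfoLine(line)
		if ok && strings.HasPrefix(m.MountPoint, "/proc/") {
			result = append(result, m)
		}
	}
	return result
}

func main() {
	data, err := os.ReadFile("/proc/self/mountinfo")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read mountinfo:", err)
		os.Exit(1)
	}
	for _, m := range procSubmounts(string(data)) {
		fmt.Printf("%-24s %-8s root=%-16s %s\n", m.MountPoint, m.FSType, m.Root, m.Options)
	}
}
```

Applied to the listing above, this would separate the read-only proc bind mounts (such as /proc/bus) from the tmpfs-backed entries (such as /proc/kcore and /proc/scsi).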
- What mechanism exactly is blocking the fresh mount of procfs on /proc?
- What is preventing me from unmounting /proc/kcore, etc.?
question from:
https://stackoverflow.com/questions/65917162/how-does-oci-runc-system-path-constraining-work-to-prevent-remounting-such-paths