Mounting Lustre Inside Running Container Not Working (Have Added All Capabilities)


We are trying to mount a Lustre filesystem inside a running container, and have done so successfully in containers running in privileged mode.

However, in containers running in non-privileged mode, mounting Lustre fails, even when every capability Linux provides (several dozen of them) has been added.

So:

  1. What is the difference between "privileged: true" and adding all capabilities with "cap_add"?
  2. Why does mounting Lustre still fail when all capabilities have been added to the container?

Mount error: (screenshot not preserved)

Non-Privileged Mode Container:

version: "3"
services:
  aiart:
    cap_add:
      - AUDIT_CONTROL
      - AUDIT_READ
      - AUDIT_WRITE
      - BLOCK_SUSPEND
      - CHOWN
      - DAC_OVERRIDE
      - DAC_READ_SEARCH
      - FOWNER
      - FSETID
      - IPC_LOCK
      - IPC_OWNER
      - KILL
      - LEASE
      - LINUX_IMMUTABLE
      - MAC_ADMIN
      - MAC_OVERRIDE
      - MKNOD
      - NET_ADMIN
      - NET_BIND_SERVICE
      - NET_BROADCAST
      - NET_RAW
      - SETGID
      - SETFCAP
      - SETPCAP
      - SETUID
      - SYS_ADMIN
      - SYS_BOOT
      - SYS_CHROOT
      - SYS_MODULE
      - SYS_NICE
      - SYS_PACCT
      - SYS_PTRACE
      - SYS_RAWIO
      - SYS_RESOURCE
      - SYS_TIME
      - SYS_TTY_CONFIG
      - SYSLOG
      - WAKE_ALARM
    image: test_lustre:1.1
    #privileged: true
    ports:
      - "12345:12345"
    volumes:
      - /home/wallace/test-lustre/docker/lustre-client:/lustre/lustre-client

There are 2 answers

Answer by Niklas (best answer)

The difference between --privileged and adding all capabilities is that --privileged also removes the device restrictions enforced by the cgroup device controller and disables the additional security enhancements, giving the container access to all host devices. A privileged container effectively becomes part of the host operating system: the AppArmor and SELinux confinement that would normally apply, such as SELinux labels, is not enforced.

When the --privileged flag is used, no extra security is enforced for the container, and the kernel filesystems are not mounted read-only inside it. Seccomp filtering is disabled as well. Still, a container cannot gain more power than its current namespace allows, for example when it runs under a rootless daemon.

Capabilities are a way to adjust the power of root, but some security enhancements are still applied when the container runs.

A good blog post by Red Hat covers this in more detail.

As pointed out in the other answer, AppArmor is probably the issue in this case, and running the container with the --security-opt apparmor:unconfined flag might make the mount possible. However, that should only be used temporarily.
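To see which of these layers is still active inside a given container, one option is to inspect the process's own entries under /proc. The following is a minimal sketch, not part of the original answer: the helper names (parse_status, has_capability, summarize) are made up for illustration, and it assumes a Linux kernel that exposes the CapEff and Seccomp fields in /proc/self/status and the AppArmor label in /proc/self/attr/current.

```python
from pathlib import Path

# Bit index of CAP_SYS_ADMIN as defined in linux/capability.h.
CAP_SYS_ADMIN = 21

# Values of the "Seccomp" field in /proc/<pid>/status.
SECCOMP_MODES = {"0": "disabled", "1": "strict", "2": "filter"}


def parse_status(text):
    """Turn the 'Key:<tab>value' lines of /proc/<pid>/status into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def has_capability(cap_eff_hex, bit):
    """Return True if the given bit is set in a hex capability mask."""
    return bool((int(cap_eff_hex, 16) >> bit) & 1)


def summarize(status_text, apparmor_label):
    """Report the three layers discussed above for one process."""
    fields = parse_status(status_text)
    return {
        "cap_sys_admin": has_capability(fields.get("CapEff", "0"), CAP_SYS_ADMIN),
        "seccomp": SECCOMP_MODES.get(fields.get("Seccomp", ""), "unknown"),
        "apparmor": apparmor_label or "unavailable",
    }


def read_text_or_empty(path):
    """Read a /proc file, returning '' when it is missing or unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return ""


if __name__ == "__main__":
    report = summarize(
        read_text_or_empty("/proc/self/status"),
        read_text_or_empty("/proc/self/attr/current"),
    )
    for name, value in report.items():
        print(f"{name}: {value}")
```

Run inside the container, a result showing cap_sys_admin as True together with an enforcing AppArmor label or a seccomp filter would be consistent with the explanation above: the capability is present, but another layer can still block the mount.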

Answer by zsolt

Have you tried apparmor:unconfined?

version: "3"
services:
  aiart:
    cap_add:
    - SYS_ADMIN
    image: test_lustre:1.1
    security_opt:
    - apparmor:unconfined
    ports:
    - "12345:12345"
    volumes:
      - /home/wallace/test-lustre/docker/lustre-client:/lustre/lustre-client

If this works, try writing a custom AppArmor profile that fits your needs, since running unconfined is presumably less secure: https://docs.docker.com/engine/security/apparmor/