Docker Security Hardening: Rootless Mode, Seccomp, AppArmor on a VPS

14 min read · Matthieu · Tags: VPS, Containers, AppArmor, Seccomp, Rootless Docker, Security, Docker

Seven hardening layers for Docker on a VPS. Each section explains the threat, shows the fix with CLI and Compose syntax, and verifies it works.

Docker's default configuration trades security for convenience. Containers run as root on the host. All 14 Linux capabilities stay active. Seccomp blocks only 44 of 300+ syscalls. Inter-container traffic flows freely.

On a VPS, this matters more than on a local dev machine. You share a physical host with other tenants. A container escape means an attacker lands as root on the host, directly against the shared kernel. Every hardening layer you add reduces the blast radius.

This tutorial covers seven hardening measures. Each section explains the threat it prevents, shows the implementation (both docker run flags and Compose syntax), and includes a verification step. We tested every command on Ubuntu 24.04 running Docker Engine 29.x.

Prerequisites: A VPS running Debian 12 or Ubuntu 24.04 with Docker Engine installed. SSH access as a non-root sudo user. If you haven't locked down the host itself yet, start with our Linux VPS Security: Threats, Layers, and Hardening Guide first. For Docker firewall issues, see Fix Docker Bypassing UFW: 4 Tested Solutions for Your VPS.

This article is part of the Docker in Production on a VPS: What Breaks and How to Fix It series.

How do I set up rootless Docker on Ubuntu 24.04 or Debian 12?

Rootless Docker runs the daemon and all containers under a regular user account instead of root. If an attacker escapes a container, they land as an unprivileged user on the host. No root access. Of all the measures in this guide, this one has the largest impact.

Install rootless Docker

Install the required packages. The uidmap package provides newuidmap and newgidmap, which handle subordinate UID/GID mapping:

sudo apt-get update && sudo apt-get install -y uidmap docker-ce-rootless-extras

Verify your user has at least 65,536 subordinate UIDs and GIDs:

grep "^$(whoami):" /etc/subuid
grep "^$(whoami):" /etc/subgid

The output shows entries like deploy:100000:65536. If the entries are missing, add them:

sudo usermod --add-subuids 100000-165535 --add-subgids 100000-165535 $(whoami)

Stop the system-wide Docker daemon. You don't need it for rootless mode:

sudo systemctl disable --now docker.service docker.socket

Run the rootless setup tool as your regular user (not root):

dockerd-rootless-setuptool.sh install

The script prints environment variables you need to set. Add them to your shell profile:

echo 'export PATH=/usr/bin:$PATH' >> ~/.bashrc
echo 'export DOCKER_HOST=unix:///run/user/$(id -u)/docker.sock' >> ~/.bashrc
source ~/.bashrc

Enable linger so the rootless daemon starts at boot, not just when you log in:

sudo loginctl enable-linger $(whoami)

Verify rootless mode

docker context use rootless
docker run --rm hello-world

Check that the daemon process runs as your user, not root:

ps aux | grep dockerd

The output should show your username, not root. Also confirm that docker info reports rootless mode:

docker info --format '{{.SecurityOptions}}'

The output includes rootless in the list.

When rootless Docker breaks things

Rootless mode has real limitations. Knowing them prevents hours of debugging.

| Limitation | Cause | Workaround |
|---|---|---|
| Cannot bind to ports below 1024 | Non-root users cannot bind privileged ports | Set sysctl net.ipv4.ip_unprivileged_port_start=0 or use a reverse proxy on the host |
| Bind mount permission errors | Host files owned by root are inaccessible to the remapped UID | Change ownership to your user, or use named volumes |
| Slower overlay filesystem | On kernels older than 5.11, rootless falls back to fuse-overlayfs instead of native overlay2 | Accept the overhead (5-15% for I/O-heavy workloads), or run a 5.11+ kernel, where rootless Docker can use native overlay2 |
| No --net=host | Rootless networking uses slirp4netns or pasta, not the host network stack | Use port mapping (-p) instead. For better performance, install pasta as the network driver |
| ping fails inside containers | CAP_NET_RAW is restricted | Install slirp4netns >= 0.4.0 or use pasta |

Networking performance note: By default, rootless Docker uses slirp4netns for networking, which adds NAT overhead. The pasta driver copies the host network config into the container namespace without NAT and offers better throughput. On Debian 12 and Ubuntu 24.04, install it with:

sudo apt-get install -y passt

Docker picks up pasta automatically if it's installed.

When should I use user namespace remapping instead of rootless Docker?

User namespace remapping (userns-remap) maps UID 0 inside containers to an unprivileged UID on the host. Unlike rootless Docker, the daemon itself still runs as root. This means you keep full Docker functionality (privileged ports, host networking, native overlay2) while still preventing container-root-equals-host-root.

Choose userns-remap when rootless mode breaks your workload but you still want UID isolation. Choose rootless when you can live with its limitations.

| Feature | Rootless Docker | userns-remap | Standard Docker |
|---|---|---|---|
| Daemon runs as | User | Root | Root |
| Container root = host root | No | No | Yes |
| Privileged ports | Workaround needed | Works | Works |
| --net=host | No | Yes | Yes |
| Storage driver | fuse-overlayfs (overlay2 on kernel 5.11+) | overlay2 | overlay2 |
| Setup complexity | Medium | Low | None |

Configure userns-remap

Use the "default" shortcut, which tells Docker to create the dockremap user and configure it automatically:

sudo tee /etc/docker/daemon.json > /dev/null <<'EOF'
{
  "userns-remap": "default"
}
EOF

sudo systemctl restart docker

Docker creates the dockremap user and adds subordinate UID/GID ranges to /etc/subuid and /etc/subgid.

Verify it's working:

sudo ls -ld /var/lib/docker/

A new subdirectory appears, named after the remapped UID range, such as /var/lib/docker/100000.100000/. The exact number depends on the subordinate UID range assigned to the dockremap user in /etc/subuid.

Run a container and check the process UID on the host:

docker run -d --name test-userns nginx:alpine
ps aux | grep nginx

The nginx process should show a high UID (matching the first number from /etc/subuid for dockremap), not 0.
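The mapping is linear: with an entry like dockremap:100000:65536, container UID N appears on the host as 100000 + N. A quick sketch of the arithmetic (the map_uid helper is illustrative, not part of Docker):

```shell
#!/bin/sh
# Compute the host UID for a container UID, given a subordinate-UID
# entry in the user:start:count format used by /etc/subuid.
map_uid() {
  entry=$1 cuid=$2
  start=$(echo "$entry" | cut -d: -f2)
  count=$(echo "$entry" | cut -d: -f3)
  if [ "$cuid" -ge "$count" ]; then
    echo "unmapped"
    return 1
  fi
  echo $(( start + cuid ))
}

map_uid "dockremap:100000:65536" 0     # container root -> 100000
map_uid "dockremap:100000:65536" 101   # e.g. the nginx user in alpine -> 100101
```

UIDs past the end of the range (65536 and above here) have no host mapping at all, which is why processes trying to use them fail outright.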

Clean up:

docker rm -f test-userns

Volume ownership gotcha: Bind-mounted files owned by host root (UID 0) appear as nobody inside the container because UID 0 maps to a different range. Use named volumes or chown files to the remapped UID.

How do I drop Linux capabilities from a Docker container?

Docker grants 14 Linux capabilities to containers by default. Each capability is a kernel permission that an attacker can abuse after a container escape or within a compromised container. Dropping all capabilities and adding back only what your application needs shrinks the attack surface.

Default Docker capabilities

| Capability | What it allows | Keep or drop? |
|---|---|---|
| CAP_CHOWN | Change file ownership | Drop unless needed |
| CAP_DAC_OVERRIDE | Bypass file read/write permission checks | Drop unless needed |
| CAP_FOWNER | Bypass permission checks on file owner | Drop unless needed |
| CAP_FSETID | Set setuid/setgid bits | Drop |
| CAP_KILL | Send signals to any process | Drop unless needed |
| CAP_SETGID | Change process GID | Keep for most apps |
| CAP_SETUID | Change process UID | Keep for most apps |
| CAP_SETPCAP | Modify process capabilities | Drop |
| CAP_NET_BIND_SERVICE | Bind to ports below 1024 | Keep if binding port 80/443 |
| CAP_NET_RAW | Use raw sockets (craft packets) | Drop unless you need ping/traceroute |
| CAP_SYS_CHROOT | Use chroot | Drop |
| CAP_MKNOD | Create device files | Drop |
| CAP_AUDIT_WRITE | Write to kernel audit log | Drop unless needed |
| CAP_SETFCAP | Set file capabilities | Drop |

Drop all, add back what you need

CLI syntax:

docker run -d \
  --cap-drop ALL \
  --cap-add CHOWN \
  --cap-add NET_BIND_SERVICE \
  --cap-add SETUID \
  --cap-add SETGID \
  --name hardened-nginx \
  nginx:alpine

Nginx needs CHOWN because its entrypoint script changes ownership of cache directories on startup. Without it, the container exits immediately with a chown: Operation not permitted error.

Compose syntax:

services:
  web:
    image: nginx:alpine
    cap_drop:
      - ALL
    cap_add:
      - CHOWN
      - NET_BIND_SERVICE
      - SETUID
      - SETGID

Verify capabilities are dropped

docker exec hardened-nginx sh -c 'cat /proc/1/status | grep Cap'

Compare the CapEff (effective capabilities) bitmask. With all dropped and four added back, the value is much lower than the default 00000000a80425fb.

For a readable output, install capsh (Debian/Ubuntu package libcap2-bin) on the host and decode the hex:

docker exec hardened-nginx sh -c 'cat /proc/1/status | grep CapEff' | awk '{print $2}' | xargs -I{} capsh --decode=0x{}

The output lists only cap_chown,cap_setgid,cap_setuid,cap_net_bind_service.
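If capsh is unavailable, the mask is easy to decode by hand: bit N of CapEff corresponds to capability number N from capabilities(7). A minimal decoder sketch (the list is truncated to the first 11 capabilities for brevity; decode_caps is illustrative):

```shell
#!/bin/sh
# Decode a CapEff hex mask into capability names. Bit N set means
# capability number N is in the effective set. Only capabilities
# 0-10 are named here; extend the list for higher bits.
decode_caps() {
  mask=$((0x$1)) i=0
  for name in chown dac_override dac_read_search fowner fsetid kill \
              setgid setuid setpcap linux_immutable net_bind_service; do
    [ $(( (mask >> i) & 1 )) -eq 1 ] && printf 'cap_%s\n' "$name"
    i=$((i+1))
  done
}

# CHOWN(0) + SETGID(6) + SETUID(7) + NET_BIND_SERVICE(10) = 0x4c1
decode_caps 00000000000004c1
```

For the hardened nginx container above this prints cap_chown, cap_setgid, cap_setuid, and cap_net_bind_service, matching the capsh output.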

Clean up:

docker rm -f hardened-nginx

What does the no-new-privileges flag prevent?

The no-new-privileges flag blocks processes inside a container from gaining additional privileges through setuid or setgid binaries. Without this flag, a compromised process can execute a setuid binary (like su or sudo) and escalate to root. With the flag set, the kernel refuses the privilege escalation.
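The flag is a plain kernel feature (prctl PR_SET_NO_NEW_PRIVS), so you can observe it outside Docker too. A sketch using setpriv from util-linux, which should be preinstalled on Debian and Ubuntu:

```shell
# A normal shell has the flag off; this typically prints NoNewPrivs: 0
grep NoNewPrivs /proc/self/status

# setpriv sets PR_SET_NO_NEW_PRIVS before exec'ing the child, and the
# flag is inherited by every descendant, so the child reports 1:
setpriv --no-new-privs sh -c 'grep NoNewPrivs /proc/self/status'
```

Once set, the flag can never be cleared for that process or its children, which is exactly why it is safe to make it a daemon-wide default.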

Apply per container

CLI:

docker run -d --security-opt no-new-privileges:true --name no-priv-test nginx:alpine

Compose:

services:
  web:
    image: nginx:alpine
    security_opt:
      - no-new-privileges:true

Apply as daemon default

Add it to daemon.json so every container gets this flag automatically:

{
  "no-new-privileges": true
}

Restart Docker after editing:

sudo systemctl restart docker

Verify it works

docker exec no-priv-test grep NoNewPrivs /proc/1/status

Expected output:

NoNewPrivs:	1

A value of 1 means no new privileges can be gained. A value of 0 means the flag is not set.

Clean up:

docker rm -f no-priv-test

How do I create a custom seccomp profile for Docker?

Docker's default seccomp profile blocks about 44 syscalls out of 300+. A custom profile lets you restrict containers to only the syscalls your application actually uses. If an attacker compromises the container, they cannot exploit kernel vulnerabilities through blocked syscalls.

Discover which syscalls your application needs

Use strace on a running container to capture the syscalls it makes during normal operation:

docker run -d --security-opt seccomp=unconfined --name trace-target nginx:alpine

# Install strace on the host
sudo apt-get install -y strace

# Get the container's PID
PID=$(docker inspect --format '{{.State.Pid}}' trace-target)

# Trace syscalls for 30 seconds during normal operation
sudo strace -f -p "$PID" -o /tmp/nginx-syscalls.log -e trace=all &
STRACE_PID=$!
sleep 30

# Send some test traffic to exercise the application
curl -s http://localhost:80 > /dev/null 2>&1 || true

kill "$STRACE_PID" 2>/dev/null
wait "$STRACE_PID" 2>/dev/null

Extract the unique syscall names:

grep -oP '^\s*\d+\s+\K[a-z0-9_]+(?=\()' /tmp/nginx-syscalls.log | sort -u

This gives you the minimum set of syscalls your application needs.
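That list can be turned straight into a profile skeleton. A minimal sketch in shell (syscalls_to_profile is an illustrative helper; it emits the same structure as the hand-written profile in the next step):

```shell
#!/bin/sh
# Emit a seccomp profile skeleton that allowlists the syscall names
# read from stdin (one per line) and denies everything else with ERRNO.
syscalls_to_profile() {
  printf '{\n  "defaultAction": "SCMP_ACT_ERRNO",\n'
  printf '  "architectures": ["SCMP_ARCH_X86_64"],\n'
  printf '  "syscalls": [{\n    "names": ['
  first=1
  while read -r s; do
    [ -n "$s" ] || continue
    [ "$first" -eq 1 ] || printf ','
    printf '\n      "%s"' "$s"
    first=0
  done
  printf '\n    ],\n    "action": "SCMP_ACT_ALLOW"\n  }]\n}\n'
}

# Example: pipe the sort -u output from the previous step into it
printf 'accept4\nbind\nlisten\n' | syscalls_to_profile
```

Remember that the traced list covers only your application; you still need to merge in the runtime (runc) syscalls discussed below before the container will start.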

Build the custom profile

Create a JSON file. The defaultAction is SCMP_ACT_ERRNO (deny everything not explicitly allowed). The syscall list must include both what your application needs AND what the container runtime (runc) needs during initialization. The profile below was tested with nginx:alpine on Docker Engine 29.x:

{
  "defaultAction": "SCMP_ACT_ERRNO",
  "architectures": [
    "SCMP_ARCH_X86_64",
    "SCMP_ARCH_X86",
    "SCMP_ARCH_AARCH64"
  ],
  "syscalls": [
    {
      "names": [
        "accept4",
        "access",
        "arch_prctl",
        "bind",
        "brk",
        "capget",
        "capset",
        "chdir",
        "chown",
        "clone",
        "clone3",
        "close",
        "close_range",
        "connect",
        "copy_file_range",
        "dup",
        "dup2",
        "dup3",
        "epoll_create1",
        "epoll_ctl",
        "epoll_pwait",
        "epoll_pwait2",
        "epoll_wait",
        "eventfd2",
        "execve",
        "exit",
        "exit_group",
        "faccessat",
        "faccessat2",
        "fchmod",
        "fchmodat",
        "fchown",
        "fchownat",
        "fcntl",
        "fork",
        "fstat",
        "fstatfs",
        "futex",
        "getcwd",
        "getdents",
        "getdents64",
        "getegid",
        "geteuid",
        "getgid",
        "getpgrp",
        "getpid",
        "getppid",
        "getrandom",
        "getrlimit",
        "getsockname",
        "getsockopt",
        "gettid",
        "getuid",
        "io_destroy",
        "io_getevents",
        "io_setup",
        "io_submit",
        "ioctl",
        "kill",
        "listen",
        "lseek",
        "lstat",
        "madvise",
        "memfd_create",
        "mkdir",
        "mkdirat",
        "mmap",
        "mount",
        "mprotect",
        "mremap",
        "munmap",
        "nanosleep",
        "newfstatat",
        "open",
        "openat",
        "pipe",
        "pipe2",
        "pivot_root",
        "poll",
        "ppoll",
        "prctl",
        "pread64",
        "prlimit64",
        "pwrite64",
        "read",
        "readlink",
        "readlinkat",
        "recvfrom",
        "recvmsg",
        "rename",
        "renameat",
        "rseq",
        "rt_sigaction",
        "rt_sigprocmask",
        "rt_sigreturn",
        "sched_getaffinity",
        "sched_yield",
        "seccomp",
        "sendfile",
        "sendmsg",
        "sendto",
        "set_robust_list",
        "set_tid_address",
        "setgid",
        "setgroups",
        "sethostname",
        "setitimer",
        "setsockopt",
        "setuid",
        "sigaltstack",
        "socket",
        "socketpair",
        "stat",
        "statfs",
        "statx",
        "symlink",
        "symlinkat",
        "sysinfo",
        "tgkill",
        "umask",
        "umount2",
        "uname",
        "unlink",
        "unlinkat",
        "unshare",
        "utimensat",
        "wait4",
        "write",
        "writev"
      ],
      "action": "SCMP_ACT_ALLOW"
    }
  ]
}

Why so many syscalls? The list includes syscalls needed by three layers: the container runtime (runc) for namespace setup (clone3, mount, pivot_root, unshare, seccomp, statx), the Alpine shell and entrypoint scripts (fork, open, pipe, wait4), and nginx itself (accept4, bind, listen, io_setup). A glibc-based image would need fewer shell-related syscalls but more libc-internal ones.

Save this as /etc/docker/seccomp-nginx.json.

Apply the custom profile

CLI:

docker run -d \
  --security-opt seccomp=/etc/docker/seccomp-nginx.json \
  --name seccomp-test \
  nginx:alpine

Compose:

services:
  web:
    image: nginx:alpine
    security_opt:
      - seccomp=/etc/docker/seccomp-nginx.json

Verify the profile is active

docker inspect --format '{{.HostConfig.SecurityOpt}}' seccomp-test

The output shows the full seccomp profile JSON that was applied to the container. Docker reads the file at container creation time and embeds the profile contents.

Test that a restricted operation fails:

docker exec seccomp-test unshare --mount /bin/sh -c 'echo escaped'

This should fail with "Operation not permitted". The container lacks CAP_SYS_ADMIN (not granted by default), and the seccomp profile provides a second layer of defense by allowlisting only the syscalls needed for normal operation.

Clean up:

docker rm -f seccomp-test
rm -f /tmp/nginx-syscalls.log

Production tip: Start with Docker's default profile and remove syscalls rather than building from scratch. The strace approach gives you the tightest possible profile but needs thorough testing. Exercise every code path your application uses during the strace capture: startup, normal requests, error handling, graceful shutdown (docker stop), and log rotation.

Iterative approach: If building from strace feels risky, use this safer workflow:

  1. Copy Docker's default seccomp profile as your starting point.
  2. Run your container with this copied profile. It behaves identically to the default.
  3. Remove syscalls one group at a time (e.g., all key* syscalls, all swap* syscalls).
  4. Test the container after each removal. If it breaks, add the last removed syscall back.
  5. Repeat until you've trimmed everything your application doesn't need.

This is slower than the strace method but safer for production containers where missing a syscall could cause intermittent failures.

How do I write an AppArmor profile for a Docker container?

AppArmor restricts what files, network resources, and capabilities a container process can access. Docker applies a default docker-default profile automatically. A custom profile lets you further restrict containers to only the filesystem paths and network operations they need.

Check AppArmor is active

sudo aa-status

The output includes docker-default in the loaded profiles. If AppArmor is not installed:

sudo apt-get install -y apparmor apparmor-utils

Write a custom profile

Create /etc/apparmor.d/containers/docker-nginx:

#include <tunables/global>

profile docker-nginx flags=(attach_disconnected,mediate_deleted) {
  #include <abstractions/base>
  #include <abstractions/nameservice>

  # Capabilities needed by nginx
  capability chown,
  capability setuid,
  capability setgid,
  capability net_bind_service,
  capability dac_override,

  # Network access
  network inet tcp,
  network inet udp,
  network inet6 tcp,
  network inet6 udp,

  # Shell and entrypoint scripts
  /bin/** rix,
  /usr/bin/** rix,
  /usr/sbin/** rix,
  /lib/** mr,
  /usr/lib/** mr,
  /docker-entrypoint.sh rix,
  /docker-entrypoint.d/ r,
  /docker-entrypoint.d/** rix,
  /dev/null rw,
  /dev/stdout rw,
  /dev/stderr rw,

  # Nginx binary
  /usr/sbin/nginx ix,

  # Config files (read only)
  /etc/ r,
  /etc/nginx/ r,
  /etc/nginx/** r,

  # Web root (read only)
  /usr/share/nginx/html/** r,

  # Temp and cache directories
  /var/cache/nginx/ rw,
  /var/cache/nginx/** rw,
  /var/log/nginx/ rw,
  /var/log/nginx/** rw,
  /run/ rw,
  /run/** rw,
  /tmp/ rw,
  /tmp/** rw,

  # Proc filesystem (needed for nginx worker management)
  /proc/** r,

  # Deny sensitive files
  deny /etc/shadow r,
  deny /etc/passwd w,
  deny /proc/*/mem r,
  deny /sys/** w,
}

The profile needs capability declarations because AppArmor controls capability use independently from Docker's --cap-add flags. The entrypoint script paths (/docker-entrypoint.sh, /docker-entrypoint.d/) and shell binaries (/bin/**) must be explicitly allowed, or the container fails to start. The rix permission means read, inherit execution context, and allow execution.

Load and apply the profile

sudo apparmor_parser -r -W /etc/apparmor.d/containers/docker-nginx

Verify it loaded:

sudo aa-status | grep docker-nginx

Run a container with the custom profile:

CLI:

docker run -d \
  --security-opt apparmor=docker-nginx \
  --name apparmor-test \
  nginx:alpine

Compose:

services:
  web:
    image: nginx:alpine
    security_opt:
      - apparmor=docker-nginx

Verify AppArmor enforcement

docker exec apparmor-test cat /etc/shadow

This should return "Permission denied" because the profile explicitly denies reading /etc/shadow.

Check the container's AppArmor status:

docker inspect --format '{{.AppArmorProfile}}' apparmor-test

Expected output: docker-nginx.

Clean up:

docker rm -f apparmor-test

How do I run a Docker container with a read-only filesystem?

A read-only root filesystem prevents attackers from writing malware, backdoors, or modified binaries inside a compromised container. The container can still write to explicitly mounted tmpfs volumes for temporary files and runtime data.

CLI:

docker run -d \
  --read-only \
  --tmpfs /tmp:rw,noexec,nosuid,size=64m \
  --tmpfs /run:rw,noexec,nosuid,size=16m \
  --tmpfs /var/cache/nginx:rw,noexec,nosuid,size=128m \
  -p 8080:80 \
  --name readonly-nginx \
  nginx:alpine

Compose:

services:
  web:
    image: nginx:alpine
    read_only: true
    tmpfs:
      - /tmp:rw,noexec,nosuid,size=64m
      - /run:rw,noexec,nosuid,size=16m
      - /var/cache/nginx:rw,noexec,nosuid,size=128m

The noexec flag on tmpfs mounts prevents executing binaries from temp directories, a common technique attackers use after gaining write access.

Verify the filesystem is read-only

docker exec readonly-nginx touch /testfile

Expected output:

touch: /testfile: Read-only file system

Confirm tmpfs mounts work:

docker exec readonly-nginx touch /tmp/testfile && echo "tmpfs works"

Verify the container is serving traffic:

curl -s -o /dev/null -w '%{http_code}' http://localhost:8080

Expected: 200.

Clean up:

docker rm -f readonly-nginx

What should a hardened Docker daemon.json look like?

A hardened daemon.json applies security defaults to every container on the host. Individual containers can still override some settings, but the daemon config sets the baseline.

Create or edit /etc/docker/daemon.json:

{
  "no-new-privileges": true,
  "icc": false,
  "live-restore": true,
  "userland-proxy": false,
  "log-driver": "journald",
  "log-opts": {
    "tag": "{{.Name}}"
  },
  "default-ulimits": {
    "nproc": {
      "Name": "nproc",
      "Hard": 512,
      "Soft": 256
    },
    "nofile": {
      "Name": "nofile",
      "Hard": 65536,
      "Soft": 32768
    }
  },
  "storage-driver": "overlay2"
}

What each setting does:

  • no-new-privileges: Blocks setuid/setgid escalation in all containers by default.
  • icc: false: Disables inter-container communication on the default bridge network. Containers can only reach each other through explicitly published ports or user-defined networks. This limits lateral movement if one container is compromised.
  • live-restore: Keeps containers running during daemon restarts. Prevents downtime during Docker upgrades.
  • userland-proxy: false: Uses iptables for port mapping instead of a userland proxy process. Better performance, fewer open file descriptors.
  • log-driver: journald: Sends container logs to the system journal where they're centrally managed and rotated.
  • default-ulimits: Limits processes and open files per container. Prevents fork bombs and file descriptor exhaustion.
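The resulting limits are easy to sanity-check: prlimit from util-linux prints the limits of any process, inside or outside a container (sketch):

```shell
# Show the current shell's process-count and open-file limits.
# Inside a container started under the daemon.json above, the same
# command (or ulimit -u / ulimit -n) reports the configured values:
# soft/hard 256/512 for NPROC and 32768/65536 for NOFILE.
prlimit --pid $$ --nproc --nofile
```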

Apply the changes:

sudo systemctl restart docker

Verify the daemon picked up the config:

docker info --format '{{.SecurityOptions}}'

The output includes no-new-privileges in the security options.

Check that ICC is disabled:

docker network inspect bridge --format '{{index .Options "com.docker.network.bridge.enable_icc"}}'

Expected output: false.

Note on userns-remap: If you chose user namespace remapping over rootless mode, add "userns-remap": "default" to this config. Do not combine userns-remap with rootless Docker.

Version hiding: While you're editing daemon.json, also consider hiding Docker API version info. Docker doesn't expose version headers by default like Nginx does, but if you run the Docker API on a TCP socket (don't, unless you absolutely must), protect it with TLS client certificates. An exposed Docker API is equivalent to root shell access.

Auditing your setup: Run Docker Bench for Security to audit your configuration against the CIS Docker Benchmark. Clone the repository and run the script directly, as the container image ships with an outdated Docker client that is incompatible with Docker Engine 29.x:

git clone https://github.com/docker/docker-bench-security.git /tmp/docker-bench
cd /tmp/docker-bench && sudo bash docker-bench-security.sh

Review the output for WARN entries. The hardened daemon.json above addresses most of them. Items flagged as INFO are recommendations, not failures.

Combining multiple hardening layers in Docker Compose

A production Compose file should stack several hardening measures together. Here is an example for an Nginx container with all seven layers applied:

services:
  web:
    image: nginx:alpine
    read_only: true
    cap_drop:
      - ALL
    cap_add:
      - CHOWN
      - NET_BIND_SERVICE
      - SETUID
      - SETGID
    security_opt:
      - no-new-privileges:true
      - seccomp=/etc/docker/seccomp-nginx.json
      - apparmor=docker-nginx
    tmpfs:
      - /tmp:rw,noexec,nosuid,size=64m
      - /run:rw,noexec,nosuid,size=16m
      - /var/cache/nginx:rw,noexec,nosuid,size=128m
    ports:
      - "80:80"
    pids_limit: 100
    deploy:
      resources:
        limits:
          memory: 256M
          cpus: '1.0'

The pids_limit setting prevents fork bombs and deploy.resources.limits caps memory and CPU. These are not security-opt flags but they prevent denial-of-service from within the container. For more on resource limits, see Docker Compose Resource Limits, Healthchecks, and Restart Policies.

Verify all security options are active on the running container:

docker compose up -d
docker inspect "$(docker compose ps -q web)" --format '{{json .HostConfig.SecurityOpt}}' | python3 -m json.tool

The output lists no-new-privileges, your seccomp profile path, and the AppArmor profile name.

Production hardening checklist

| Measure | CLI flag | Compose key | daemon.json | Threat prevented |
|---|---|---|---|---|
| Rootless Docker | N/A (daemon-level) | N/A | N/A (separate daemon) | Container escape gives host root |
| userns-remap | N/A | N/A | "userns-remap": "default" | Container root = host root |
| Drop capabilities | --cap-drop ALL --cap-add X | cap_drop: [ALL] | N/A | Kernel attack surface |
| no-new-privileges | --security-opt no-new-privileges:true | security_opt: [no-new-privileges:true] | "no-new-privileges": true | Setuid/setgid escalation |
| Custom seccomp | --security-opt seccomp=profile.json | security_opt: [seccomp:path] | N/A | Kernel exploit via blocked syscalls |
| AppArmor profile | --security-opt apparmor=name | security_opt: [apparmor:name] | N/A | File/network access beyond app needs |
| Read-only rootfs | --read-only | read_only: true | N/A | Persistent malware, binary tampering |
| Disable ICC | N/A | N/A | "icc": false | Lateral movement between containers |
| Limit processes | --pids-limit 100 | pids_limit: 100 | "default-pids-limit": 100 | Fork bombs |
| Disable userland proxy | N/A | N/A | "userland-proxy": false | Resource waste, reduces attack surface |

Troubleshooting

Container fails to start after adding seccomp profile: Your profile is missing a syscall the application needs. Run with --security-opt seccomp=unconfined temporarily, strace the process, and add the missing syscalls to your profile.

Check kernel audit logs for blocked syscalls:

sudo journalctl -k | grep -i seccomp

AppArmor blocks legitimate operations: Set the profile to complain mode to log denials without enforcing them:

sudo aa-complain /etc/apparmor.d/containers/docker-nginx

Watch the logs:

sudo journalctl -k | grep apparmor | tail -20

After adding the required permissions, switch back to enforce mode:

sudo aa-enforce /etc/apparmor.d/containers/docker-nginx

Rootless Docker cannot pull images: Check that DOCKER_HOST is set correctly:

echo $DOCKER_HOST

It should point to /run/user/<UID>/docker.sock. Also verify the rootless daemon is running:

systemctl --user status docker

Bind mount permission denied with userns-remap: Files on the host owned by root (UID 0) are inaccessible because the container's UID 0 maps to a high host UID. Fix by changing ownership:

# Find the remapped UID
grep dockremap /etc/subuid
# Then chown to that UID
sudo chown -R <remapped-uid>:<remapped-uid> /path/to/bind/mount

Replace <remapped-uid> with the first number from /etc/subuid for the dockremap user (commonly 100000).
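The two-step lookup can be scripted. A sketch (remap_start is an illustrative helper, not a Docker command):

```shell
#!/bin/sh
# Print the first subordinate UID for a user from a subuid-format
# file (user:start:count, one entry per line).
remap_start() {
  awk -F: -v u="$1" '$1 == u { print $2; exit }' "$2"
}

uid=$(remap_start dockremap /etc/subuid 2>/dev/null)
echo "remapped container root UID: ${uid:-not configured}"
# Then fix the bind mount ownership, e.g.:
# sudo chown -R "$uid:$uid" /path/to/bind/mount
```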

Container logs: For any Docker-related issue, start with the container logs and the Docker daemon journal:

docker logs <container-name>
sudo journalctl -u docker -f

For rootless Docker, check the user-level journal:

journalctl --user -u docker -f