7.2. Docker

7.2.1. Virtualization Technology and Container Technology

7.2.1.1. Traditional Virtualization Technology

Traditional virtualization adds a hypervisor layer that emulates hardware such as network cards, memory, and CPUs, and then runs guest systems on top of it. The virtual machine is the unit of management: each virtual machine has its own operating system kernel and does not share the software resources of the host. This provides strong isolation and makes the approach suitable for multi-tenant scenarios in cloud computing environments.

7.2.1.2. Container Technology

Container technology can be regarded as a lightweight form of virtualization. It virtualizes at the operating system layer, so multiple isolated environments can run on the same host kernel. Compared with traditional application testing and deployment, deploying a container does not require checking the compatibility of the application’s operating environment in advance. Compared with traditional virtual machines, containers run on the host without an operating system kernel of their own, which gives higher operational efficiency and resource utilization.

7.2.2. Docker

Docker is one of the most representative container platforms at present, with advantages such as continuous deployment and testing, and cross-cloud platform support. In the container cloud environment based on container orchestration tools such as Kubernetes, through the scheduling of cross-host cluster resources, the container cloud can provide functions such as resource sharing and isolation, container orchestration and deployment, and application support.

7.2.2.1. Basic Concepts

Docker has three basic concepts: image, container, and repository. An image is a read-only template made up of a stack of file system layers that are combined using Union FS technology.

Images are static definitions, and containers are running instances created from images. The essence of a container is a process, which has its own independent namespace.

A repository is a central place for storing and distributing images.

Containers can be started, stopped, and deleted, and containers are isolated from one another. You can think of a container as a simple Linux environment (including root user permissions, process space, user space, network space, and so on) in which applications run.

7.2.2.2. Composition

The Docker engine consists of the following main components: Docker Client, Docker daemon, containerd, and RunC, which are jointly responsible for creating and running containers.

Docker Client is the client that communicates with the Docker daemon. It can reach the daemon over HTTP, a Unix socket, and other transports.

Docker Daemon is the daemon that manages containers. It runs on the host and acts as a server that accepts requests from clients. Its main functions include image management, image building, the REST API, authentication, security, core networking, and orchestration. The Docker daemon exposes the Docker remote API through a local IPC/Unix socket located at /var/run/docker.sock; the default non-TLS network port is 2375, and the default TLS port is 2376.
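As a minimal sketch of how a client can talk to the daemon over the Unix socket mentioned above, the following Python code sends a single HTTP request to the socket. The class name UnixHTTPConnection and the function ping_daemon are illustrative names chosen here, not part of Docker; the /_ping endpoint is part of the Docker Engine API. The call only succeeds on a host where the daemon is running and the caller may access the socket.

```python
import http.client
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that uses a Unix domain socket instead of TCP."""

    def __init__(self, socket_path: str, timeout: float = 5.0):
        # "localhost" only fills the Host header; no TCP connection is made.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self) -> None:
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def ping_daemon(socket_path: str = "/var/run/docker.sock") -> tuple[int, str]:
    """Send GET /_ping to the daemon and return (HTTP status, body)."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", "/_ping")
        response = conn.getresponse()
        return response.status, response.read().decode()
    finally:
        conn.close()
```

Because anyone who can send requests to this socket can control the daemon, access to it is effectively equivalent to root access on the host.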

containerd emerged after the standardization of container technology to strip the container runtime from the Docker Daemon. The main responsibilities of containerd are image management and container execution.

RunC is an implementation developed by Docker in accordance with the OCF (Open Container Format) standard, which is now maintained as the OCI runtime specification. It implements functions such as starting and stopping containers and resource isolation.

7.2.2.3. Data

Docker data is divided into persistent and non-persistent data. Non-persistent storage is created automatically by default and has the same life cycle as the container: deleting the container also deletes its non-persistent data. On Linux, non-persistent data is stored under the /var/lib/docker/ directory by default.

7.2.2.4. Network

The Docker network architecture is based on a design called the Container Network Model (CNM). It consists mainly of three parts: the CNM design itself, Libnetwork, which implements it, and the network drivers.

7.2.3. Security Risks and Security Mechanisms

The main points to consider when assessing Docker security are the following:

  • The security of the kernel itself and its support for namespaces and cgroups

  • The attack surface of the Docker daemon itself

  • The “hardened” security features of the kernel and how they interact with containers

7.2.3.1. Docker Security Baseline

benchsec

7.2.3.2. Kernel Namespaces

Docker containers are very similar to LXC containers and have similar security features. When you start a container with docker run, Docker creates a set of namespaces and control groups for the container in the background.

Namespaces provide one of the most straightforward forms of isolation: a process running in a container cannot see or affect a process running in another container or the host system.
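To make the idea concrete, here is a minimal Python sketch that reads the namespace identifiers the Linux kernel exposes under /proc/<pid>/ns and compares two processes. The function names are illustrative. Two processes are in the same namespace of a given kind exactly when the inode numbers match; a containerized process would normally differ from a host process for kinds such as pid, net, and mnt.

```python
import os
import re

_NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target: str) -> tuple[str, int]:
    """Parse a /proc/<pid>/ns symlink target such as 'pid:[4026531836]'."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"unexpected namespace link: {target!r}")
    return match.group("kind"), int(match.group("inode"))


def namespace_ids(pid: str = "self") -> dict[str, int]:
    """Return {entry name: inode} for a process, or {} if /proc is unavailable."""
    ns_dir = f"/proc/{pid}/ns"
    ids: dict[str, int] = {}
    try:
        entries = os.listdir(ns_dir)
    except OSError:
        return ids
    for name in entries:
        try:
            _, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        except (OSError, ValueError):
            continue  # unreadable entry, for example due to missing permissions
        ids[name] = inode
    return ids


def shared_namespaces(a: dict[str, int], b: dict[str, int]) -> set[str]:
    """Namespace kinds for which both processes hold the same inode."""
    return {kind for kind in a.keys() & b.keys() if a[kind] == b[kind]}
```

For example, shared_namespaces(namespace_ids("self"), namespace_ids("1")) lists the namespaces the current process shares with PID 1, provided the caller is allowed to read both entries.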

Each container also has its own network stack, which means that one container cannot gain privileged access to another container’s sockets or interfaces. Of course, containers can interact via their respective network interfaces if the host system is set up accordingly. If you specify a public port for the container or use a link, IP communication is allowed between containers.

They can ping each other, send/receive UDP packets, and establish TCP connections, but they can be restricted if desired. From a network architecture perspective, all containers on a given Docker host reside on bridge interfaces. This means they act like physical machines connected through a normal ethernet switch.

7.2.3.3. Control Group

Control groups are another key component of Linux containers, and their main role is to enforce resource accounting and constraints.

Cgroups provide many useful metrics, but also help ensure that each container gets a fair amount of memory, CPU, and disk I/O; more importantly, a single container cannot degrade the performance of the system by exhausting resources.

So, while Cgroups cannot prevent one container from accessing or affecting another container’s data and processes, they are critical to defending against some denial-of-service attacks. They are especially important for multi-tenant platforms, such as public and private PaaS, to guarantee consistent uptime (and performance) even when some applications start misbehaving.
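The constraints described above are usually set when a container is started. The following sketch builds a docker run command line with resource limits; the options --memory, --cpus, --pids-limit, and --cpu-shares are existing docker run options, while the function names, the default values, and the image name alpine are assumptions chosen for illustration.

```python
def resource_limit_args(memory: str = "256m", cpus: float = 0.5,
                        pids_limit: int = 100, cpu_shares: int = 512) -> list[str]:
    """Build docker run options that translate into cgroup constraints."""
    if cpus <= 0 or pids_limit <= 0 or cpu_shares <= 0:
        raise ValueError("limits must be positive")
    return [
        f"--memory={memory}",          # hard memory limit
        f"--cpus={cpus}",              # fraction of CPU time the container may use
        f"--pids-limit={pids_limit}",  # cap on the number of processes
        f"--cpu-shares={cpu_shares}",  # relative CPU weight under contention
    ]


def docker_run_command(image: str, limits: list[str]) -> list[str]:
    """Assemble the full argument vector; pass it to subprocess.run to execute."""
    return ["docker", "run", "--rm", *limits, image]
```

Returning an argument list rather than a shell string avoids quoting problems if the values ever come from user input.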

7.2.3.4. Attack Surface of Daemons

Running containers with Docker means running the Docker daemon, which currently requires root privileges, so the daemon is one place to consider.

First, only trusted users should be allowed to control the Docker daemon. Docker lets you share a directory between the Docker host and a guest container, and it does so without restricting the container’s access rights. This means that a container can be started in which the /host directory is the / directory of the host, and the container can then alter the host filesystem without any restrictions.

This has strong security implications: if, for example, a web server drives Docker through the API to provision containers, parameter checking must be especially careful, so that a malicious user cannot pass crafted parameters that cause Docker to create arbitrary containers.
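One way to approach such parameter checking is an allowlist: accept only known keys and known values, and reject everything else. The sketch below is hypothetical throughout; the request format, the key names, the image list, and the limits are assumptions made for illustration and do not correspond to any Docker API.

```python
ALLOWED_IMAGES = {"nginx:1.25", "redis:7"}           # hypothetical image allowlist
ALLOWED_KEYS = {"image", "command", "memory_mb"}     # hypothetical request fields
MAX_MEMORY_MB = 512


def validate_container_request(params: dict) -> dict:
    """Return a sanitized copy of the request or raise ValueError."""
    unknown = set(params) - ALLOWED_KEYS
    if unknown:
        # Fields such as "privileged" or "binds" are rejected outright.
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")

    image = params.get("image")
    if image not in ALLOWED_IMAGES:
        raise ValueError(f"image not allowed: {image!r}")

    command = params.get("command", [])
    if not isinstance(command, list) or not all(isinstance(a, str) for a in command):
        raise ValueError("command must be a list of strings")

    memory_mb = params.get("memory_mb", 128)
    if isinstance(memory_mb, bool) or not isinstance(memory_mb, int):
        raise ValueError("memory_mb must be an integer")
    if not 16 <= memory_mb <= MAX_MEMORY_MB:
        raise ValueError("memory_mb out of range")

    return {"image": image, "command": command, "memory_mb": memory_mb}
```

Rejecting unknown keys, instead of filtering out known-bad ones, means that new dangerous options are refused by default.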

The daemon may also be vulnerable to other inputs, such as loading images from disk with docker load or from network with docker pull.

Ultimately, the Docker daemon is expected to run with restricted privileges and to delegate operations to well-audited subprocesses, each with its own (very limited) scope of Linux capabilities, virtual network settings, filesystem management, and so on. In addition, parts of the Docker engine itself will most likely run in containers.

7.2.3.5. Capability

By default, Docker uses the Linux capability mechanism to restrict what root can do inside a container, even though the container’s processes run as root.

In most cases, containers do not need true root privileges. Docker can therefore run containers with a reduced capability set, which means that root inside a container has far fewer privileges than real root. For example, it is possible to:

  • Deny all mount operations

  • Deny access to raw sockets (to prevent packet spoofing)

  • Deny access to certain filesystem operations, such as creating a new device node, changing a file’s owner, or modifying attributes (including immutable flags)

  • Deny module loading

  • Others

This means that even if an intruder gains root privileges inside the container, further attacks will be much more difficult. By default, Docker uses a whitelist instead of a blacklist, dropping all capabilities that are not essential.
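To see which capabilities a process actually holds, the CapEff line of /proc/<pid>/status can be decoded. The sketch below maps each bit of the mask to its name using the bit numbers from the Linux capability header. The function names are illustrative, and the mask 00000000a80425fb used in the usage note is an example value often reported for containers started with default settings; treat it as an assumption rather than a guarantee for any particular Docker version.

```python
# Capability names indexed by bit number, as defined in linux/capability.h.
CAPABILITY_NAMES = [
    "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH", "CAP_FOWNER",
    "CAP_FSETID", "CAP_KILL", "CAP_SETGID", "CAP_SETUID", "CAP_SETPCAP",
    "CAP_LINUX_IMMUTABLE", "CAP_NET_BIND_SERVICE", "CAP_NET_BROADCAST",
    "CAP_NET_ADMIN", "CAP_NET_RAW", "CAP_IPC_LOCK", "CAP_IPC_OWNER",
    "CAP_SYS_MODULE", "CAP_SYS_RAWIO", "CAP_SYS_CHROOT", "CAP_SYS_PTRACE",
    "CAP_SYS_PACCT", "CAP_SYS_ADMIN", "CAP_SYS_BOOT", "CAP_SYS_NICE",
    "CAP_SYS_RESOURCE", "CAP_SYS_TIME", "CAP_SYS_TTY_CONFIG", "CAP_MKNOD",
    "CAP_LEASE", "CAP_AUDIT_WRITE", "CAP_AUDIT_CONTROL", "CAP_SETFCAP",
    "CAP_MAC_OVERRIDE", "CAP_MAC_ADMIN", "CAP_SYSLOG", "CAP_WAKE_ALARM",
    "CAP_BLOCK_SUSPEND", "CAP_AUDIT_READ", "CAP_PERFMON", "CAP_BPF",
    "CAP_CHECKPOINT_RESTORE",
]


def decode_capabilities(mask: int) -> list[str]:
    """Return the names of all capabilities whose bit is set in mask."""
    names = []
    bit = 0
    while mask >> bit:
        if (mask >> bit) & 1:
            known = bit < len(CAPABILITY_NAMES)
            names.append(CAPABILITY_NAMES[bit] if known else f"CAP_{bit}")
        bit += 1
    return names


def effective_capabilities(status_text: str) -> list[str]:
    """Decode the CapEff line from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return decode_capabilities(int(line.split()[1], 16))
    raise ValueError("no CapEff line found")
```

With the example mask, decode_capabilities(0xa80425fb) yields 14 capabilities, and neither CAP_SYS_ADMIN nor CAP_SYS_MODULE is among them, which matches the denial of mount operations and module loading listed above.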

7.2.3.6. Seccomp

Docker uses Seccomp to limit the system calls a container makes to the host kernel.
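A custom seccomp profile is a JSON document that names a default action and the system calls exempted from it. The following sketch generates a deny-by-default profile; the function names and the file name profile.json are illustrative, and the short syscall list in the usage note is far too small for a real workload, so it should be read as a format example only.

```python
import json


def build_seccomp_profile(allowed_syscalls: list[str]) -> dict:
    """Build a profile that rejects every syscall except the listed ones."""
    return {
        "defaultAction": "SCMP_ACT_ERRNO",  # anything not listed fails with an error
        "syscalls": [
            {
                "names": sorted(set(allowed_syscalls)),
                "action": "SCMP_ACT_ALLOW",
            }
        ],
    }


def profile_to_json(profile: dict) -> str:
    """Serialize the profile so it can be written to a file."""
    return json.dumps(profile, indent=2)
```

A profile built with build_seccomp_profile(["read", "write", "exit_group"]) and saved as profile.json could then be applied with docker run --security-opt seccomp=profile.json. In practice it is usually easier to start from Docker's default profile and remove system calls than to build an allowlist from scratch.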

7.2.4. Attack Surface Analysis

7.2.4.1. Supply chain security

When an image is built from a Dockerfile, even if popular, top-ranked sources are used, the image may still contain CVE vulnerabilities, backdoors, contaminated content, or vulnerable dependency libraries.

7.2.4.2. Virtualization Risks

Although Docker implements basic isolation of file system resources through namespaces, some important system directories and namespace information, such as /sys, /proc/sys, /proc/bus, /dev, time, and syslog, are not isolated and are shared with the host.

7.2.4.3. Escape with Kernel Vulnerability

  • CVE-2016-5195

7.2.4.4. Container Escape Vulnerability

  • CVE-2021-41091

  • CVE-2019-14271 Docker cp

  • CVE-2019-13139 Docker build code execution

  • CVE-2019-5736 runC
    • Docker version < 18.09.2

    • runC version <= 1.0-rc6

  • CVE-2018-18955
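The version bound given above for CVE-2019-5736 can be checked mechanically. This sketch compares a Docker version string against 18.09.2; the function names are illustrative, and the check is only a first approximation, because distributions may backport fixes without changing the upstream version number.

```python
import re


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the leading major.minor.patch numbers from a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version: {text!r}")
    return tuple(int(part) for part in match.groups())


def docker_affected_by_cve_2019_5736(docker_version: str) -> bool:
    """True if the Docker version is below 18.09.2, the bound listed above."""
    return parse_version(docker_version) < (18, 9, 2)
```

Comparing integer tuples avoids the pitfalls of string comparison, where "18.09.10" would sort before "18.09.2".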

7.2.4.5. Improper configuration

  • Privileged mode enabled (--privileged)

  • Sensitive host directories mounted

  • Improperly configured capabilities
    • --cap-add=SYS_ADMIN

  • Namespace isolation bypassed
    • --net=host

    • --pid=host

    • --ipc=host
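The options listed above can be detected before a container is started. The following sketch scans a docker run argument list for them; the names RISKY_OPTIONS, normalize_args, and audit_run_args are illustrative, and the matching is deliberately simple (for example, it does not handle lower-case capability names), so it is a starting point rather than a complete checker.

```python
RISKY_OPTIONS = {
    "--privileged": "disables most isolation mechanisms",
    "--cap-add=SYS_ADMIN": "grants a very broad capability",
    "--net=host": "shares the host network namespace",
    "--pid=host": "shares the host PID namespace",
    "--ipc=host": "shares the host IPC namespace",
}

# Options that may be written either as "--opt=value" or as "--opt value".
VALUE_OPTIONS = {"--net", "--network", "--pid", "--ipc", "--cap-add"}


def normalize_args(args: list[str]) -> list[str]:
    """Rewrite '--opt value' pairs as '--opt=value' for uniform matching."""
    normalized = []
    i = 0
    while i < len(args):
        if args[i] in VALUE_OPTIONS and i + 1 < len(args):
            normalized.append(f"{args[i]}={args[i + 1]}")
            i += 2
        else:
            normalized.append(args[i])
            i += 1
    return normalized


def audit_run_args(args: list[str]) -> list[str]:
    """Return one finding per risky option present in the argument list."""
    findings = []
    for arg in normalize_args(args):
        key = arg.replace("--network=", "--net=", 1)  # --network is an alias
        if key in RISKY_OPTIONS:
            findings.append(f"{arg}: {RISKY_OPTIONS[key]}")
    return findings
```

For instance, audit_run_args(["run", "--privileged", "--pid", "host", "alpine"]) reports two findings, one for each risky option.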

7.2.4.6. Denial of Service

  • CPU exhaustion

  • Memory exhaustion

  • Storage exhaustion

  • Network resource exhaustion

7.2.4.7. Dangerous mount

  • Mounting /var/run/docker.sock

  • Mounting host directories such as /dev and /proc, and other dangerous directories
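A volume specification can be checked against a list of sensitive host paths before it is passed to Docker. In this sketch the path list combines the locations named in this chapter (/var/run/docker.sock, /dev, /proc, /sys, and the host root); the function names are illustrative, and the check does not resolve symbolic links, so an equivalent path such as /run/docker.sock would need to be listed separately.

```python
import posixpath

SENSITIVE_HOST_PATHS = ("/var/run/docker.sock", "/dev", "/proc", "/sys")


def host_path_of(volume_spec: str) -> str:
    """Return the normalized host side of a 'host:container[:mode]' spec."""
    return posixpath.normpath(volume_spec.split(":", 1)[0])


def is_sensitive_mount(volume_spec: str) -> bool:
    """True if the spec bind-mounts the host root or a sensitive host path."""
    host = host_path_of(volume_spec)
    if not host.startswith("/"):
        return False  # a named volume, not a host path
    if host == "/":
        return True   # the whole host filesystem
    return any(
        host == path or host.startswith(path + "/")
        for path in SENSITIVE_HOST_PATHS
    )
```

Normalizing the path first means that a spec such as /var/run/../run/docker.sock:/sock is still recognized.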

7.2.4.8. Attacking the Docker daemon

While Docker containers have strong security protections, the Docker daemon itself is not fully protected. By default the daemon runs as root, and its process is not confined by security mechanisms such as Seccomp or AppArmor. As a result, an attacker who finds a vulnerability that lets them make the Docker daemon write arbitrary files or execute code can obtain root privileges on the host without being hindered by those mechanisms. It is also worth noting that Docker does not enable user namespace isolation by default, so root inside a container and root on the host have the same read and write permissions on files. Once a root process inside the container gets a chance to read or write host files, file permissions no longer stand in the way. The exploit for CVE-2019-5736 illustrates this.

7.2.4.9. Other CVE

  • CVE-2014-5277

  • CVE-2014-6408

  • CVE-2014-9357

  • CVE-2014-9358

  • CVE-2015-3627

  • CVE-2015-3630

7.2.5. Security Hardening

  • Minimal installation
    • Remove all development tools (compilers, etc.)

  • Update system sources

  • Enable AppArmor

  • Enable SELinux

  • Restrict kernel capabilities for running containers

  • Remove build dependencies

  • Configure strict network access control policies

  • Start Docker as a non-root user

  • Do not run containers in privileged mode

  • Control resources
    • CPU share

    • Number of CPU cores

    • Memory resources

    • I/O resources

    • Disk resources

    • Hardware resources

    • The maximum number of processes per unit time

  • Use a secure base image

  • Regular security scans and patch updates

  • Remove the setuid and setgid permissions in the image
    • RUN find / -perm /6000 -type f -exec chmod a-s {} \; || true

  • Configure TLS authentication for the Docker daemon

  • Disable inter-container communication when it is not needed

  • Use rootless Docker

  • Limit syscalls with Seccomp

  • Separate build environment and online environment

  • Certificate verification
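Several of the items above map directly onto docker run options. The sketch below assembles a command line that drops all capabilities, adds back only those explicitly requested, runs as a non-root user, and optionally applies a seccomp profile. The options --cap-drop, --cap-add, --user, and --security-opt are existing docker run options; the function name, the default UID and GID, and the image and profile names are assumptions for illustration.

```python
from typing import Optional, Sequence


def hardened_run_args(image: str, uid: int = 1000, gid: int = 1000,
                      keep_caps: Sequence[str] = (),
                      seccomp_profile: Optional[str] = None) -> list[str]:
    """Build a docker run command that follows the hardening items above."""
    args = ["docker", "run", "--rm", "--cap-drop=ALL"]
    # Add back only the capabilities the application really needs.
    args += [f"--cap-add={cap}" for cap in keep_caps]
    # Run the container process as an unprivileged user instead of root.
    args.append(f"--user={uid}:{gid}")
    if seccomp_profile is not None:
        args += ["--security-opt", f"seccomp={seccomp_profile}"]
    args.append(image)
    return args
```

The function never emits --privileged, so privileged mode cannot be enabled by accident through it.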

7.2.6. Docker environment identification

7.2.6.1. Inside Docker

  • The MAC address is in the range 02:42:ac:11:00:00 - 02:42:ac:11:ff:ff

  • ps aux shows that most running processes have small PIDs

  • cat /proc/1/cgroup shows docker in the cgroup paths of the process

  • A .dockerenv file exists in the root directory of the environment

  • Many commonly used commands, such as ping, are missing
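Some of these indicators can be checked programmatically. The sketch below tests the MAC address range, the content of /proc/1/cgroup, and the presence of .dockerenv; the function names and the root parameter (added so the check can be pointed at another directory tree) are illustrative. These are heuristics only: on hosts that use cgroup v2, for example, /proc/1/cgroup inside a container may not mention docker at all.

```python
import os
import re

_DOCKER_MAC = re.compile(r"^02:42:ac:11(:[0-9a-f]{2}){2}$")


def mac_in_default_docker_range(mac: str) -> bool:
    """True if the address lies in 02:42:ac:11:00:00 - 02:42:ac:11:ff:ff."""
    return _DOCKER_MAC.match(mac.strip().lower()) is not None


def cgroup_mentions_docker(cgroup_text: str) -> bool:
    """True if any line of /proc/1/cgroup contains 'docker'."""
    return any("docker" in line for line in cgroup_text.splitlines())


def docker_indicators(root: str = "/") -> dict[str, bool]:
    """Collect file-based indicators below the given root directory."""
    indicators = {
        "dockerenv": os.path.exists(os.path.join(root, ".dockerenv")),
        "cgroup": False,
    }
    try:
        with open(os.path.join(root, "proc/1/cgroup")) as handle:
            indicators["cgroup"] = cgroup_mentions_docker(handle.read())
    except OSError:
        pass  # /proc not available or not readable
    return indicators
```

No single indicator is conclusive, so it is more reliable to combine several of them.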

7.2.6.2. Outside Docker

  • The /var/run/docker.sock file exists

  • Port 2375 or 2376 is open
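Both indicators can be probed with the Python standard library. The sketch below checks whether a path is a Unix socket and whether a TCP port accepts connections; the function names and the default host 127.0.0.1 are assumptions for illustration. An open port 2375 or 2376 only suggests a Docker daemon, since other services may listen on the same ports, and such probes should only be run against systems you are authorized to test.

```python
import os
import socket
import stat


def is_unix_socket(path: str = "/var/run/docker.sock") -> bool:
    """True if the path exists and is a Unix domain socket."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        return False


def tcp_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def docker_host_indicators(host: str = "127.0.0.1") -> dict[str, bool]:
    """Check the socket file and the two default daemon ports."""
    return {
        "docker.sock": is_unix_socket(),
        "port 2375": tcp_port_open(host, 2375),
        "port 2376": tcp_port_open(host, 2376),
    }
```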