Docker and OverlayFS in practice


https://docs.docker.com/engine/userguide/storagedriver/overlayfs-driver/


OverlayFS is a modern union filesystem that is similar to AUFS. In comparison to AUFS, OverlayFS:

  • has a simpler design
  • has been in the mainline Linux kernel since version 3.18
  • is potentially faster

As a result, OverlayFS is rapidly gaining popularity in the Docker community and is seen by many as a natural successor to AUFS. As promising as OverlayFS is, it is still relatively young. Therefore caution should be taken before using it in production Docker environments.

Docker’s overlay storage driver leverages several OverlayFS features to build and manage the on-disk structures of images and containers.

Note: When it was merged into the mainline kernel, the OverlayFS kernel module was renamed from “overlayfs” to “overlay”. As a result you may see the two terms used interchangeably in some documentation. However, this document uses “OverlayFS” to refer to the overall filesystem, and overlay to refer to Docker’s storage driver.

Image layering and sharing with OverlayFS

OverlayFS takes two directories on a single Linux host, layers one on top of the other, and provides a single unified view. These directories are often referred to as layers and the technology used to layer them is known as a union mount. The OverlayFS terminology is “lowerdir” for the bottom layer and “upperdir” for the top layer. The unified view is exposed through its own directory called “merged”.

The diagram below shows how a Docker image and a Docker container are layered. The image layer is the “lowerdir” and the container layer is the “upperdir”. The unified view is exposed through a directory called “merged”, which is effectively the container’s mount point. The diagram shows how Docker constructs map to OverlayFS constructs.

Notice how the image layer and container layer can contain the same files. When this happens, the files in the container layer (“upperdir”) are dominant and obscure the existence of the same files in the image layer (“lowerdir”). The container mount (“merged”) presents the unified view.
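The obscuring behaviour can be reproduced outside Docker with a manual overlay mount. The following is a minimal sketch, assuming root privileges and a kernel that provides the overlay filesystem. The scratch directory and file names are invented for illustration and are not part of Docker's on-disk layout.

```shell
# Scratch directories standing in for an image layer and a container layer.
demo=$(mktemp -d)
mkdir "$demo/lower" "$demo/upper" "$demo/work" "$demo/merged"

echo "image version"     > "$demo/lower/shared.txt"
echo "image only"        > "$demo/lower/image.txt"
echo "container version" > "$demo/upper/shared.txt"

# lowerdir is the read-only bottom layer, upperdir the writable top layer,
# and workdir is scratch space that OverlayFS needs on the same filesystem
# as upperdir.
if mount -t overlay overlay \
     -o "lowerdir=$demo/lower,upperdir=$demo/upper,workdir=$demo/work" \
     "$demo/merged" 2>/dev/null
then
    cat "$demo/merged/shared.txt"   # the upperdir copy obscures the lowerdir copy
    cat "$demo/merged/image.txt"    # falls through to lowerdir
    umount "$demo/merged"
else
    echo "overlay mount failed; root privileges and overlay support are required" >&2
fi
```

When the mount succeeds, shared.txt is served from the upper directory and image.txt from the lower one, which is the same precedence Docker relies on for a container's “merged” directory.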

OverlayFS only works with two layers. This means that multi-layered images cannot be implemented as multiple OverlayFS layers. Instead, each image layer is implemented as its own directory under /var/lib/docker/overlay. Hard links are then used as a space-efficient way to reference data shared with lower layers. As of Docker 1.10, image layer IDs no longer correspond to directory names in /var/lib/docker/.

To create a container, the overlay driver combines the directory representing the image’s top layer with a new directory for the container. The image’s top layer is the “lowerdir” in the overlay and is read-only. The new directory for the container is the “upperdir” and is writable.

Example: Image and container on-disk constructs

The following docker pull command shows a Docker host downloading a Docker image comprising four layers.

$ sudo docker pull ubuntu
Using default tag: latest
latest: Pulling from library/ubuntu
8387d9ff0016: Pull complete
3b52deaaf0ed: Pull complete
4bd501fad6de: Pull complete
a3ed95caeb02: Pull complete
Digest: sha256:457b05828bdb5dcc044d93d042863fba3f2158ae249a6db5ae3934307c757c54
Status: Downloaded newer image for ubuntu:latest

Each image layer has its own directory under /var/lib/docker/overlay/. This is where the contents of each image layer are stored.

The output of the command below shows the four directories that store the contents of each image layer just pulled. However, as can be seen, the image layer IDs do not match the directory names in /var/lib/docker/overlay. This is normal behavior in Docker 1.10 and later.

$ ls -l /var/lib/docker/overlay/
total 24
drwx------ 3 root root 4096 Oct 28 11:02 1d073211c498fd5022699b46a936b4e4bdacb04f637ad64d3475f558783f5c3e
drwx------ 3 root root 4096 Oct 28 11:02 5a4526e952f0aa24f3fcc1b6971f7744eb5465d572a48d47c492cb6bbf9cbcda
drwx------ 5 root root 4096 Oct 28 11:06 99fcaefe76ef1aa4077b90a413af57fd17d19dce4e50d7964a273aae67055235
drwx------ 3 root root 4096 Oct 28 11:01 c63fb41c2213f511f12f294dd729b9903a64d88f098c20d2350905ac1fdbcbba

The image layer directories contain the files unique to that layer as well as hard links to the data that is shared with lower layers. This allows for efficient use of disk space.
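To see why hard links save space, the sketch below links one file into two directories and checks that both names refer to the same inode. The directory and file names are hypothetical stand-ins for two image layer directories; this illustrates the mechanism only and does not read Docker's own data.

```shell
# Two scratch directories standing in for a lower and a higher image layer.
demo=$(mktemp -d)
mkdir "$demo/layer1" "$demo/layer2"

echo "shared library contents" > "$demo/layer1/libshared.so"

# A hard link adds a second name for the same inode; no data is copied.
ln "$demo/layer1/libshared.so" "$demo/layer2/libshared.so"

# The second column of ls -l is the link count.
links=$(ls -l "$demo/layer1/libshared.so" | awk '{print $2}')
echo "link count: $links"

# -ef is true when both paths refer to the same device and inode.
if [ "$demo/layer1/libshared.so" -ef "$demo/layer2/libshared.so" ]; then
    echo "both names share one copy of the data"
fi
```

Because both names point at a single inode, the file's blocks are stored once no matter how many layer directories reference it.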

Containers also exist on-disk in the Docker host’s filesystem under /var/lib/docker/overlay/. If you inspect the directory relating to a running container using the ls -l command, you find the following file and directories.

$ ls -l /var/lib/docker/overlay/<directory-of-running-container>
total 16
-rw-r--r-- 1 root root   64 Oct 28 11:06 lower-id
drwxr-xr-x 1 root root 4096 Oct 28 11:06 merged
drwxr-xr-x 4 root root 4096 Oct 28 11:06 upper
drwx------ 3 root root 4096 Oct 28 11:06 work

These four filesystem objects are all artifacts of OverlayFS. The “lower-id” file contains the ID of the top layer of the image the container is based on. This is used by OverlayFS as the “lowerdir”.

$ cat /var/lib/docker/overlay/73de7176c223a6c82fd46c48c5f152f2c8a7e49ecb795a7197c3bb795c4d879e/lower-id
1d073211c498fd5022699b46a936b4e4bdacb04f637ad64d3475f558783f5c3e

The “upper” directory is the container’s read-write layer. Any changes made to the container are written to this directory.

The “merged” directory is effectively the container’s mount point. This is where the unified view of the image (“lowerdir”) and container (“upperdir”) is exposed. Any changes written to the container are immediately reflected in this directory.

The “work” directory is required for OverlayFS to function. It is used for things such as copy_up operations.

You can verify all of these constructs from the output of the mount command. (Ellipses and line breaks are used in the output below to enhance readability.)

$ mount | grep overlay
overlay on /var/lib/docker/overlay/73de7176c223.../merged
type overlay (rw,relatime,lowerdir=/var/lib/docker/overlay/1d073211c498.../root,
upperdir=/var/lib/docker/overlay/73de7176c223.../upper,
workdir=/var/lib/docker/overlay/73de7176c223.../work)

The output reflects that the overlay is mounted as read-write (“rw”).
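If you want the individual constructs rather than the whole line, the options can be split out with standard text tools. This is a small sketch: the helper name overlay_opt is hypothetical, the sample line is the mount output shown above joined onto one line (with its ellipses left as they are), and the parsing assumes no path contains a comma.

```shell
# Print the value of one mount option from a line of mount output.
# $1 = option name (for example lowerdir), $2 = the mount output line
overlay_opt() {
    printf '%s\n' "$2" |
        sed -n 's/.*(\(.*\)).*/\1/p' |   # keep the text between the parentheses
        tr ',' '\n' |                    # one option per line
        sed -n "s/^$1=//p"               # print only the requested option's value
}

line='overlay on /var/lib/docker/overlay/73de7176c223.../merged type overlay (rw,relatime,lowerdir=/var/lib/docker/overlay/1d073211c498.../root,upperdir=/var/lib/docker/overlay/73de7176c223.../upper,workdir=/var/lib/docker/overlay/73de7176c223.../work)'

overlay_opt lowerdir "$line"
overlay_opt upperdir "$line"
overlay_opt workdir  "$line"
```

On a live host the same function could be fed from the mount command, for example by looping over the lines of mount | grep overlay.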

Container reads and writes with overlay

Consider three scenarios where a container opens a file for read access with overlay.

  • The file does not exist in the container layer. If a container opens a file for read access and the file does not already exist in the container (“upperdir”), it is read from the image (“lowerdir”). This should incur very little performance overhead.

  • The file only exists in the container layer. If a container opens a file for read access and the file exists in the container (“upperdir”) and not in the image (“lowerdir”), it is read directly from the container.

  • The file exists in the container layer and the image layer. If a container opens a file for read access and the file exists in the image layer and the container layer, the file’s version in the container layer is read. This is because files in the container layer (“upperdir”) obscure files with the same name in the image layer (“lowerdir”).
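The three read scenarios amount to a simple lookup order: container layer first, then image layer. The toy model below uses two ordinary directories to stand in for the layers. It is only a sketch of the lookup order, not how the kernel implements it, and the function and file names are made up for the example.

```shell
# Plain directories standing in for the two layers.
lower=$(mktemp -d)   # image layer ("lowerdir")
upper=$(mktemp -d)   # container layer ("upperdir")

echo "from image"     > "$lower/only-in-image"
echo "from container" > "$upper/only-in-container"
echo "from image"     > "$lower/in-both"
echo "from container" > "$upper/in-both"

# Read a file the way the merged view would: upperdir wins, then lowerdir.
read_file() {  # $1 = path relative to the layer root
    if [ -e "$upper/$1" ]; then
        cat "$upper/$1"
    elif [ -e "$lower/$1" ]; then
        cat "$lower/$1"
    else
        echo "no such file: $1" >&2
        return 1
    fi
}

read_file only-in-image       # scenario 1: served from the image
read_file only-in-container   # scenario 2: served from the container
read_file in-both             # scenario 3: the container copy obscures the image copy
```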

Consider some scenarios where files in a container are modified.

  • Writing to a file for the first time. The first time a container writes to an existing file, that file does not exist in the container (“upperdir”). The overlay driver performs a copy_up operation to copy the file from the image (“lowerdir”) to the container (“upperdir”). The container then writes the changes to the new copy of the file in the container layer.

    However, OverlayFS works at the file level, not the block level. This means that all OverlayFS copy-up operations copy entire files, even if the file is very large and only a small part of it is being modified. This can have a noticeable impact on container write performance. Still, two things are worth noting:

    • The copy_up operation only occurs the first time any given file is written to. Subsequent writes to the same file will operate against the copy of the file already copied up to the container.

    • OverlayFS only works with two layers. This means that performance should be better than AUFS, which can suffer noticeable latencies when searching for files in images with many layers.

  • Deleting files and directories. When files are deleted within a container, a whiteout file is created in the container’s “upperdir”. The version of the file in the image layer (“lowerdir”) is not deleted. However, the whiteout file in the container obscures it.

    Deleting a directory in a container results in an opaque directory being created in the “upperdir”. This has the same effect as a whiteout file and effectively masks the existence of the directory in the image’s “lowerdir”.
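The write and delete behaviour described above can be imitated with two ordinary directories. This is a toy model under stated assumptions: it handles top-level file names only, all function names are hypothetical, and the “.wh.” marker file is a convention invented for this sketch. It is not how OverlayFS represents a whiteout on disk.

```shell
lower=$(mktemp -d)   # image layer ("lowerdir"), never modified below
upper=$(mktemp -d)   # container layer ("upperdir")

echo "original" > "$lower/app.conf"
echo "unwanted" > "$lower/old.log"

# First write to a file that exists only in the image: copy the whole file
# up, then change the copy. Later writes find the copy and skip this step.
write_line() {  # $1 = file name, $2 = text to append
    if [ ! -e "$upper/$1" ] && [ -e "$lower/$1" ]; then
        cp -p "$lower/$1" "$upper/$1"      # copy_up: entire file, not just a block
    fi
    printf '%s\n' "$2" >> "$upper/$1"
}

# Deleting records a marker in the container layer; the image copy stays.
delete_file() {  # $1 = file name
    rm -f "$upper/$1"
    : > "$upper/.wh.$1"                    # sketch-only stand-in for a whiteout
}

# A file is visible in the merged view unless a marker hides it.
visible() {  # $1 = file name
    [ ! -e "$upper/.wh.$1" ] && { [ -e "$upper/$1" ] || [ -e "$lower/$1" ]; }
}

write_line app.conf "changed"
delete_file old.log

if visible old.log; then echo "old.log visible"; else echo "old.log hidden"; fi
```

After these calls the image directory still holds both original files, the container directory holds the modified copy of app.conf, and old.log is hidden from the merged view even though its data remains in the image layer.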

Configure Docker with the overlay storage driver

To configure Docker to use the overlay storage driver, your Docker host must be running at least version 3.18 of the Linux kernel (preferably newer) with the overlay kernel module loaded. OverlayFS can operate on top of most supported Linux filesystems. However, ext4 is currently recommended for use in production environments.
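These prerequisites are easy to check from a script. The sketch below is one way to do it: the function names are hypothetical, the version comparison looks only at the major and minor numbers reported by uname -r, and overlay support is detected by looking for the filesystem type in /proc/filesystems, which lists it once the module is loaded or built in.

```shell
# Succeeds when the given kernel release (default: the running kernel)
# is version 3.18 or later.
kernel_at_least_3_18() {
    release=${1:-$(uname -r)}
    major=${release%%.*}          # text before the first dot
    rest=${release#*.}            # text after the first dot
    minor=${rest%%[!0-9]*}        # leading digits of what follows
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "${minor:-0}" -ge 18 ]; }
}

# Succeeds when the kernel currently has the overlay filesystem registered.
overlay_registered() {
    grep -qw overlay /proc/filesystems 2>/dev/null
}

if kernel_at_least_3_18 && overlay_registered; then
    echo "kernel and overlay support look suitable"
else
    echo "kernel too old or overlay support not loaded"
fi
```

If the second check fails on a recent kernel, loading the module with modprobe overlay is the usual next step.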

The following procedure shows you how to configure your Docker host to use OverlayFS. The procedure assumes that the Docker daemon is in a stopped state.

Caution: If you have already run the Docker daemon on your Docker host and have images you want to keep, push them to Docker Hub or your private Docker Trusted Registry before attempting this procedure.

  1. If it is running, stop the Docker daemon.

  2. Verify your kernel version and that the overlay kernel module is loaded.

    $ uname -r
    3.19.0-21-generic
    
    $ lsmod | grep overlay
    overlay
    
  3. Start the Docker daemon with the overlay storage driver.

    $ docker daemon --storage-driver=overlay &
    [1] 29403
    root@ip-10-0-0-174:/home/ubuntu# INFO[0000] Listening for HTTP on unix (/var/run/docker.sock)
    INFO[0000] Option DefaultDriver: bridge
    INFO[0000] Option DefaultNetwork: bridge
    <output truncated>
    

    Alternatively, you can force the Docker daemon to automatically start with the overlay driver by editing the Docker config file and adding the --storage-driver=overlay flag to the DOCKER_OPTS line. Once this option is set you can start the daemon using normal startup scripts without having to manually pass in the --storage-driver flag.

  4. Verify that the daemon is using the overlay storage driver.

    $ docker info
    Containers: 0
    Images: 0
    Storage Driver: overlay
     Backing Filesystem: extfs
    <output truncated>
    

    Notice that the Backing filesystem in the output above is showing as extfs. Multiple backing filesystems are supported but extfs (ext4) is recommended for production use cases.
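The check in step 4 can be scripted by pulling the driver name out of the docker info output. A minimal sketch follows, assuming the output keeps the “Storage Driver:” label shown above. The function name is hypothetical, and the sample text is the truncated output from step 4 rather than a live query.

```shell
# Read docker info output on stdin and print the storage driver name.
storage_driver() {
    awk -F': *' '/^ *Storage Driver:/ { print $2; exit }'
}

# Sample input copied from the step 4 output above.
sample='Containers: 0
Images: 0
Storage Driver: overlay
 Backing Filesystem: extfs'

driver=$(printf '%s\n' "$sample" | storage_driver)
if [ "$driver" = "overlay" ]; then
    echo "daemon is using the overlay storage driver"
else
    echo "unexpected storage driver: ${driver:-none reported}"
fi
```

Against a running daemon, replace the sample with the real output: docker info | storage_driver.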

Your Docker host is now using the overlay storage driver. If you run the mount command, you’ll find Docker has automatically created the overlay mount with the required “lowerdir”, “upperdir”, “merged” and “workdir” constructs.
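The alternative mentioned in step 3, adding the flag to the DOCKER_OPTS line, might look like the sketch below. It rests on assumptions: the function name is hypothetical, the config file is taken to be a shell-style file with a double-quoted DOCKER_OPTS line (on Debian and Ubuntu hosts of this era that file is commonly /etc/default/docker), and the example runs against a scratch copy rather than the live file.

```shell
# Add --storage-driver=overlay to the DOCKER_OPTS line of a config file.
# $1 = path to the config file
add_overlay_flag() {
    conf=$1
    if grep -q -e '--storage-driver' "$conf" 2>/dev/null; then
        echo "a storage driver is already set in $conf; edit it by hand" >&2
        return 1
    fi
    if grep -q '^DOCKER_OPTS="' "$conf" 2>/dev/null; then
        # Put the flag at the front of the existing quoted value.
        tmp=$(mktemp)
        sed 's/^DOCKER_OPTS="/DOCKER_OPTS="--storage-driver=overlay /' "$conf" > "$tmp"
        cat "$tmp" > "$conf"    # rewrite in place so ownership and mode are kept
        rm -f "$tmp"
    else
        echo 'DOCKER_OPTS="--storage-driver=overlay"' >> "$conf"
    fi
}

# Try it on a scratch file first.
scratch=$(mktemp)
echo 'DOCKER_OPTS="--dns 8.8.8.8"' > "$scratch"
add_overlay_flag "$scratch"
cat "$scratch"
```

The function refuses to touch a file that already names a storage driver, since silently switching drivers would hide existing images and containers from the daemon.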

OverlayFS and Docker Performance

As a general rule, the overlay driver should be fast, almost certainly faster than aufs and devicemapper. In certain circumstances it may also be faster than btrfs. That said, there are a few things to be aware of relative to the performance of Docker using the overlay storage driver.

  • Page Caching. OverlayFS supports page cache sharing. This means multiple containers accessing the same file can share a single page cache entry (or entries). This makes the overlay driver efficient with memory and a good option for PaaS and other high density use cases.

  • copy_up. As with AUFS, OverlayFS has to perform copy-up operations any time a container writes to a file for the first time. This can insert latency into the write operation, especially if the file being copied up is large. However, once the file has been copied up, all subsequent writes to that file occur without the need for further copy-up operations.

    The OverlayFS copy_up operation should be faster than the same operation with AUFS. This is because AUFS supports more layers than OverlayFS and it is possible to incur far larger latencies if searching through many AUFS layers.

  • RPMs and Yum. OverlayFS only implements a subset of the POSIX standards. This can result in certain OverlayFS operations breaking POSIX standards. One such operation is the copy-up operation. Therefore, using yum inside of a container on a Docker host using the overlay storage driver is unlikely to work without implementing workarounds.

  • Inode limits. Use of the overlay storage driver can cause excessive inode consumption. This is especially so as the number of images and containers on the Docker host grows. A Docker host with a large number of images and lots of started and stopped containers can quickly run out of inodes.

Unfortunately you can only specify the number of inodes in a filesystem at the time of creation. For this reason, you may wish to consider putting /var/lib/docker on a separate device with its own filesystem, or manually specifying the number of inodes when creating the filesystem.
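Inode usage is worth watching before it becomes a problem. The sketch below reads the IUse% column that df prints with the -i option. The function names and the 80 percent default threshold are assumptions made for this example, and the column layout is the one GNU df produces with -P.

```shell
# Read df -Pi output on stdin and print the IUse% figure without the % sign.
inode_use_percent() {
    awk 'NR == 2 { sub(/%$/, "", $5); print $5 }'
}

# Report inode usage for the filesystem holding a directory.
# $1 = directory (default /var/lib/docker), $2 = warning threshold in percent
check_inodes() {
    dir=${1:-/var/lib/docker}
    limit=${2:-80}
    used=$(df -Pi "$dir" 2>/dev/null | inode_use_percent)
    case $used in
        ''|*[!0-9]*)
            echo "inode usage not available for $dir"
            return 2 ;;
    esac
    if [ "$used" -ge "$limit" ]; then
        echo "WARNING: $dir is at ${used}% inode usage"
        return 1
    fi
    echo "$dir is at ${used}% inode usage"
}
```

Calling check_inodes with no arguments inspects /var/lib/docker. If a dedicated ext4 filesystem is created for that directory, mkfs.ext4 accepts -N followed by the desired inode count at creation time.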

The following generic performance best practices also apply to OverlayFS.

  • Solid State Devices (SSD). For best performance it is always a good idea to use fast storage media such as solid state devices (SSD).

  • Use Data Volumes. Data volumes provide the best and most predictable performance. This is because they bypass the storage driver and do not incur any of the potential overheads introduced by thin provisioning and copy-on-write. For this reason, you should place heavy write workloads on data volumes.


posted @ 2016-05-03 15:51  张同光