Designing Scalable, Portable Docker Container Networks


Docker Host Network Driver

host network

#Create containers on the host network
$ docker run -itd --net host --name C1 alpine sh
$ docker run -itd --net host --name nginx nginx

#Show host eth0
$ ip add | grep eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
    inet 172.31.21.213/20 brd 172.31.31.255 scope global eth0

#Show eth0 from C1
$ docker exec C1 ip add | grep eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP qlen 1000
    inet 172.31.21.213/20 brd 172.31.31.255 scope global eth0

#Contact the nginx container through localhost on C1
$ curl localhost
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
...

Docker Bridge Network Driver

  1. bridge is the name of the Docker network
  2. bridge is the network driver, or template, from which this network is created
  3. docker0 is the name of the Linux bridge that is the kernel building block used to implement this network (see the inspect sketch below)
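
As a quick check of the mapping above, the default network can be inspected directly; this is a minimal sketch, assuming a standard Docker Engine installation:

#Inspect the default bridge network; its Options section names the
#underlying Linux bridge (docker0 by default)
$ docker network inspect bridge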

Default Docker bridge network

User-Defined Bridge Networks


#Create a user-defined bridge network with a custom subnet
$ docker network create -d bridge --subnet 10.0.0.0/24 my_bridge
#Create two containers on my_bridge; c3 is given a static IP
$ docker run -itd --name c2 --net my_bridge busybox sh
$ docker run -itd --name c3 --net my_bridge --ip 10.0.0.254 busybox sh
#Show the Linux bridges on the host
$ brctl show
bridge name      bridge id            STP enabled    interfaces
br-b5db4578d8c9  8000.02428d936bb1    no             vethc9b3282
                                                     vethf3ba8b5
docker0          8000.0242504b5200    no             vethb64e8b8

$ docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
b5db4578d8c9        my_bridge           bridge              local
e1cac9da3116        bridge              bridge              local
...
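
Because my_bridge is a user-defined network, the embedded DNS server resolves container names attached to it; a minimal sketch to verify this, assuming the c2 and c3 containers above are still running:

#Ping c3 from c2 by name; Docker's embedded DNS resolves c3 to 10.0.0.254
$ docker exec -it c2 ping -c 2 c3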

External Access for Standalone Containers

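Standalone containers typically get external access by publishing ports on the host with the -p flag of docker run. A minimal sketch, assuming the nginx image and a hypothetical container name web-pub:

#Map host port 8080 to container port 80 (hypothetical container name web-pub);
#traffic to <host-ip>:8080 is forwarded to the container by the bridge driver
$ docker run -d --name web-pub -p 8080:80 nginx
$ curl localhost:8080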

Overlay Driver Network Architecture

The native Docker overlay network driver radically simplifies many of the challenges in multi-host networking. With the overlay driver, multi-host networks are first-class citizens inside Docker without external provisioning or components. The overlay driver uses the Swarm distributed control plane to provide centralized management, stability, and security across very large-scale clusters.
[Figure: packet flow on an overlay network]

In this diagram, the packet flow on an overlay network is shown. Here are the steps that take place when c1 sends a packet to c2 across their shared overlay network (a packet-capture sketch follows the list):

  • c1 does a DNS lookup for c2. Since both containers are on the same overlay network, the Docker Engine local DNS server resolves c2 to its overlay IP address 10.0.0.3.
  • An overlay network is an L2 segment, so c1 generates an L2 frame destined for the MAC address of c2.
  • The frame is encapsulated with a VXLAN header by the overlay network driver. The distributed overlay control plane manages the locations and state of each VXLAN tunnel endpoint, so it knows that c2 resides on host-B at the physical address of 192.168.0.3. That address becomes the destination address of the underlay IP header.
  • Once encapsulated, the packet is sent. The physical network is responsible for routing or bridging the VXLAN packet to the correct host.
  • The packet arrives at the eth0 interface of host-B and is decapsulated by the overlay network driver. The original L2 frame from c1 is passed to c2's eth0 interface and up to the listening application.
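
To observe the encapsulation described above on the wire, capture on the underlay interface while c1 pings c2; this is a sketch, assuming tcpdump is installed on host-A and eth0 is its underlay interface (VXLAN rides on UDP port 4789):

#On host-A, watch the VXLAN-encapsulated traffic on the underlay
host-A$ sudo tcpdump -ni eth0 udp port 4789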

Overlay Driver Internal Architecture

[Figure: overlay driver internal architecture]

#Create an overlay named "ovnet" with the overlay driver
$ docker network create -d overlay --subnet 10.1.0.0/24 ovnet

#Create a service from an nginx image and connect it to the "ovnet" overlay network
$ docker service create --network ovnet nginx

When the overlay network is created, notice that several interfaces and bridges are created inside the host as well as two interfaces inside this container.

# Peek into the container of this service to see its internal interfaces
container$ ip address

#docker_gwbridge network     
52: eth1@if55: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
    link/ether 02:42:ac:14:00:06 brd ff:ff:ff:ff:ff:ff
    inet 172.20.0.6/16 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::42:acff:fe14:6/64 scope link
       valid_lft forever preferred_lft forever

#overlay network interface
54: eth0@if53: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450
    link/ether 02:42:0a:01:00:03 brd ff:ff:ff:ff:ff:ff
    inet 10.1.0.3/24 scope global eth0
       valid_lft forever preferred_lft forever
    inet 10.1.0.2/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:aff:fe01:3/64 scope link
       valid_lft forever preferred_lft forever
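
On the host side, the overlay appears alongside the docker_gwbridge network that provides the eth1 default-gateway path shown above; a quick check, assuming this is run on a swarm node where a task of the service is scheduled:

#List networks on the host; ovnet (swarm-scoped overlay) and docker_gwbridge
#(local bridge for external/default-gateway traffic) should both be present
$ docker network ls
$ docker network inspect docker_gwbridge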

External Access for Docker Services

  • ingress mode service publish
$ docker service create --replicas 2 --publish mode=ingress,target=80,published=8080 nginx

mode=ingress is the default mode for services. This command can also be accomplished with the shorthand version -p 8080:80. Port 8080 is exposed on every host in the cluster and load balanced to the two containers in this service.

  • host mode service publish
$ docker service create --replicas 2 --publish mode=host,target=80,published=8080 nginx

host mode requires the mode=host flag. It publishes port 8080 locally on the hosts where these two containers are running. It does not apply load balancing, so traffic to those nodes is directed only to the local container. This can cause port collisions if there are not enough ports available for the number of replicas.
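
Because there is no ingress load balancing in host mode, it matters which nodes actually run a task; a sketch of how to check placement, assuming the service is given a hypothetical name web:

#Name the service (hypothetical name web) so its task placement is easy to query
$ docker service create --name web --replicas 2 --publish mode=host,target=80,published=8080 nginx
#Only the nodes listed here answer on port 8080
$ docker service ps web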

  • ingress design
    [Figure: ingress design]

MACVLAN

[Figure: MACVLAN network]

#Creation of MACVLAN network "mvnet" bound to eth0 on the host 
$ docker network create -d macvlan --subnet 192.168.0.0/24 --gateway 192.168.0.1 -o parent=eth0 mvnet

#Creation of containers on the "mvnet" network
$ docker run -itd --name c1 --net mvnet --ip 192.168.0.3 busybox sh
$ docker run -it --name c2 --net mvnet --ip 192.168.0.4 busybox sh
/ # ping 192.168.0.3
PING 192.168.0.3 (192.168.0.3): 56 data bytes
64 bytes from 192.168.0.3: icmp_seq=0 ttl=64 time=0.052 ms

As you can see in this diagram, c1 and c2 are attached to the MACVLAN network called mvnet, which is bound to eth0 on the host.

VLAN Trunking with MACVLAN

[Figure: VLAN trunking with MACVLAN]

#Creation of macvlan10 network in VLAN 10
$ docker network create -d macvlan --subnet 192.168.10.0/24 --gateway 192.168.10.1 -o parent=eth0.10 macvlan10

#Creation of macvlan20 network in VLAN 20
$ docker network create -d macvlan --subnet 192.168.20.0/24 --gateway 192.168.20.1 -o parent=eth0.20 macvlan20

#Creation of containers on separate MACVLAN networks
$ docker run -itd --name c1 --net macvlan10 --ip 192.168.10.2 busybox sh
$ docker run -it --name c2 --net macvlan20 --ip 192.168.20.2 busybox sh

In the preceding configuration we've created two separate networks using the macvlan driver that are configured to use a sub-interface as their parent interface. The macvlan driver creates the sub-interfaces and connects them between the host's eth0 and the container interfaces. The host interface and upstream switch must be set to switchport mode trunk so that VLANs are tagged going across the interface. One or more containers can be connected to a given MACVLAN network to create complex network policies that are segmented via L2.
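
The 802.1Q sub-interfaces that the macvlan driver creates (or reuses) as parents can be verified on the host; a minimal sketch, assuming a Linux host with iproute2:

#Show the VLAN sub-interfaces used as parents for macvlan10 and macvlan20
$ ip -d link show eth0.10
$ ip -d link show eth0.20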

Swarm Native Service Discovery

Docker uses embedded DNS to provide service discovery for containers running on a single Docker Engine and tasks running in a Docker Swarm. Docker Engine has an internal DNS server that provides name resolution to all of the containers on the host in user-defined bridge, overlay, and MACVLAN networks. Each Docker container (or task in Swarm mode) has a DNS resolver that forwards DNS queries to Docker Engine, which acts as a DNS server. Docker Engine then checks if the DNS query belongs to a container or service on the network(s) that the requesting container belongs to. If it does, then Docker Engine looks up the IP address that matches the container, task, or service's name in its key-value store and returns that IP or service Virtual IP (VIP) back to the requester.

Service discovery is network-scoped, meaning only containers or tasks that are on the same network can use the embedded DNS functionality. Containers not on the same network cannot resolve each other's addresses. Additionally, only the nodes that have containers or tasks on a particular network store that network's DNS entries. This promotes security and performance.

If the destination container or service does not belong to the same network(s) as the source container, then Docker Engine forwards the DNS query to the configured default DNS server.

[Figure: service discovery with the embedded DNS server]

In this example there is a service called myservice consisting of two containers. A second service (client) exists on the same network. The client executes two curl operations, for docker.com and myservice. These are the resulting actions (a resolver sketch follows the list):

  • DNS queries are initiated by client for docker.com and myservice.
  • The container's built-in resolver intercepts the DNS queries on 127.0.0.11:53 and sends them to Docker Engine's DNS server.
  • myservice resolves to the Virtual IP (VIP) of that service which is internally load balanced to the individual task IP addresses. Container names resolve as well, albeit directly to their IP addresses.
  • docker.com does not exist as a service name in the mynet network and so the request is forwarded to the configured default DNS server.
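
The resolver path described above can be seen from inside a container on the same network; a sketch, assuming a busybox-based container named client attached to mynet:

#The container's resolv.conf points at the embedded DNS server
client$ cat /etc/resolv.conf
#Resolve the service name against the embedded DNS server; it answers with the VIP
client$ nslookup myservice 127.0.0.11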

Docker Native Load Balancing

Internal load balancing is instantiated automatically when Docker services are created. When services are created in a Docker Swarm cluster, they are automatically assigned a Virtual IP (VIP) that is part of the service's network. The VIP is returned when resolving the service's name. Traffic to that VIP is automatically sent to all healthy tasks of that service across the overlay network. This approach avoids any client-side load balancing because only a single IP is returned to the client. Docker takes care of routing and equally distributing the traffic across the healthy service tasks.

UCP Internal Load Balancing

[Figure: UCP internal load balancing]
To see the VIP, run docker service inspect myservice as follows:

# Create an overlay network called mynet
$ docker network create -d overlay mynet
a59umzkdj2r0ua7x8jxd84dhr

# Create myservice with 2 replicas as part of that network
$ docker service create --network mynet --name myservice --replicas 2 busybox ping localhost
8t5r8cr0f0h6k2c3k7ih4l6f5

# See the VIP that was created for that service
$ docker service inspect myservice
...

"VirtualIPs": [
                {
                    "NetworkID": "a59umzkdj2r0ua7x8jxd84dhr",
                    "Addr": "10.0.0.3/24"
                },
]
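
If only the VIP is of interest, the inspect output can be narrowed with a Go template instead of reading the full JSON; a sketch, assuming the myservice service created above:

#Print just the VirtualIPs entries of the service endpoint
$ docker service inspect --format '{{json .Endpoint.VirtualIPs}}' myservice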

UCP External L4 Load Balancing (Docker Routing Mesh)

[Figure: UCP external L4 load balancing with the Docker Routing Mesh]

This diagram illustrates how the Routing Mesh works (a command sketch follows the list).

  • A service is created with two replicas, and it is port mapped externally to port 8000.
  • The routing mesh exposes port 8000 on each host in the cluster.
  • Traffic destined for the app can enter on any host. In this case the external LB sends the traffic to a host without a service replica.
  • The kernel's IPVS load balancer redirects traffic on the ingress overlay network to a healthy service replica.
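
A minimal sketch of the setup in the diagram, assuming the nginx image and a hypothetical service name app (ingress mode is the default, so the published port is reachable on every node):

#Publish port 8000 on the routing mesh (hypothetical service name app); any node
#accepts traffic on 8000 and IPVS forwards it over the ingress network to a healthy replica
$ docker service create --name app --replicas 2 -p 8000:80 nginx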

Bridge Driver on a Single Host

#Create a user-defined bridge network for the application
$ docker network create -d bridge petsBridge

#Start the backend key-value store on petsBridge
$ docker run -d --net petsBridge --name db consul

#Start the web frontend, publish it on host port 8000, and point it at db
$ docker run -it --env "DB=db" --net petsBridge --name web -p 8000:5000 chrch/docker-pets:1.0
Starting web container e750c649a6b5
 * Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)

When an IP address is not specified, the port mapping is exposed on all interfaces of the host. In this case the container's application is exposed on 0.0.0.0:8000. To advertise on a specific IP address, use the flag -p IP:host_port:container_port.
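
For example, to restrict the published port to a single interface rather than 0.0.0.0, the address can be given explicitly; a sketch, assuming a hypothetical second container named web2 bound to the loopback address only:

#Expose the application (hypothetical container name web2) only on 127.0.0.1:9000 of the host
$ docker run -d --net petsBridge --name web2 -p 127.0.0.1:9000:5000 chrch/docker-pets:1.0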
[Figure: bridge driver on a single host]

The application is exposed locally on this host on port 8000 on all of its interfaces. Also supplied is DB=db, providing the name of the backend container. The Docker Engine's built-in DNS resolves this container name to the IP address of db. Since bridge is a local driver, the scope of DNS resolution is only on a single host.
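
A quick way to confirm that the db name resolves only within petsBridge; a sketch, assuming the busybox image:

#Name resolution works from another container on the same network...
$ docker run --rm --net petsBridge busybox nslookup db
#...but a container on a different network cannot resolve the name
$ docker run --rm busybox nslookup db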

The output below shows us that our containers have been assigned private IPs from the 172.19.0.0/24 IP space of the petsBridge network. Docker uses the built-in IPAM driver to provide an IP from the appropriate subnet if no other IPAM driver is specified.

$ docker inspect --format {{.NetworkSettings.Networks.petsBridge.IPAddress}} web
172.19.0.3

$ docker inspect --format {{.NetworkSettings.Networks.petsBridge.IPAddress}} db
172.19.0.2
posted @ 2017-09-13 19:44  ethan2lee