NVGRE

How GRE Works (RFC 2784)

Structure of a GRE Encapsulated Packet



   A GRE encapsulated packet has the form:

    ---------------------------------
    |                               |
    |       Delivery Header         |
    |                               |
    ---------------------------------
    |                               |
    |       GRE Header              |
    |                               |
    ---------------------------------
    |                               |
    |       Payload packet          |
    |                               |
    ---------------------------------

   This specification is generally concerned with the structure of the
   GRE header, although special consideration is given to some of the
   issues surrounding IPv4 payloads.

GRE Header



   The GRE packet header has the form:

     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |C|       Reserved0       | Ver |         Protocol Type         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |      Checksum (optional)      |       Reserved1 (Optional)    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
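
As a rough illustration of the layout above, the sketch below packs the first 32 bits of an RFC 2784 header (C bit, Reserved0, Ver, Protocol Type) into hex with plain shell arithmetic. The function name `gre_base_header` is made up for these notes, and the optional checksum word is not computed.

```shell
# Hypothetical helper: print the first 32 bits of a GRE header as 8 hex digits.
# $1 = C (Checksum Present) bit, 0 or 1
# $2 = Protocol Type (EtherType of the payload, e.g. 0x0800 for IPv4)
gre_base_header() {
    local c_bit=$1
    local proto=$2
    # Bit 0 (C) is the most significant bit of the 32-bit word;
    # Reserved0 (bits 1-12) and Ver (bits 13-15) are left as zero.
    local word=$(( (($c_bit & 1) << 31) | ($proto & 0xFFFF) ))
    printf '%08x\n' "$word"
}

gre_base_header 0 0x0800   # prints 00000800
gre_base_header 1 0x0800   # prints 80000800
```

With C = 0 the header is just these 4 bytes; with C = 1 a further 4 bytes (Checksum plus Reserved1) follow.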

 

Key and Sequence Number Extensions to GRE (RFC 2890)

 

Extensions to GRE Header



   The GRE packet header[1] has the following format:

     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |C|       Reserved0       | Ver |         Protocol Type         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |      Checksum (optional)      |       Reserved1 (Optional)    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The proposed GRE header will have the following format:

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |C| |K|S| Reserved0       | Ver |         Protocol Type         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Checksum (optional)      |       Reserved1 (Optional)    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Key (optional)                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 Sequence Number (Optional)                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


     Key Present (bit 2)

     If the Key Present bit is set to 1, then it indicates that the
     Key field is present in the GRE header.  Otherwise, the Key
     field is not present in the GRE header.

     Sequence Number Present (bit 3)

     If the Sequence Number Present bit is set to 1, then it
     indicates that the Sequence Number field is present.
     Otherwise, the Sequence Number field is not present in the GRE
     header.

     The Key and the Sequence Present bits are chosen to be
     compatible with RFC 1701 [2].
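
To make the flag bits concrete, here is a small sketch that reads the first 16 bits of a GRE header and works out how long the header is. It assumes each optional word (Checksum plus Reserved1, Key, Sequence Number) is 4 bytes, as drawn in the diagram above; `gre_header_len` is a name invented for this example.

```shell
# Hypothetical helper: given the first 16 bits of a GRE header (flags + version),
# print the total GRE header length in bytes.
gre_header_len() {
    local flags=$1
    local c=$(( ($flags >> 15) & 1 ))   # bit 0: Checksum Present
    local k=$(( ($flags >> 13) & 1 ))   # bit 2: Key Present
    local s=$(( ($flags >> 12) & 1 ))   # bit 3: Sequence Number Present
    # 4 bytes of base header, plus 4 bytes for every optional word that is present.
    echo $(( 4 + 4 * $c + 4 * $k + 4 * $s ))
}

gre_header_len 0x0000   # prints 4  (plain RFC 2784 header)
gre_header_len 0x2000   # prints 8  (Key only)
gre_header_len 0x3000   # prints 12 (Key + Sequence Number)
```

Bit 0 in the RFC diagrams is the most significant bit, which is why the C bit is tested with a shift of 15.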

 

NVGRE (RFC 7637)

NVGRE is a tunneling protocol built on GRE as defined in RFC 2784 and extended by RFC 2890. Reference: Microsoft's blog.

Outer Ethernet Header:
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                (Outer) Destination MAC Address                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |(Outer)Destination MAC Address |  (Outer)Source MAC Address    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  (Outer) Source MAC Address                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Optional Ethertype=C-Tag 802.1Q| Outer VLAN Tag Information    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Ethertype 0x0800        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Outer IPv4 Header:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Version|  HL   |Type of Service|          Total Length         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Identification        |Flags|      Fragment Offset    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Time to Live | Protocol 0x2F |         Header Checksum       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      (Outer) Source Address                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  (Outer) Destination Address                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

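GRE Header:

The Key Present (K) bit is set to 1, and the Checksum Present (C) and Sequence Number Present (S) bits are 0.
   Protocol Type field in the GRE header is set to 0x6558 (Transparent Ethernet Bridging)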
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0| |1|0|   Reserved0     | Ver |   Protocol Type 0x6558        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               Virtual Subnet ID (VSID)        |    FlowID     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Inner Ethernet Header:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                (Inner) Destination MAC Address                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |(Inner)Destination MAC Address |  (Inner)Source MAC Address    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  (Inner) Source MAC Address                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Ethertype 0x0800        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Inner IPv4 Header:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Version|  HL   |Type of Service|          Total Length         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Identification        |Flags|      Fragment Offset    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Time to Live |    Protocol   |         Header Checksum       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Source Address                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Destination Address                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Options                    |    Padding    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Original IP Payload                      |
   |                                                               |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

               Figure 1: GRE Encapsulation Frame Format
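
The GRE part of Figure 1 can be sketched as follows: the first word carries K=1 and Protocol Type 0x6558, and the 32-bit Key field is split into a 24-bit VSID and an 8-bit FlowID. `nvgre_header` is a hypothetical helper written for these notes; it only prints the 8 header bytes as hex and does not send anything.

```shell
# Hypothetical helper: print the 8-byte GRE header used by NVGRE as hex.
# $1 = 24-bit Virtual Subnet ID (VSID), $2 = 8-bit FlowID
nvgre_header() {
    local vsid=$1
    local flowid=$2
    # First word: C=0, K=1, S=0, Ver=0 gives flags 0x2000; Protocol Type is 0x6558.
    local word0=$(( (0x2000 << 16) | 0x6558 ))
    # Second word: the Key field, VSID in the upper 24 bits and FlowID in the lower 8.
    local word1=$(( (($vsid & 0xFFFFFF) << 8) | ($flowid & 0xFF) ))
    printf '%08x%08x\n' "$word0" "$word1"
}

nvgre_header 5001 0   # prints 2000655800138900
```

Here 5001 is 0x001389, so the Key word becomes 0x00138900 when the FlowID is 0.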

The best starting place is to first lay out the addressing scheme for the IP addresses and subnets that you'd like to virtualize.  When configuring Hyper-V Network Virtualization, there are two types of IP addresses that you'll be interacting with:

  • Provider Addresses (PA) - these are unique IP addresses assigned to each Hyper-V host that are routable across the physical network infrastructure.  I like to think of "PA" addresses as "Physical Addresses", because they are assigned to physical Hyper-V hosts.  Each Hyper-V host requires at least one PA to be assigned.
     
  • Customer Addresses (CA) - these are unique IP addresses assigned to each Virtual Machine that will be participating on a virtualized network.  I like to think of "CA" addresses as "Container Addresses", because they are the IP Addresses assigned to each VM "container" for use by the guest operating system running inside that VM.  Using NVGRE, multiple CA's for VMs running on a Hyper-V host can be tunneled using a single PA on that Hyper-V host.  CA's must be unique across all VMs on the same virtualized network, but CA's do not need to be unique across virtualized networks (such as in multi-tenant scenarios where each customer's VMs are isolated on separate virtualized networks).

Let's look at a simple example of NVGRE with two Hyper-V hosts using PA's and CA's:

In this example, you'll note that each Hyper-V host is assigned one PA address ( e.g., 192.168.x.x ) used for tunneling NVGRE traffic across two physical subnets ( e.g., 192.168.1.x/24 and 192.168.2.x/24 ) on the physical network.  In addition, each VM is assigned a CA address ( e.g., 10.x.x.x ) that is unique within each virtualized network and is tunneled inside the NVGRE tunnel between hosts. 

To separate the traffic between the two virtualized networks, the GRE headers on the tunneled packets include a GRE Key that provides a unique Virtual Subnet ID ( e.g., 5001 and 6001 ) for each virtualized network. 

Based on this configuration, we have two virtualized networks ( e.g., the "Red" network and the "Blue" network ) that are isolated from one another as separate IP networks and extended across two physical Hyper-V hosts located on two different physical subnets.

Once you have the following defined for your environment in a worksheet, you're ready to move on to the next steps in configuring Hyper-V Network Virtualization:

    • PA's for each Hyper-V Host
    • CA's for each Virtual Machine
    • Virtual Subnet ID's for each subnet to be virtualized
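
The worksheet described above boils down to a lookup from (Virtual Subnet ID, CA) to the PA of the host running that VM. The sketch below models it with a `case` statement; every address in it is invented for illustration (the original figure is not reproduced here), and `lookup_pa` is a hypothetical name.

```shell
# Hypothetical lookup table: "<VSID> <CA>" -> PA of the Hyper-V host running that VM.
# All addresses below are made up for illustration only.
lookup_pa() {
    case "$1 $2" in
        "5001 10.0.0.5") echo 192.168.1.10 ;;   # "Red" VM on host 1
        "5001 10.0.0.7") echo 192.168.2.20 ;;   # "Red" VM on host 2
        "6001 10.0.0.5") echo 192.168.1.10 ;;   # "Blue" VM on host 1, same CA as Red but a different VSID
        "6001 10.0.0.7") echo 192.168.2.20 ;;   # "Blue" VM on host 2
        *) return 1 ;;                          # no mapping for this VM
    esac
}

lookup_pa 5001 10.0.0.7   # prints 192.168.2.20
lookup_pa 6001 10.0.0.5   # prints 192.168.1.10
```

The table shows both properties from the text: several CAs share one PA per host, and the same CA can appear in two virtualized networks because the VSID keeps them apart.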

Understanding Neutron (3): Open vSwitch + GRE/VxLAN Networking [Neutron Open vSwitch + GRE/VxLAN Virtual Network]

Tunneling And Network Virtualization: NVGRE, VXLAN

 

Demo:

How to create a GRE tunnel on Linux

Using GRE Tunnels with Open vSwitch

Does plain GRE need an ARP proxy?

Script: gre.sh

 

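#!/bin/bash
# Prerequisite: sudo apt install bridge-utils
REMOTE_IP=$1  # physical IP of the remote tunnel endpoint
SUBNET=$2     # HOST1: 192.168.0.1, HOST2: 192.169.0.1
GREIP=$3      # HOST1: 10.10.10.1, HOST2: 10.10.10.2
R_GREIP=$4    # HOST1: 10.10.10.2, HOST2: 10.10.10.1
DEV=$5        # local physical interface, e.g. ens3
# Use the first IPv4 address configured on $DEV as the local tunnel endpoint
LOCAL_IP=$(ip addr show "$DEV" | awk '/inet /{split($2,a,"/"); print a[1]; exit}')
sudo ip tunnel add gre1 mode gre remote "$REMOTE_IP" local "$LOCAL_IP" ttl 255
sudo ip link set gre1 up

sudo ip addr add "$GREIP/24" dev gre1  # assign an IP address to the GRE interface
# sudo ip route add ${SUBNET%.*}.0/24 via $R_GREIP dev gre1  # does not work
sudo ip route add "${SUBNET%.*}.0/24" dev gre1  # route ${SUBNET%.*}.0/24 through gre1
# Enable IP forwarding ("sudo echo 1 > /proc/sys/net/ipv4/ip_forward" fails: the redirect runs without root)
sudo sysctl -w net.ipv4.ip_forward=1
# HOST1
sudo iptables -t nat -A POSTROUTING -d "${SUBNET%.*}.0/24" -j SNAT --to-source "$GREIP"  # otherwise the ${SUBNET%.*}.0/24 subnet is unreachable

# HOST2
sudo iptables -t nat -A POSTROUTING -s "$GREIP" -d "${SUBNET%.*}.0/24" -j SNAT --to-source "$SUBNET"  # otherwise hosts such as 192.168.1.x cannot reach the 10.1.1.x subnet
sudo iptables -A FORWARD -s "$GREIP" -p tcp -m state --state NEW -m tcp --dport 3306 -j DROP  # block direct access to production port 3306 so the internal network is not compromised

sudo brctl addbr br1  # sudo ifconfig br1 192.169.0.7/24
sudo ip link set br1 up
# sudo brctl addif br1 gre1  # does not work (likely because gre1 is a layer-3 device, not an Ethernet-like port such as gretap)

sudo ip link add type veth
sudo ifconfig veth0 "${SUBNET%.*}.7/24" up
sudo ifconfig veth0 mtu 1450
sudo ifconfig veth1 up
sudo ifconfig veth1 mtu 1450
sudo brctl addif br1 veth1

ip route show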

 

on host 1:   $ ./gre.sh 10.0.0.52  192.168.0.1 10.10.10.1 10.10.10.2 ens3

on host 2:   $ ./gre.sh 10.0.0.32  192.169.0.1 10.10.10.2 10.10.10.1 ens3
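
The script builds its route and veth address from the second argument using the `${SUBNET%.*}` parameter expansion. A minimal sketch of what that expansion yields for the two invocations above (the helper name `subnet_cidr` is made up for this note):

```shell
# Hypothetical helper mirroring the script's ${SUBNET%.*}.0/24 expansion.
subnet_cidr() {
    local addr=$1
    # ${addr%.*} removes the shortest trailing ".<text>", i.e. the last octet.
    echo "${addr%.*}.0/24"
}

subnet_cidr 192.168.0.1   # prints 192.168.0.0/24 (host 1)
subnet_cidr 192.169.0.1   # prints 192.169.0.0/24 (host 2)
```

The same expansion with `.7/24` appended gives the veth0 address, for example 192.168.0.7/24 on host 1. It only makes sense for /24 networks, since the prefix length is hard-coded.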

 

on host 1

sudo ovs-vsctl add-br br0
sudo ovs-vsctl add-port br0 tep0 -- set interface tep0 type=internal
sudo ifconfig tep0 192.168.200.20 netmask 255.255.255.0
sudo ovs-vsctl add-br br2
sudo ovs-vsctl add-port br2 gre0 -- set interface gre0 type=gre options:remote_ip=192.168.200.21
route

 

# ip link add br0 type bridge

sudo ip tuntap add mode tap

sudo ifconfig tap0 192.168.200.20 netmask 255.255.255.0

sudo ip link set tap0 up

sudo ip link set br1 up

sudo brctl addif br1 tap0

sudo brctl addif br1 ens3  # this command cuts off network access

 

sudo ip link add type veth
sudo ifconfig veth0 192.167.0.6/24 up
sudo ifconfig veth0 mtu 1450
sudo ifconfig veth1 up
sudo ifconfig veth1 mtu 1450
sudo ovs-vsctl add-port br2 veth1

 

$ sudo ovs-vsctl add-port br0 ens3  # this command cuts off network access

on host 2

sudo ovs-vsctl add-br br0
sudo ovs-vsctl add-port br0 tep0 -- set interface tep0 type=internal
sudo ifconfig tep0 192.168.200.21 netmask 255.255.255.0
sudo ovs-vsctl add-br br2
sudo ovs-vsctl add-port br2 gre0 -- set interface gre0 type=gre options:remote_ip=192.168.200.20
route

 

$ sudo ovs-vsctl show
ffb98c3f-a7a4-4287-b84a-c7c2b2616c72
    Bridge "br0"
        Port "tep0"
            Interface "tep0"
                type: internal
        Port "br0"
            Interface "br0"
                type: internal
    Bridge "br2"
        Port "br2"
            Interface "br2"
                type: internal
        Port "gre0"
            Interface "gre0"
                type: gre
                options: {remote_ip="192.168.200.21"}
    ovs_version: "2.5.2"
$ route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         localhost       0.0.0.0         UG    0      0        0 ens3
10.0.0.0        *               255.255.255.0   U     0      0        0 ens3
169.254.169.254 localhost       255.255.255.255 UGH   0      0        0 ens3
192.168.200.0   *               255.255.255.0   U     0      0        0 tep0

 

$ sudo ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether fa:16:3e:88:b0:29 brd ff:ff:ff:ff:ff:ff
3: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1
    link/ether b6:98:ba:ee:7d:b6 brd ff:ff:ff:ff:ff:ff
4: br2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1
    link/ether a2:58:66:5a:94:4a brd ff:ff:ff:ff:ff:ff
5: br0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1
    link/ether 3e:2f:8d:26:56:47 brd ff:ff:ff:ff:ff:ff
6: tep0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1
    link/ether 62:32:8c:1d:2b:99 brd ff:ff:ff:ff:ff:ff
7: gre0@NONE: <NOARP> mtu 1476 qdisc noop state DOWN mode DEFAULT group default qlen 1
    link/gre 0.0.0.0 brd 0.0.0.0
8: gretap0@NONE: <BROADCAST,MULTICAST> mtu 1462 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
9: gre_sys@NONE: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65490 qdisc pfifo_fast master ovs-system state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether a6:ee:6f:a2:0e:22 brd ff:ff:ff:ff:ff:ff
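
The MTU values shown for gre0 (1476) and gretap0 (1462) are consistent with subtracting the encapsulation overhead from a 1500-byte link. A rough sketch of that arithmetic, assuming an outer IPv4 header of 20 bytes without options and a 4-byte GRE header without Key, Checksum or Sequence Number (`tunnel_mtu` is a hypothetical helper):

```shell
# Hypothetical helper: estimate the payload MTU of a GRE tunnel.
# $1 = MTU of the underlying link, $2 = tunnel mode (gre or gretap)
tunnel_mtu() {
    local link_mtu=$1
    local mode=$2
    local overhead=$(( 20 + 4 ))            # outer IPv4 header + base GRE header
    if [ "$mode" = "gretap" ]; then
        overhead=$(( $overhead + 14 ))      # inner Ethernet header carried inside the tunnel
    fi
    echo $(( $link_mtu - $overhead ))
}

tunnel_mtu 1500 gre      # prints 1476
tunnel_mtu 1500 gretap   # prints 1462
tunnel_mtu 1450 gre      # prints 1426
```

Under the same assumptions, a tunnel running over ens3 (MTU 1450 in this output) would have room for 1426 bytes of payload, and each optional GRE word such as the Key costs another 4 bytes.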

 

$ sudo ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc pfifo_fast state UP group default qlen 1000
    link/ether fa:16:3e:88:b0:29 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.54/24 brd 10.0.0.255 scope global ens3
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe88:b029/64 scope link
       valid_lft forever preferred_lft forever
3: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1
    link/ether b6:98:ba:ee:7d:b6 brd ff:ff:ff:ff:ff:ff
4: br2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1
    link/ether a2:58:66:5a:94:4a brd ff:ff:ff:ff:ff:ff
5: br0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1
    link/ether 3e:2f:8d:26:56:47 brd ff:ff:ff:ff:ff:ff
6: tep0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1
    link/ether 62:32:8c:1d:2b:99 brd ff:ff:ff:ff:ff:ff
    inet 192.168.200.20/24 brd 192.168.200.255 scope global tep0
       valid_lft forever preferred_lft forever
    inet6 fe80::6032:8cff:fe1d:2b99/64 scope link
       valid_lft forever preferred_lft forever
7: gre0@NONE: <NOARP> mtu 1476 qdisc noop state DOWN group default qlen 1
    link/gre 0.0.0.0 brd 0.0.0.0
8: gretap0@NONE: <BROADCAST,MULTICAST> mtu 1462 qdisc noop state DOWN group default qlen 1000
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
9: gre_sys@NONE: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65490 qdisc pfifo_fast master ovs-system state UNKNOWN group default qlen 1000
    link/ether a6:ee:6f:a2:0e:22 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::a4ee:6fff:fea2:e22/64 scope link
       valid_lft forever preferred_lft forever

 

$ sudo ovs-vsctl add-port br0 ens3   # this command cuts off network access

sudo ip link add type veth
sudo ifconfig veth0 192.167.0.6/24 up
sudo ifconfig veth0 mtu 1450
sudo ifconfig veth1 up
sudo ifconfig veth1 mtu 1450
sudo ovs-vsctl add-port br2 veth1

 

  

$ ip link help
...
TYPE := { vlan | veth | vcan | dummy | ifb | macvlan | macvtap |
bridge | bond | ipoib | ip6tnl | ipip | sit | vxlan |
gre | gretap | ip6gre | ip6gretap | vti | nlmon |
bond_slave | ipvlan | geneve | bridge_slave | vrf }

 

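Understanding GRE tunnels in depth: the difference between GRE and IPIP. An ipip tunnel is end-to-end, so communication over it can only be point-to-point, whereas a GRE tunnel can also carry multicast traffic.

The linked PPT includes captured GRE and IPIP packets that you can analyze.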

posted @ 2017-12-23 02:59  lvmxh