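Contents

  1. DHCP Traffic Analysis
    1. Environment
    2. Traffic Analysis
      1. Request phase
      2. Compute node
        1. Capturing the request
        2. br-int flow table analysis
          1. Table 0
          2. Table 25
          3. Table 60
        3. br-tun flow table analysis
          1. Table 0
          2. Table 2
          3. Table 22
      3. Control node
        1. br-tun flow table
          1. Table 0
          2. Table 4
          3. Table 10
        2. br-int flow table
          1. Table 0
          2. Table 60
      4. DHCP network namespace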

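DHCP Traffic Analysis

A DHCP client obtains an IP address from a DHCP server in four phases:

  1. Discovery: the client looks for DHCP servers by broadcasting a DHCP-DISCOVER message.
  2. Offer: a DHCP server that receives the DHCP-DISCOVER message selects an IP address according to its allocation priority and sends it, together with other parameters, to the client in a DHCP-OFFER message.
  3. Selection: if several DHCP servers send DHCP-OFFER messages, the client accepts only the first one it receives and then broadcasts a DHCP-REQUEST message containing the IP address offered in that DHCP-OFFER.
  4. Acknowledgement: only the server chosen by the client acts on the DHCP-REQUEST. If it confirms the allocation, it returns a DHCP-ACK; otherwise it returns a DHCP-NAK, indicating that the address cannot be assigned to the client.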
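
The four phases can be sketched as a tiny state table. This is only an illustration: the numeric values are the DHCP message type codes (option 53) from RFC 2132, while `NEXT_IN_HANDSHAKE` and `happy_path` are hypothetical names introduced here.

```python
from enum import IntEnum


class DhcpMessageType(IntEnum):
    """Values of DHCP option 53 (message type), per RFC 2132."""
    DISCOVER = 1
    OFFER = 2
    REQUEST = 3
    DECLINE = 4
    ACK = 5
    NAK = 6
    RELEASE = 7
    INFORM = 8


# Hypothetical table: which message(s) may follow in the initial exchange.
NEXT_IN_HANDSHAKE = {
    DhcpMessageType.DISCOVER: (DhcpMessageType.OFFER,),
    DhcpMessageType.OFFER: (DhcpMessageType.REQUEST,),
    DhcpMessageType.REQUEST: (DhcpMessageType.ACK, DhcpMessageType.NAK),
}


def happy_path():
    """Walk the exchange, always taking the first (successful) option."""
    msg = DhcpMessageType.DISCOVER
    path = [msg]
    while msg in NEXT_IN_HANDSHAKE:
        msg = NEXT_IN_HANDSHAKE[msg][0]
        path.append(msg)
    return path


print([m.name for m in happy_path()])  # ['DISCOVER', 'OFFER', 'REQUEST', 'ACK']
```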

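In general, an IP address assigned by a DHCP server has a limited lease period (except for automatically allocated addresses); this period is called the lease. When the lease expires, the server reclaims the address. If the client wants to keep using the address, it must ask for the lease to be extended.

  1. When about half of the lease time has elapsed, the client unicasts a DHCP-REQUEST message to the DHCP server that assigned the address in order to renew the lease.
  2. If the client may keep using the address, the server replies with a DHCP-ACK to tell the client that it has obtained a new lease; if the address can no longer be assigned to this client, the server replies with a DHCP-NAK to tell the client that it cannot obtain a new lease.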
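
As a rough sketch of the timing, the snippet below computes when renewal starts. It assumes the RFC 2131 default fractions (renewal time T1 at 0.5 of the lease, rebinding time T2 at 0.875); T2 is not discussed in this article and is included only for completeness. The one-day lease of 86400 seconds and the function name `renewal_timers` are assumptions for illustration.

```python
def renewal_timers(lease_seconds):
    """Return (T1, T2) in seconds using the RFC 2131 default fractions."""
    t1 = lease_seconds * 0.5    # renewal: unicast DHCP-REQUEST to the leasing server
    t2 = lease_seconds * 0.875  # rebinding: broadcast DHCP-REQUEST to any server
    return t1, t2


# Assumed example: a one-day lease.
print(renewal_timers(86400))  # (43200.0, 75600.0)
```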

The rest of this article analyzes the traffic for the lease renewal case.

Environment

Instance NIC

ip a

ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc fq_codel state UP group default qlen 1000
link/ether fa:16:3e:01:c2:a6 brd ff:ff:ff:ff:ff:ff
altname enp0s4
inet 172.16.0.193/24 brd 172.16.0.255 scope global dynamic noprefixroute ens4
valid_lft 85908sec preferred_lft 85908sec
inet6 fe80::61e4:d0ce:a45c:94ef/64 scope link noprefixroute
valid_lft forever preferred_lft forever

Instance port

// openstack port show da24f6b9-4f66-488c-bab0-0bd5a9862f0a -f json
{
"admin_state_up": true,
"allowed_address_pairs": [],
"binding_host_id": "compute1",
"binding_profile": {},
"binding_vif_details": {
"connectivity": "l2",
"port_filter": true,
"ovs_hybrid_plug": true,
"datapath_type": "system",
"bridge_name": "br-int"
},
"binding_vif_type": "ovs",
"binding_vnic_type": "normal",
"created_at": "2022-09-26T01:34:07Z",
"data_plane_status": null,
"description": "",
"device_id": "30a57936-d0b4-446a-82df-41eb9681e8cc",
"device_owner": "compute:nova",
"dns_assignment": null,
"dns_domain": null,
"dns_name": null,
"extra_dhcp_opts": [],
"fixed_ips": [
{
"subnet_id": "c4758750-17c5-494a-88c5-ebac4558777e",
"ip_address": "172.16.0.193"
}
],
"id": "da24f6b9-4f66-488c-bab0-0bd5a9862f0a",
"ip_allocation": null,
"mac_address": "fa:16:3e:01:c2:a6",
"name": "",
"network_id": "7b6c69b5-bfa5-4145-901c-26979103da77",
"numa_affinity_policy": null,
"port_security_enabled": true,
"project_id": "8e4be2cb25084defb5a9f0128afd5291",
"propagate_uplink_status": null,
"qos_network_policy_id": null,
"qos_policy_id": null,
"resource_request": null,
"revision_number": 5,
"security_group_ids": [
"238bde14-5bbe-4924-9ef2-3a689353148f"
],
"status": "ACTIVE",
"tags": [],
"trunk_details": null,
"updated_at": "2022-09-26T01:41:04Z"
}

Instance network ID

{
"network_id": "7b6c69b5-bfa5-4145-901c-26979103da77"
}

The DHCP network namespace and its network devices:

ip netns exec qdhcp-7b6c69b5-bfa5-4145-901c-26979103da77 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
66: tap7b7a17cf-0e: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether fa:16:3e:a4:6f:40 brd ff:ff:ff:ff:ff:ff
inet 172.16.0.2/24 brd 172.16.0.255 scope global tap7b7a17cf-0e
valid_lft forever preferred_lft forever
inet6 fe80::a9fe:a9fe/64 scope link dadfailed tentative
valid_lft forever preferred_lft forever
inet6 fe80::f816:3eff:fea4:6f40/64 scope link
valid_lft forever preferred_lft forever

From the port ID we can identify the following devices on the compute node.

# Linux bridge
qbrda24f6b9-4f: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default qlen 1000
link/ether b6:99:35:a2:2a:ea brd ff:ff:ff:ff:ff:ff

# port on OVS
qvoda24f6b9-4f@qvbda24f6b9-4f: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default qlen 1000
link/ether 6e:17:ad:a7:83:d3 brd ff:ff:ff:ff:ff:ff
inet6 fe80::6c17:adff:fea7:83d3/64 scope link
valid_lft forever preferred_lft forever

# device on the Linux bridge
qvbda24f6b9-4f@qvoda24f6b9-4f: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master qbrda24f6b9-4f state UP group default qlen 1000
link/ether b6:99:35:a2:2a:ea brd ff:ff:ff:ff:ff:ff
inet6 fe80::b499:35ff:fea2:2aea/64 scope link
valid_lft forever preferred_lft forever

# tap device
tapda24f6b9-4f: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc fq_codel master qbrda24f6b9-4f state UNKNOWN group default qlen 1000
link/ether fe:16:3e:01:c2:a6 brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc16:3eff:fe01:c2a6/64 scope link
valid_lft forever preferred_lft forever
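
The device names in this listing all share the suffix da24f6b9-4f, which is the first 11 characters of the port ID. A minimal sketch of that naming pattern follows; the length of 11 is inferred from the names shown here, and `hybrid_plug_devices` is a hypothetical helper, not a Neutron or Nova API.

```python
def hybrid_plug_devices(port_id, suffix_len=11):
    """Derive the per-port device names seen in the listing from a port ID.

    suffix_len=11 is inferred from the listing, not taken from any API.
    """
    suffix = port_id[:suffix_len]
    return {
        "tap": "tap" + suffix,  # tap device backing the instance NIC
        "qbr": "qbr" + suffix,  # Linux bridge
        "qvb": "qvb" + suffix,  # veth end on the Linux bridge
        "qvo": "qvo" + suffix,  # veth end plugged into OVS (br-int)
    }


names = hybrid_plug_devices("da24f6b9-4f66-488c-bab0-0bd5a9862f0a")
print(names["qvo"])  # qvoda24f6b9-4f
```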

Use ovs-vsctl to look up qvoda24f6b9-4f.

ovs-vsctl show

Bridge br-int
Port qvoda24f6b9-4f
# tag is the VLAN tag
tag: 9
Interface qvoda24f6b9-4f

Traffic Analysis

Inside the instance, run dhclient to send a DHCP request.

dhclient ens4

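Request phase

The DHCP client sends a DHCP REQUEST message to the server, so the traffic flows from the instance to the DHCP server.

Compute node

Capturing the request

Capture packets with tcpdump on the tap device on the compute node.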

tcpdump -i tapda24f6b9-4f udp -n

listening on tapda24f6b9-4f, link-type EN10MB (Ethernet), capture size 262144 bytes
09:55:23.628102 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:01:c2:a6, length 300
09:55:23.633056 IP 172.16.0.2.bootps > 172.16.0.193.bootpc: BOOTP/DHCP, Reply, length 363

fa:16:3e:01:c2:a6 is the MAC address of the instance NIC (the tap device itself uses fe:16:3e:01:c2:a6).

Next, capture on the Linux bridge.

tcpdump -i qbrda24f6b9-4f udp -n

listening on qbrda24f6b9-4f, link-type EN10MB (Ethernet), capture size 262144 bytes
09:56:28.704214 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:01:c2:a6, length 300
09:56:28.707607 IP 172.16.0.2.bootps > 172.16.0.193.bootpc: BOOTP/DHCP, Reply, length 363

Capture on qvbda24f6b9-4f, the device attached to the Linux bridge.

tcpdump -i qvbda24f6b9-4f udp -n

listening on qvbda24f6b9-4f, link-type EN10MB (Ethernet), capture size 262144 bytes
09:57:23.918547 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:01:c2:a6, length 300
09:57:23.921620 IP 172.16.0.2.bootps > 172.16.0.193.bootpc: BOOTP/DHCP, Reply, length 363

Then capture on qvoda24f6b9-4f, the port on OVS.

tcpdump -i qvoda24f6b9-4f udp -n

listening on qvoda24f6b9-4f, link-type EN10MB (Ethernet), capture size 262144 bytes
09:58:06.299939 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:01:c2:a6, length 300
09:58:06.303494 IP 172.16.0.2.bootps > 172.16.0.193.bootpc: BOOTP/DHCP, Reply, length 363

br-int flow table analysis

The DHCP request now enters the OVS bridge br-int. Dump the br-int flow table with ovs-ofctl.

ovs-ofctl dump-flows br-int

The br-int flows are shown below (some rules omitted).

Table 0
cookie=0x7cf0a52c49ad76d9, duration=1096.467s, table=0, n_packets=0, n_bytes=0, priority=10,icmp6,in_port="qvoda24f6b9-4f",icmp_type=136 actions=resubmit(,24)

cookie=0x7cf0a52c49ad76d9, duration=1096.463s, table=0, n_packets=5, n_bytes=210, priority=10,arp,in_port="qvoda24f6b9-4f" actions=resubmit(,24)

cookie=0x7cf0a52c49ad76d9, duration=1096.483s, table=0, n_packets=84, n_bytes=8876, priority=9,in_port="qvoda24f6b9-4f" actions=resubmit(,25)

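in_port matches the port through which the packet enters the OVS bridge, here qvoda24f6b9-4f.
A DHCP packet is UDP over IPv4, so it matches neither the ICMPv6 rule nor the ARP rule; it falls through to the priority 9 rule and is resubmitted to Table 25.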
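
The priority-based selection can be sketched as follows. This is a simplified model, not how OVS is implemented: the three Table 0 rules above are reduced to plain dictionaries, and the field names (`proto`, `icmp_type`) and the `lookup` function are hypothetical.

```python
# (priority, match fields, table to resubmit to), modelled on the Table 0 dump.
TABLE0 = [
    (10, {"in_port": "qvoda24f6b9-4f", "proto": "icmp6", "icmp_type": 136}, 24),
    (10, {"in_port": "qvoda24f6b9-4f", "proto": "arp"}, 24),
    (9, {"in_port": "qvoda24f6b9-4f"}, 25),
]


def lookup(table, packet):
    """Return the target table of the highest-priority rule whose fields all match."""
    for _priority, match, next_table in sorted(table, key=lambda rule: -rule[0]):
        if all(packet.get(field) == value for field, value in match.items()):
            return next_table
    return None  # no rule matched


dhcp_request = {"in_port": "qvoda24f6b9-4f", "proto": "udp",
                "udp_src": 68, "udp_dst": 67}
print(lookup(TABLE0, dhcp_request))  # 25
```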

Table 25
cookie=0x7cf0a52c49ad76d9, duration=1096.493s, table=25, n_packets=70, n_bytes=6828, priority=2,in_port="qvoda24f6b9-4f",dl_src=fa:16:3e:01:c2:a6 actions=resubmit(,60)

dl_src matches the source MAC address, here fa:16:3e:01:c2:a6, the MAC address of the instance.
The rule resubmits the packet to Table 60.

Table 60
cookie=0x7cf0a52c49ad76d9, duration=241159.738s, table=60, n_packets=5071109852, n_bytes=3104632181631, priority=3 actions=NORMAL

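The rule applies NORMAL processing, i.e. br-int forwards the packet like an ordinary L2 switch. For this broadcast request, that means the packet leaves br-int through patch-tun.

br-tun flow table analysis

As noted above, the packet leaves br-int through patch-tun. patch-tun and patch-int are a pair of OVS patch ports that are peers of each other, so the packet arrives on patch-int of br-tun.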

ovs-vsctl show

Bridge br-int
Port patch-tun
Interface patch-tun
type: patch
options: {peer=patch-int}

Bridge br-tun
Port patch-int
Interface patch-int
type: patch
options: {peer=patch-tun}

Dump the br-tun flow table.

ovs-ofctl dump-flows br-tun

The br-tun flows are shown below (some rules omitted).

Table 0
cookie=0xc8bd5b0435a0830b, duration=6952.218s, table=0, n_packets=540130006, n_bytes=318000930632, priority=1,in_port="patch-int" actions=resubmit(,2)

in_port matches patch-int, mentioned above.
The rule resubmits the packet to Table 2.

Table 2
cookie=0xc8bd5b0435a0830b, duration=6952.216s, table=2, n_packets=866663, n_bytes=36404220, priority=1,arp,dl_dst=ff:ff:ff:ff:ff:ff actions=resubmit(,21)
cookie=0xc8bd5b0435a0830b, duration=6952.214s, table=2, n_packets=536889805, n_bytes=317711303942, priority=0,dl_dst=00:00:00:00:00:00/01:00:00:00:00:00 actions=resubmit(,20)
cookie=0xc8bd5b0435a0830b, duration=6952.213s, table=2, n_packets=2373536, n_bytes=253222354, priority=0,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00 actions=resubmit(,22)

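dl_dst matches the destination Ethernet MAC address.
00:00:00:00:00:00/01:00:00:00:00:00 matches unicast destinations (the group bit of the first octet is 0), i.e. one-to-one traffic.
01:00:00:00:00:00/01:00:00:00:00:00 matches multicast and broadcast destinations (the group bit is 1).

The DHCP client broadcasts its request to ff:ff:ff:ff:ff:ff, and the packet is not ARP, so the rule resubmits it to Table 22.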
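
The value/mask notation can be checked with a few lines of integer arithmetic. The sketch below only illustrates the masking logic for the two priority 0 rules of Table 2; `mac_to_int`, `masked_match` and `next_table` are hypothetical helper names.

```python
def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to a 48-bit integer."""
    return int(mac.replace(":", ""), 16)


def masked_match(dl_dst, rule):
    """Evaluate a 'value/mask' match against a destination MAC address."""
    value, mask = (mac_to_int(part) for part in rule.split("/"))
    return (mac_to_int(dl_dst) & mask) == (value & mask)


UNICAST = "00:00:00:00:00:00/01:00:00:00:00:00"
GROUP = "01:00:00:00:00:00/01:00:00:00:00:00"  # multicast and broadcast


def next_table(dl_dst):
    """Table 2 decision for non-ARP traffic: 22 for group addresses, else 20."""
    return 22 if masked_match(dl_dst, GROUP) else 20


print(next_table("ff:ff:ff:ff:ff:ff"))  # 22: the broadcast DHCP request
print(next_table("fa:16:3e:a4:6f:40"))  # 20: a unicast destination
```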

Table 22
cookie=0xc8bd5b0435a0830b, duration=9138.161s, table=22, n_packets=99, n_bytes=9602, priority=1,dl_vlan=9 actions=strip_vlan,load:0x8->NXM_NX_TUN_ID[],output:"vxlan-0a0a0f0b"

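As shown earlier, the tag of qvoda24f6b9-4f on br-int is 9, which is exactly the VLAN ID matched here, so the packet is sent out of vxlan-0a0a0f0b.

Before that, the VLAN tag is stripped and the tunnel ID is set to the VNI (the VXLAN ID), which is 8.

vxlan-0a0a0f0b is a port on br-tun.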

Bridge br-tun
Port vxlan-0a0a0f0b
Interface vxlan-0a0a0f0b
type: vxlan
options: {df_default="true", egress_pkt_mark="0", in_key=flow, local_ip="10.10.15.12", out_key=flow, remote_ip="10.10.15.11"}
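
The port name appears to encode the remote tunnel endpoint: 0a0a0f0b is 10.10.15.11 written as eight hex digits. A small sketch of that observation follows; `vxlan_port_name` is a hypothetical helper, and the naming rule is inferred from this output rather than quoted from Neutron.

```python
import ipaddress


def vxlan_port_name(remote_ip):
    """Build 'vxlan-' plus the remote IPv4 address as eight hex digits."""
    return "vxlan-%08x" % int(ipaddress.IPv4Address(remote_ip))


print(vxlan_port_name("10.10.15.11"))  # vxlan-0a0a0f0b
print(vxlan_port_name("10.10.15.12"))  # vxlan-0a0a0f0c
```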

Control node

The packet sent above arrives at br-tun on the control node.

List the ports of br-tun.

ovs-vsctl show

Bridge br-tun
Port vxlan-0a0a0f0c
Interface vxlan-0a0a0f0c
type: vxlan
options: {df_default="true", egress_pkt_mark="0", in_key=flow, local_ip="10.10.15.11", out_key=flow, remote_ip="10.10.15.12"}

The packet enters br-tun through vxlan-0a0a0f0c.

br-tun flow table

Dump the br-tun flow table.

ovs-ofctl dump-flows br-tun

The br-tun flows are shown below (some rules omitted).

Table 0
cookie=0x2f0e5558da81d9b, duration=3888857.445s, table=0, n_packets=6460187689, n_bytes=3882347621874, priority=1,in_port="vxlan-0a0a0f0c" actions=resubmit(,4)

in_port matches the port through which the packet enters br-tun, here vxlan-0a0a0f0c, so the packet is resubmitted to Table 4.

Table 4
cookie=0x2f0e5558da81d9b, duration=3888868.347s, table=4, n_packets=10248912, n_bytes=2724392934, priority=1,tun_id=0x8 actions=mod_vlan_vid:15,resubmit(,10)

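tun_id matches the VXLAN ID (VNI), i.e. the value 8 that Table 22 of br-tun on the compute node loaded into the tunnel ID.
The rule sets the VLAN ID of the packet to 15 and resubmits it to Table 10.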
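
Putting the two br-tun rules side by side shows that the VLAN ID is local to each node while the VNI identifies the network across nodes. The sketch below simply replays the values from the two flow dumps; the dictionary and function names are hypothetical.

```python
# Values taken from the flow dumps: Table 22 on the compute node, Table 4 on the control node.
COMPUTE_VLAN_TO_VNI = {9: 0x8}   # dl_vlan=9 -> strip_vlan, load:0x8->NXM_NX_TUN_ID[]
CONTROL_VNI_TO_VLAN = {0x8: 15}  # tun_id=0x8 -> mod_vlan_vid:15


def translate(compute_vlan):
    """Follow a packet's tag from the compute node's VLAN to the control node's VLAN."""
    vni = COMPUTE_VLAN_TO_VNI[compute_vlan]
    return vni, CONTROL_VNI_TO_VLAN[vni]


print(translate(9))  # (8, 15)
```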

Table 10

Table 10 contains exactly one rule: it installs a learned flow into Table 20 and, at the same time, outputs the packet on patch-int.

cookie=0x2f0e5558da81d9b, duration=3888896.210s, table=10, n_packets=14070316277, n_bytes=9137771262107, priority=1 actions=learn(table=20,hard_timeout=300,priority=1,cookie=0x2f0e5558da81d9b,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:OXM_OF_IN_PORT[]),output:"patch-int"

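actions=learn(table=20…) installs the learned flow into Table 20.
NXM_OF_VLAN_TCI[0..11] uses the VLAN ID of the current packet as the VLAN ID in the match of the learned flow, i.e. the value 15 mentioned above.
NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[] uses the source MAC address of the current packet as the destination MAC address in the match.
load:0->NXM_OF_VLAN_TCI[] means the action of the learned flow removes the VLAN ID.
load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[] means the action encapsulates the packet into the tunnel, using the tunnel ID of the current packet.
output:OXM_OF_IN_PORT[] means the action outputs the packet on the port through which the current packet came in.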
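
To make the learn action more concrete, the sketch below builds the flow that would be installed in Table 20 for the DHCP request. It is only a model of the semantics described above: the dictionary layout and the names `learned_flow`, `load_vlan_tci` and `load_tun_id` are hypothetical and are not OVS syntax.

```python
def learned_flow(packet):
    """Model the Table 20 flow derived from a packet that hits Table 10."""
    return {
        "table": 20,
        "hard_timeout": 300,
        "priority": 1,
        "match": {
            "vlan_vid": packet["vlan_vid"],  # NXM_OF_VLAN_TCI[0..11]
            "dl_dst": packet["dl_src"],      # NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[]
        },
        "actions": [
            ("load_vlan_tci", 0),               # load:0->NXM_OF_VLAN_TCI[]
            ("load_tun_id", packet["tun_id"]),  # load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[]
            ("output", packet["in_port"]),      # output:OXM_OF_IN_PORT[]
        ],
    }


request = {"vlan_vid": 15, "dl_src": "fa:16:3e:01:c2:a6",
           "tun_id": 0x8, "in_port": "vxlan-0a0a0f0c"}
flow = learned_flow(request)
print(flow["match"])
print(flow["actions"])
```

In this model, a later reply addressed to fa:16:3e:01:c2:a6 on VLAN 15 would match the learned flow and be sent back through vxlan-0a0a0f0c with tunnel ID 8.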

br-int flow table

Because patch-int and patch-tun are a pair of OVS patch ports, the packet enters br-int through patch-tun.

Dump the br-int flow table.

ovs-ofctl dump-flows br-int

The br-int flows are shown below (some rules omitted).

Table 0
cookie=0xa39f8d6fcda0e990, duration=3933781.023s, table=0, n_packets=17446431423, n_bytes=14674327912459, priority=0 actions=resubmit(,60)

The packet is resubmitted to Table 60.

Table 60
cookie=0xa39f8d6fcda0e990, duration=3933781.022s, table=60, n_packets=33187444161, n_bytes=26145058466699, priority=3 actions=NORMAL

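NORMAL processing forwards the packet like an ordinary switch according to its VLAN ID, i.e. the packet is delivered to the port whose VLAN ID is 15.

DHCP network namespace

As described above, the packet is delivered to the port whose VLAN ID is 15. Listing the ports on br-int shows that this port is tap7b7a17cf-0e.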

ovs-vsctl show 

Bridge br-int
Port tap7b7a17cf-0e
tag: 15
Interface tap7b7a17cf-0e
type: internal

tap7b7a17cf-0e is an OVS internal port that serves as the virtual NIC inside the DHCP network namespace.
The DHCP network namespace and its network devices:

ip netns exec qdhcp-7b6c69b5-bfa5-4145-901c-26979103da77 ip a
66: tap7b7a17cf-0e: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether fa:16:3e:a4:6f:40 brd ff:ff:ff:ff:ff:ff
inet 172.16.0.2/24 brd 172.16.0.255 scope global tap7b7a17cf-0e
valid_lft forever preferred_lft forever
inet6 fe80::a9fe:a9fe/64 scope link dadfailed tentative
valid_lft forever preferred_lft forever
inet6 fe80::f816:3eff:fea4:6f40/64 scope link
valid_lft forever preferred_lft forever

We can capture DHCP packets on tap7b7a17cf-0e.

ip netns exec qdhcp-7b6c69b5-bfa5-4145-901c-26979103da77 tcpdump -i tap7b7a17cf-0e udp -n

listening on tap7b7a17cf-0e, link-type EN10MB (Ethernet), capture size 262144 bytes
10:19:35.234314 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:f0:6c:c9, length 300
10:19:35.236297 IP 172.16.0.2.bootps > 172.16.0.77.bootpc: BOOTP/DHCP, Reply, length 346

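tap7b7a17cf-0e receives the request and sends a reply. (The sample capture above shows another client on the same network, fa:16:3e:f0:6c:c9 / 172.16.0.77; the request/reply pattern is the same.)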

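The forwarding path can also be checked with ovs-appctl ofproto/trace, which traces a hypothetical packet through the flow tables of a bridge:
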
ovs-appctl ofproto/trace br-int in_port=qvoda24f6b9-4f,udp,dl_src=fa:16:3e:01:c2:a6,dl_dst=ff:ff:ff:ff:ff:ff,nw_dst=255.255.255.255,udp_dst=67,udp_src=68

ovs-appctl ofproto/trace br-int in_port=qvo501b2fa7-e3,udp,dl_src=fa:16:3e:d3:6f:9d,dl_dst=ff:ff:ff:ff:ff:ff,nw_dst=255.255.255.255,udp_dst=67,udp_src=68

compute

ovs-appctl ofproto/trace br-int in_port=qvo42a41a92-b5,udp,dl_src=fa:16:3e:18:44:cd,dl_dst=ff:ff:ff:ff:ff:ff,nw_dst=255.255.255.255,udp_dst=67,udp_src=68

ovs-appctl ofproto/trace br-tun in_port=vxlan-0a0a0f0c,udp,dl_src=fa:16:3e:18:44:cd,dl_dst=ff:ff:ff:ff:ff:ff,nw_dst=255.255.255.255,udp_dst=67,udp_src=68
ovs-appctl ofproto/trace br-ex in_port=br-ex,udp,dl_src=fa:16:3e:18:44:cd,dl_dst=ff:ff:ff:ff:ff:ff,nw_dst=255.255.255.255,udp_dst=67,udp_src=68