
SONiC and P4 Simulation Experiment

The SONiC wiki's simulation experiment combining SONiC and P4 (https://github.com/Azure/SONiC/wiki/SONiC-P4-Software-Switch)

This experiment is helpful for understanding both SONiC and P4. I had previously only used Barefoot's native switchd and Stratum, so I used this experiment to learn more about SONiC; if I get the chance later, I will try running SONiC on real P4 hardware for comparison.

SONiC-P4 is a software switch. It runs on top of a P4 behavioral-model software switch that emulates the SAI behavior of an ASIC. SONiC-P4 is distributed as a Docker container, and this experiment likewise runs SONiC-P4 in Docker containers as the software switches.

Network Topology

(Figure: network topology)

  1. switch1, switch2, host1, and host2 all run as Docker containers;
  2. the containers are connected to each other through OVS bridges.

Running the Experiment

  1. Download the files needed for the simulation, sonic-p4-test.zip, and extract them; the archive contains the scripts and configuration files the containers need;

  2. Run ./install_docker_ovs.sh to install Docker and OVS; skip this step if they are already installed;

  3. The ./load_image.sh script from the original tutorial, which downloads the sonic-p4 Docker image from the SONiC Jenkins server, no longer works. I found the same image on Docker Hub, so I pulled it from there instead: run docker pull alihasanahmedkhan/docker-sonic-p4. If the ubuntu 14.04 image is not available locally, also run docker pull ubuntu:14.04;

  4. Run ./start.sh to start the containers and wire up the network topology; on success you should see containers like the following running:

    xxx@ubuntu:~$ docker ps
    CONTAINER ID        IMAGE                                      COMMAND             CREATED             STATUS              PORTS               NAMES
    2d8f8272ba4f        ubuntu:14.04                               "/bin/bash"         23 hours ago        Up 23 hours                             host2
    dc30fe84ac1e        ubuntu:14.04                               "/bin/bash"         23 hours ago        Up 23 hours                             host1
    4fe94a1879fd        alihasanahmedkhan/docker-sonic-p4:latest   "/bin/bash"         23 hours ago        Up 23 hours                             switch2
    9acb9dfc3696        alihasanahmedkhan/docker-sonic-p4:latest   "/bin/bash"         23 hours ago        Up 23 hours                             switch1
    

    The two ubuntu:14.04 containers are host1 and host2, and the two alihasanahmedkhan/docker-sonic-p4 containers are switch1 and switch2.

  5. After about 60 seconds, run ./test.sh to check connectivity between host1 and host2; you should get results like the following:

    xxx@ubuntu:~/Downloads/sonic/p4-test$ ./test.sh 
    [sudo] password for hui: 
    PING 192.168.2.1 (192.168.2.1) 56(84) bytes of data.
    64 bytes from 192.168.2.1: icmp_seq=1 ttl=64 time=8.54 ms
    
    --- 192.168.2.1 ping statistics ---
    1 packets transmitted, 1 received, 0% packet loss, time 0ms
    rtt min/avg/max/mdev = 8.545/8.545/8.545/0.000 ms
    PING 192.168.2.2 (192.168.2.2) 56(84) bytes of data.
    64 bytes from 192.168.2.2: icmp_seq=1 ttl=62 time=7.17 ms
    64 bytes from 192.168.2.2: icmp_seq=2 ttl=62 time=33.1 ms
    64 bytes from 192.168.2.2: icmp_seq=3 ttl=62 time=33.2 ms
    64 bytes from 192.168.2.2: icmp_seq=4 ttl=62 time=32.5 ms
    64 bytes from 192.168.2.2: icmp_seq=5 ttl=62 time=33.1 ms
    64 bytes from 192.168.2.2: icmp_seq=6 ttl=62 time=35.4 ms
    
  6. Run ./stop.sh to end the simulation.

Experiment Walkthrough

Configuration

The configuration for switch1 and switch2 lives in the extracted switch1/etc and switch2/etc directories. These configuration files are mapped into the containers, where the various SONiC processes use them at runtime.

Taking switch1 as an example, start.sh mounts the configuration and scripts into the container with Docker's -v volume parameter:

sudo docker run --net=none --privileged --entrypoint /bin/bash --name switch1 -it -d -v $PWD/switch1:/sonic alihasanahmedkhan/docker-sonic-p4:latest

The main configuration involved:

  1. VLAN configuration: switch1/etc/config_db/vlan_config.json is merged into config_db.json, which populates CONFIG_DB.

    The VLAN configuration in switch1/etc/config_db/vlan_config.json:

    {
       "VLAN": {
           "Vlan15": {
               "members": [
                   "Ethernet0"
               ], 
               "vlanid": "15"
           }, 
           "Vlan10": {
               "members": [
                   "Ethernet1"
               ], 
               "vlanid": "10"
           }
       },
       "VLAN_MEMBER": {
           "Vlan15|Ethernet0": {
               "tagging_mode": "untagged"
           },
           "Vlan10|Ethernet1": {
               "tagging_mode": "untagged"
           }
       },
       "VLAN_INTERFACE": {
           "Vlan15|10.0.0.0/31": {},
           "Vlan10|192.168.1.1/24": {}
       }
    }
    

    Merging the VLAN configuration in switch1/scripts/startup.sh:

    if [ -f /etc/sonic/config_db.json ]; then
       sonic-cfggen -j /etc/sonic/config_db.json -j /sonic/scripts/vlan_config.json --print-data > /tmp/config_db.json
       mv /tmp/config_db.json /etc/sonic/config_db.json
    else
       sonic-cfggen -j /sonic/etc/config_db/vlan_config.json --print-data > /etc/sonic/config_db.json
    fi
    
  2. BGP configuration for Quagga

    switch1/etc/quagga/bgpd.conf defines the local AS and the BGP neighbor:

    hostname bgpd
    password zebra
    enable password zebra
    log file /var/log/quagga/bgpd.log
    !
    router bgp 10001
     bgp router-id 192.168.1.1
     network 192.168.1.0 mask 255.255.255.0
     neighbor 10.0.0.1 remote-as 10002
     neighbor 10.0.0.1 timers 1 3
     neighbor 10.0.0.1 send-community
     neighbor 10.0.0.1 allowas-in
     maximum-paths 64
    !
    access-list all permit any
    

    Copying the Quagga configuration into place in switch1/scripts/startup.sh:

    cp -rf /sonic/etc/quagga /etc/
    

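The sonic-cfggen invocations in startup.sh above merge multiple JSON inputs into a single config_db.json. As a rough sketch of those merge semantics (the merge function and the sample data below are illustrative, not sonic-cfggen's actual code), later inputs extend or override earlier ones:

```python
import json

def merge(base, overlay):
    """Recursively merge overlay into base: nested dicts are combined,
    while scalar and list values from overlay win. A rough approximation
    of how sonic-cfggen combines multiple -j inputs into one file."""
    out = dict(base)
    for key, value in overlay.items():
        if key in out and isinstance(out[key], dict) and isinstance(value, dict):
            out[key] = merge(out[key], value)
        else:
            out[key] = value
    return out

# Illustrative stand-ins for /etc/sonic/config_db.json and vlan_config.json.
existing = {"DEVICE_METADATA": {"localhost": {"hostname": "switch1"}}}
vlan_cfg = {"VLAN": {"Vlan10": {"members": ["Ethernet1"], "vlanid": "10"}}}

config_db = merge(existing, vlan_cfg)
print(json.dumps(config_db, indent=2, sort_keys=True))
```

The result keeps the existing device metadata and gains the VLAN table, which is why startup.sh can apply the VLAN overlay whether or not a config_db.json already exists.
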
How It Runs

Again using switch1 to analyze how SONiC-P4 runs: the switching silicon is modeled at the bottom by a P4-capable ASIC model, with the SONiC architecture as the network OS on top. A normal SONiC deployment spreads the network OS across multiple containers, but in this experiment a single container starts the whole switch, so all the SONiC framework components run together in that one container. Below we work from the processes in the container down to the P4 layer to analyze how SONiC-P4 runs.

First, note that host1, host2, switch1, and switch2 are all connected through OVS bridges, set up by the ovs commands in start.sh:

sudo ovs-vsctl add-br switch1_switch2
sudo ovs-docker add-port switch1_switch2 sw_port0 switch1
sudo ovs-docker add-port switch1_switch2 sw_port0 switch2

sudo ovs-vsctl add-br host1_switch1
sudo ovs-docker add-port host1_switch1 sw_port1 switch1
sudo ovs-docker add-port host1_switch1 eth1 host1

sudo ovs-vsctl add-br host2_switch2
sudo ovs-docker add-port host2_switch2 sw_port1 switch2
sudo ovs-docker add-port host2_switch2 eth1 host2
xxx@ubuntu:~/Downloads/sonic/p4-test$ sudo ovs-vsctl show
[sudo] password for hui: 
71e743dd-96df-4419-bfcf-c2b9fcdc582e
    Bridge "switch1_switch2"
        Port "2510a4f0883c4_l"
            Interface "2510a4f0883c4_l"
        Port "switch1_switch2"
            Interface "switch1_switch2"
                type: internal
        Port "b993f47d0d894_l"
            Interface "b993f47d0d894_l"
    Bridge "host2_switch2"
        Port "host2_switch2"
            Interface "host2_switch2"
                type: internal
        Port "fdfbe4e3a0944_l"
            Interface "fdfbe4e3a0944_l"
        Port "ddccda4c0d1d4_l"
            Interface "ddccda4c0d1d4_l"
    Bridge "host1_switch1"
        Port "b07b5c691a2d4_l"
            Interface "b07b5c691a2d4_l"
        Port "host1_switch1"
            Interface "host1_switch1"
                type: internal
        Port "7c76ce7233714_l"
            Interface "7c76ce7233714_l"
    ovs_version: "2.12.0"

Here we analyze the processes running in the switch1 container; switch2 has a corresponding setup.

Querying the process list and listening ports gives results like the following:

root@9acb9dfc3696:/# ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0  20248  2984 pts/0    Ss+  Jan27   0:00 /bin/bash
root        59  0.0  0.0  49644 18072 ?        Ss   Jan27   2:06 /usr/bin/python /usr/bin/supervisord
root        76  0.0  0.0 262900  2764 ?        Sl   Jan27   0:01 /usr/sbin/rsyslogd -n
root        82  0.1  0.0  43364  4468 ?        Sl   Jan27   3:25 /usr/bin/redis-server 127.0.0.1:6379       
root       323  1.5  0.6 1010980 203952 ?      Sl   Jan27  44:54 simple_switch -i 0@sw_port0 -i 1@sw_port1 -i 2@sw_port2 -i 3@sw_port3 -i 4@sw_port4 -i 5@sw_port5 -i 6@sw_port6 -i 7@sw_port7 -i 7@sw_port7 -i 8@sw
root       327  0.6  0.0 694416 26336 ?        Sl   Jan27  18:22 simple_switch -i 0@router_port1 -i 250@router_cpu_port --thrift-port 9091 --log-file bm_logs/router_log --log-flush --notifications-addr ipc:///tmp
root       366  0.0  0.0 294868  6320 ?        Sl   Jan27   2:34 /usr/bin/syncd -uN
root       373  0.0  0.0 182064  4888 ?        Sl   Jan27   0:22 /usr/bin/orchagent -d /var/log/swss -b 8192 -m 00:01:04:4c:49:f5
root       382  3.3  0.0  99776  3180 ?        Sl   Jan27  98:16 /usr/bin/portsyncd -p /port_config.ini
root       385  0.0  0.0  99844  3160 ?        Sl   Jan27   0:00 /usr/bin/intfsyncd
root       388  0.0  0.0  99844  3056 ?        Sl   Jan27   0:00 /usr/bin/neighsyncd
root       391  0.0  0.0 108148  3072 ?        Sl   Jan27   0:00 /usr/bin/teamsyncd
root       394  0.0  0.0 101012  3916 ?        Sl   Jan27   0:00 /usr/bin/fpmsyncd
root       397  0.0  0.0  99768  3828 ?        Sl   Jan27   0:09 /usr/bin/intfmgrd
root       400  0.0  0.0  99820  3780 ?        Sl   Jan27   0:09 /usr/bin/vlanmgrd
quagga     443  0.0  0.0  26408  2728 ?        S    Jan27   0:01 /usr/lib/quagga/zebra -A 127.0.0.1
quagga     445  0.0  0.0  31212  4172 ?        S    Jan27   0:55 /usr/lib/quagga/bgpd -A 127.0.0.1 -F
root      9707  0.0  0.0  20248  3212 pts/1    Ss   03:02   0:00 bash
root     11011  0.0  0.0  20052  2844 ?        S    09:08   0:00 bash -c /usr/bin/arp_update; sleep 300
root     11028  0.0  0.0   4240   680 ?        S    09:08   0:00 sleep 300

root@9acb9dfc3696:/# netstat -ntpl
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:179             0.0.0.0:*               LISTEN      445/bgpd        
tcp        0      0 127.0.0.1:2620          0.0.0.0:*               LISTEN      394/fpmsyncd    
tcp        0      0 127.0.0.1:2601          0.0.0.0:*               LISTEN      443/zebra       
tcp        0      0 127.0.0.1:6379          0.0.0.0:*               LISTEN      82/redis-server 127
tcp        0      0 127.0.0.1:2605          0.0.0.0:*               LISTEN      445/bgpd        
tcp6       0      0 :::179                  :::*                    LISTEN      445/bgpd        
tcp6       0      0 :::9090                 :::*                    LISTEN      323/simple_switch
tcp6       0      0 :::9091                 :::*                    LISTEN      327/simple_switch

In the SONiC-P4 container, supervisord is used for process management: both the SONiC processes and the P4 processes are started by supervisord. /usr/bin/start.sh defines the processes that supervisord manages, and in it we can see the relevant SONiC and P4 processes:

# P4 processes
supervisorctl start bm_bridge
supervisorctl start bm_router

# SONiC processes
echo "Start syncd"
supervisorctl start syncd

echo "Start orchagent"
supervisorctl start orchagent

echo "Start portsyncd"
supervisorctl start portsyncd

echo "Start intfsyncd"
supervisorctl start intfsyncd

echo "Start neighsyncd"
supervisorctl start neighsyncd

echo "Start teamsyncd"
supervisorctl start teamsyncd

echo "Start fpmsyncd"
supervisorctl start fpmsyncd

echo "Start intfmgrd"
supervisorctl start intfmgrd

echo "Start vlanmgrd"
supervisorctl start vlanmgrd

echo "Start zebra"
supervisorctl start zebra

echo "Start bgpd"
supervisorctl start bgpd

if [ -f /etc/swss/config.d/default_config.json ]; then
    swssconfig /etc/swss/config.d/default_config.json
fi

# Start arp_update when VLAN exists
VLAN=`sonic-cfggen -d -v 'VLAN.keys() | join(" ") if VLAN'`
if [ "$VLAN" != "" ]; then
    echo "Start arp_update"
    supervisorctl start arp_update
fi

Mapping the process list above to the key processes in the SONiC architecture, we can see that the processes of several SONiC containers have been merged to run in a single container:

redis-server # the Redis database process from the database container
orchagent, portsyncd, intfsyncd, neighsyncd, intfmgrd, vlanmgrd, syncd # the processes from the swss container
teamsyncd # the teamsyncd process from the teamd container
fpmsyncd, zebra, bgpd # the processes from the bgp container

(Figure: SONiC architecture)

The P4 processes:

simple_switch # the P4 behavioral-model software switch; two instances run, one acting as the router and one as the bridge

Networking

(Figure: experiment network diagram)

The experiment is split into two ASes running the BGP routing protocol, with the SONiC network OS as the upper layer and the P4 model emulating the ASIC switching silicon underneath. Below we analyze the experiment network from these two layers; as before, the examples use the switch1 container, and switch2 has the corresponding setup and configuration.

The SONiC Network OS on Top

Switch BGP neighbor state:

# switch1
root@9acb9dfc3696:/# vtysh -c "show ip bgp sum"
BGP router identifier 192.168.1.1, local AS number 10001
RIB entries 3, using 336 bytes of memory
Peers 1, using 4656 bytes of memory

Neighbor        V         AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.0.0.1        4 10002  234804  234808        0    0    0 2d17h16m        1

Total number of neighbors 1

# switch2
root@4fe94a1879fd:/# vtysh -c "show ip bgp sum"
BGP router identifier 192.168.2.1, local AS number 10002
RIB entries 3, using 336 bytes of memory
Peers 1, using 4656 bytes of memory

Neighbor        V         AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.0.0.0        4 10001  235030  235031        0    0    0 2d17h20m        1

Total number of neighbors 1

The routes generated on the switches:

# switch1
root@9acb9dfc3696:/# vtysh -c "show ip route"
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, P - PIM, A - Babel,
       > - selected route, * - FIB route

C>* 10.0.0.0/31 is directly connected, Vlan15
C>* 127.0.0.0/8 is directly connected, lo
C>* 192.168.1.0/24 is directly connected, Vlan10
B>* 192.168.2.0/24 [20/0] via 10.0.0.1, Vlan15, 2d17h27m

root@9acb9dfc3696:/# ip route
10.0.0.0/31 dev Vlan15 proto kernel scope link src 10.0.0.0 
192.168.1.0/24 dev Vlan10 proto kernel scope link src 192.168.1.1 
192.168.2.0/24 via 10.0.0.1 dev Vlan15 proto zebra


# switch2
root@4fe94a1879fd:/# vtysh -c "show ip route"
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, P - PIM, A - Babel,
       > - selected route, * - FIB route

C>* 10.0.0.0/31 is directly connected, Vlan14
C>* 127.0.0.0/8 is directly connected, lo
B>* 192.168.1.0/24 [20/0] via 10.0.0.0, Vlan14, 2d17h26m
C>* 192.168.2.0/24 is directly connected, Vlan9

root@4fe94a1879fd:/# ip route
10.0.0.0/31 dev Vlan14 proto kernel scope link src 10.0.0.1 
192.168.1.0/24 via 10.0.0.0 dev Vlan14 proto zebra 
192.168.2.0/24 dev Vlan9 proto kernel scope link src 192.168.2.1
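
The 10.0.0.0/31 inter-switch link above is a point-to-point /31 (per RFC 3021): the prefix holds exactly two assignable addresses, 10.0.0.0 on switch1 and 10.0.0.1 on switch2, with no separate network or broadcast address. Python's ipaddress module can confirm this:

```python
import ipaddress

# A /31 has only two addresses, and both are usable on a point-to-point link.
link = ipaddress.ip_network("10.0.0.0/31")
print(link.num_addresses)  # 2

# The two switch addresses both fall inside the link prefix.
print(ipaddress.ip_address("10.0.0.0") in link)  # True
print(ipaddress.ip_address("10.0.0.1") in link)  # True
```
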

Switch VLANs:

# switch1
root@9acb9dfc3696:/# show vlan config
Name      VID  Member     Mode
------  -----  ---------  --------
Vlan10     10  Ethernet1  untagged
Vlan15     15  Ethernet0  untagged

# switch2
root@4fe94a1879fd:/# show vlan config
Name      VID  Member     Mode
------  -----  ---------  --------
Vlan9       9  Ethernet1  untagged
Vlan14     14  Ethernet0  untagged

Switch VLAN interfaces:

# switch1
root@9acb9dfc3696:/# cat /proc/net/vlan/config
VLAN Dev name    | VLAN ID
Name-Type: VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD
Vlan10         | 10  | Bridge
Vlan15         | 15  | Bridge
root@9acb9dfc3696:/# ifconfig Vlan10
Vlan10    Link encap:Ethernet  HWaddr 00:01:04:4c:49:f5  
          inet addr:192.168.1.1  Bcast:0.0.0.0  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:780 errors:0 dropped:0 overruns:0 frame:0
          TX packets:780 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:21840 (21.3 KiB)  TX bytes:32760 (31.9 KiB)

root@9acb9dfc3696:/# ifconfig Vlan15
Vlan15    Link encap:Ethernet  HWaddr 00:01:04:4c:49:f5  
          inet addr:10.0.0.0  Bcast:0.0.0.0  Mask:255.255.255.254
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:471533 errors:0 dropped:0 overruns:0 frame:0
          TX packets:471529 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:35490404 (33.8 MiB)  TX bytes:42091665 (40.1 MiB)

# switch2
root@4fe94a1879fd:/# cat /proc/net/vlan/config 
VLAN Dev name    | VLAN ID
Name-Type: VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD
Vlan14         | 14  | Bridge
Vlan9          | 9  | Bridge
root@4fe94a1879fd:/# ifconfig Vlan9
Vlan9     Link encap:Ethernet  HWaddr 00:01:04:4c:49:f6  
          inet addr:192.168.2.1  Bcast:0.0.0.0  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:875 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2443 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:29484 (28.7 KiB)  TX bytes:107590 (105.0 KiB)

root@4fe94a1879fd:/# ifconfig Vlan14
Vlan14    Link encap:Ethernet  HWaddr 00:01:04:4c:49:f6  
          inet addr:10.0.0.1  Bcast:0.0.0.0  Mask:255.255.255.254
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:471765 errors:0 dropped:0 overruns:0 frame:0
          TX packets:471769 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:35508077 (33.8 MiB)  TX bytes:42112988 (40.1 MiB)

Data in the SONiC Redis database:

root@9acb9dfc3696:/# redis-cli                           
127.0.0.1:6379> keys VLAN*
1) "VLAN_MEMBER_TABLE:Vlan10:Ethernet1"
2) "VLAN_TABLE:Vlan15"
3) "VLAN_TABLE:Vlan10"
4) "VLAN_MEMBER_TABLE:Vlan15:Ethernet0"
127.0.0.1:6379> keys ROUTE_TABLE*
1) "ROUTE_TABLE:192.168.1.0/24"
2) "ROUTE_TABLE:192.168.2.0/24"
3) "ROUTE_TABLE:10.0.0.0/31"
4) "ROUTE_TABLE:127.0.0.0/8"
127.0.0.1:6379> hgetall "VLAN_TABLE:Vlan10"
1) "admin_status"
2) "up"
127.0.0.1:6379> hgetall "VLAN_MEMBER_TABLE:Vlan10:Ethernet1"
1) "tagging_mode"
2) "untagged"
127.0.0.1:6379> hgetall "ROUTE_TABLE:192.168.1.0/24"
1) "nexthop"
2) ""
3) "ifname"
4) "Vlan10"
127.0.0.1:6379> hgetall "ROUTE_TABLE:192.168.2.0/24"
1) "nexthop"
2) "10.0.0.1"
3) "ifname"
4) "Vlan15"
127.0.0.1:6379> hgetall "ROUTE_TABLE:10.0.0.0/31"
1) "nexthop"
2) ""
3) "ifname"
4) "Vlan15"
127.0.0.1:6379> 
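
The keys above follow SONiC's APPL_DB naming convention, TABLE:object, where only the first colon separates the table name (the object part may itself contain further colons, as in VLAN_MEMBER_TABLE:Vlan10:Ethernet1). A minimal sketch of splitting such keys:

```python
def parse_appl_db_key(key):
    """Split a SONiC APPL_DB key of the form TABLE:object into the
    table name and the object key. Only the first ':' is a separator,
    since the object key may itself contain ':'."""
    table, _, obj = key.partition(":")
    return table, obj

# Keys taken from the redis-cli dump above:
print(parse_appl_db_key("ROUTE_TABLE:192.168.2.0/24"))
print(parse_appl_db_key("VLAN_MEMBER_TABLE:Vlan10:Ethernet1"))
```
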

The P4-Modeled Switching Silicon Underneath

As analyzed above, the P4 model runs as two processes, one acting as the router and one as the bridge. Here we can inspect the forwarding entries in the modeled switching chip through the CLI provided by the P4 model's Thrift interface.

From /etc/supervisor/conf.d/supervisord.conf, the Thrift ports that the bridge and router listen on are the default 9090 and the configured 9091, respectively:

[program:bm_router]
command=ip netns exec sw_net simple_switch -i 0@router_port1 -i 250@router_cpu_port --thrift-port 9091 --log-file bm_logs/router_log --log-flush --notifications-addr ipc:///tmp/bmv2-router-notifications.ipc /usr/share/p4-sai-bm/sai_router.json
priority=3
autostart=false
autorestart=false
stdout_logfile=syslog
stderr_logfile=syslog

[program:bm_bridge]
command=ip netns exec sw_net simple_switch -i 0@sw_port0 -i 1@sw_port1 -i 2@sw_port2 -i 3@sw_port3 -i 4@sw_port4 -i 5@sw_port5 -i 6@sw_port6 -i 7@sw_port7 -i 7@sw_port7 -i 8@sw_port8 -i 9@sw_port9 -i 10@sw_port10 -i 11@sw_port11 -i 12@sw_port12 -i 13@sw_port13 -i 14@sw_port14 -i 15@sw_port15 -i 16@sw_port16 -i 17@sw_port17 -i 18@sw_port18 -i 19@sw_port19 -i 20@sw_port20 -i 21@sw_port21 -i 22@sw_port22 -i 23@sw_port23 -i 24@sw_port24 -i 25@sw_port25 -i 26@sw_port26 -i 27@sw_port27 -i 28@sw_port28 -i 29@sw_port29 -i 30@sw_port30 -i 31@sw_port31 -i 250@cpu_port -i 251@router_port0 --log-file bm_logs/bridge_log --log-flush /usr/share/p4-sai-bm/sai_bridge.json &
priority=3
autostart=false
autorestart=false
stdout_logfile=syslog
stderr_logfile=syslog

Connecting the CLI to port 9090 shows the VLAN-related forwarding entries in the bridge:

root@9acb9dfc3696:/# simple_switch_CLI --thrift-port 9090
Obtaining JSON from switch...
Done
Control utility for runtime P4 table manipulation
RuntimeCmd: show_tables
table_bridge_id_1d             [implementation=None, mk=ingress_metadata.bridge_port(exact, 8)]
table_bridge_id_1q             [implementation=None, mk=ingress_metadata.vid(exact, 12)]
table_bridge_loopback_filter   [implementation=None, mk=egress_metadata.bridge_port(exact, 8)]
table_broadcast                [implementation=None, mk=ingress_metadata.bridge_id(exact, 12)]
table_cpu_forward              [implementation=None, mk=cpu_header_valid(valid, 1)]
table_drop_tagged_internal     [implementation=None, mk=ingress_metadata.drop_tagged(exact, 1)]
table_drop_untagged_internal   [implementation=None, mk=ingress_metadata.drop_untagged(exact, 1)]
table_egress_br_port_to_if     [implementation=None, mk=egress_metadata.bridge_port(exact, 8)]
table_egress_clone_internal    [implementation=None, mk=standard_metadata.instance_type(exact, 32)]
table_egress_lag               [implementation=None, mk=egress_metadata.out_if(exact, 8),   egress_metadata.hash_val(exact, 6)]
table_egress_set_vlan          [implementation=None, mk=egress_metadata.bridge_port(exact, 8)]
table_egress_vbridge_STP       [implementation=None, mk=egress_metadata.bridge_port(exact, 8)]
table_egress_vlan_filtering    [implementation=None, mk=egress_metadata.bridge_port(exact, 8),  ingress_metadata.vid(exact, 12)]
table_egress_vlan_tag          [implementation=None, mk=egress_metadata.out_if(exact, 8),   ingress_metadata.vid(exact, 12),    vlan_valid(valid, 1)]
table_egress_xSTP              [implementation=None, mk=egress_metadata.bridge_port(exact, 8),  ingress_metadata.stp_id(exact, 3)]
table_fdb                      [implementation=None, mk=ethernet.dstAddr(exact, 48),    ingress_metadata.bridge_id(exact, 12)]
table_flood                    [implementation=None, mk=ingress_metadata.bridge_id(exact, 12)]
table_hostif_vlan_tag          [implementation=None, mk=cpu_header.dst(exact, 16),  vlan_valid(valid, 1)]
table_ingress_lag              [implementation=None, mk=standard_metadata.ingress_port(exact, 9)]
table_ingress_vlan_filtering   [implementation=None, mk=ingress_metadata.bridge_port(exact, 8), ingress_metadata.vid(exact, 12)]
table_l2_trap                  [implementation=None, mk=ethernet.dstAddr(exact, 48)]
table_l3_interface             [implementation=None, mk=ethernet.dstAddr(exact, 48),    ingress_metadata.bridge_id(exact, 12)]
table_lag_hash                 [implementation=None, mk=egress_metadata.out_if(exact, 8)]
table_learn_fdb                [implementation=None, mk=ethernet.srcAddr(exact, 48),    ingress_metadata.bridge_id(exact, 12)]
table_mc_fdb                   [implementation=None, mk=ethernet.dstAddr(exact, 48),    ingress_metadata.bridge_id(exact, 12)]
table_mc_l2_sg_g               [implementation=None, mk=ingress_metadata.bridge_id(exact, 12),  ipv4.srcAddr(exact, 32),    ipv4.dstAddr(exact, 32)]
table_mc_lookup_mode           [implementation=None, mk=ingress_metadata.vid(exact, 12)]
table_port_configurations      [implementation=None, mk=ingress_metadata.l2_if(exact, 8)]
table_port_ingress_interface_type [implementation=None, mk=ingress_metadata.l2_if(exact, 8)]
table_port_set_packet_vid_internal [implementation=None, mk=ingress_metadata.is_tagged(exact, 1)]
table_subport_ingress_interface_type [implementation=None, mk=ingress_metadata.l2_if(exact, 8), ingress_metadata.vid(exact, 12)]
table_trap_id                  [implementation=None, mk=ingress_metadata.trap_id(exact, 11)]
table_unknown_multicast_ipv4   [implementation=None, mk=ingress_metadata.bridge_id(exact, 12)]
table_unknown_multicast_nonip  [implementation=None, mk=ingress_metadata.bridge_id(exact, 12)]
table_vbridge_STP              [implementation=None, mk=ingress_metadata.bridge_port(exact, 8)]
table_xSTP                     [implementation=None, mk=ingress_metadata.bridge_port(exact, 8), ingress_metadata.stp_id(exact, 3)]
table_xSTP_instance            [implementation=None, mk=ingress_metadata.vid(exact, 12)]
RuntimeCmd: table_dump table_egress_vlan_tag
==========
TABLE ENTRIES
**********
Dumping entry 0x1000000
Match key:
* egress_metadata.out_if: EXACT     01
* ingress_metadata.vid  : EXACT     000a
* vlan_valid            : VALID     01
Action entry: action_forward_vlan_untag - 
**********
Dumping entry 0x1000001
Match key:
* egress_metadata.out_if: EXACT     00
* ingress_metadata.vid  : EXACT     000f
* vlan_valid            : VALID     01
Action entry: action_forward_vlan_untag - 
**********
Dumping entry 0x1000002
Match key:
* egress_metadata.out_if: EXACT     fb
* ingress_metadata.vid  : EXACT     000a
* vlan_valid            : VALID     00
Action entry: action_forward_vlan_tag - 00, 00, 0a
**********
Dumping entry 0x1000003
Match key:
* egress_metadata.out_if: EXACT     fb
* ingress_metadata.vid  : EXACT     000f
* vlan_valid            : VALID     00
Action entry: action_forward_vlan_tag - 00, 00, 0f
**********
Dumping entry 0x20
Match key:
* egress_metadata.out_if: EXACT     fb
* ingress_metadata.vid  : EXACT     0001
* vlan_valid            : VALID     00
Action entry: action_forward_vlan_tag - 00, 00, 01
==========
Dumping default entry
EMPTY
==========
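
The CLI prints match keys in hex. In the table_egress_vlan_tag dump above, the ingress_metadata.vid values 000a and 000f decode to VLAN IDs 10 and 15, matching Vlan10 and Vlan15 from the configuration:

```python
def decode_vid(hex_vid):
    """Decode a 12-bit VLAN ID printed as hex by simple_switch_CLI."""
    return int(hex_vid, 16)

# The two VIDs appearing in the table dump above:
print(decode_vid("000a"))  # 10 -> Vlan10
print(decode_vid("000f"))  # 15 -> Vlan15
```
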

Connecting the CLI to port 9091 shows the route forwarding entries generated by BGP in the router:

root@9acb9dfc3696:/# simple_switch_CLI --thrift-port 9091
Obtaining JSON from switch...
Done
Control utility for runtime P4 table manipulation
RuntimeCmd: show_tables
table_cpu_forward              [implementation=None, mk=cpu_header_valid(valid, 1)]
table_egress_L3_if             [implementation=None, mk=router_metadata.egress_rif(exact, 8)]
table_egress_clone_internal    [implementation=None, mk=standard_metadata.instance_type(exact, 32)]
table_ingress_l3_if            [implementation=None, mk=standard_metadata.ingress_port(exact, 9),   vlan.vid(exact, 12)]
table_ingress_vrf              [implementation=None, mk=router_metadata.ingress_rif(exact, 8)]
table_ip2me_trap               [implementation=None, mk=l4_metadata.srcPort(exact, 16), l4_metadata.dstPort(exact, 16), ipv4.protocol(exact, 8)]
table_l3_trap_id               [implementation=None, mk=ingress_metadata.trap_id(exact, 11)]
table_neighbor                 [implementation=None, mk=router_metadata.egress_rif(exact, 8),   router_metadata.next_hop_dst_ip(exact, 32)]
table_next_hop                 [implementation=None, mk=router_metadata.next_hop_id(exact, 8)]
table_next_hop_group           [implementation=None, mk=router_metadata.next_hop_group_id(exact, 3)]
table_pre_l3_trap              [implementation=None, mk=vlan.etherType(ternary, 16),    ipv4.dstAddr(lpm, 32),  arp_ipv4.opcode(ternary, 16)]
table_router                   [implementation=None, mk=router_metadata.ingress_vrf(exact, 8),  ipv4.dstAddr(lpm, 32)]
table_ttl                      [implementation=None, mk=ipv4.ttl(exact, 8)]
RuntimeCmd: table_dump table_router
==========
TABLE ENTRIES
**********
Dumping entry 0x0
Match key:
* router_metadata.ingress_vrf: EXACT     00
* ipv4.dstAddr               : LPM       00000000/0
Action entry: _drop - 
**********
Dumping entry 0x1
Match key:
* router_metadata.ingress_vrf: EXACT     00
* ipv4.dstAddr               : LPM       c0a80100/24
Action entry: action_set_erif_set_nh_dstip_from_pkt - 02
**********
Dumping entry 0x2
Match key:
* router_metadata.ingress_vrf: EXACT     00
* ipv4.dstAddr               : LPM       c0a80101/32
Action entry: action_set_ip2me - 
**********
Dumping entry 0x3
Match key:
* router_metadata.ingress_vrf: EXACT     00
* ipv4.dstAddr               : LPM       0a000000/31
Action entry: action_set_erif_set_nh_dstip_from_pkt - 03
**********
Dumping entry 0x4
Match key:
* router_metadata.ingress_vrf: EXACT     00
* ipv4.dstAddr               : LPM       0a000000/32
Action entry: action_set_ip2me - 
**********
Dumping entry 0x5
Match key:
* router_metadata.ingress_vrf: EXACT     00
* ipv4.dstAddr               : LPM       c0a80200/24
Action entry: action_set_nhop_id - 00
==========
Dumping default entry
Action entry: _drop - 
==========
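
The LPM match keys in this dump are also hex-encoded. Decoding them shows they are exactly the routes seen earlier on switch1 (192.168.1.0/24 connected, 192.168.2.0/24 learned via BGP, and the 10.0.0.0/31 link):

```python
import ipaddress

def decode_lpm(hex_key):
    """Convert an LPM match key as printed by simple_switch_CLI
    (e.g. 'c0a80200/24') back into a dotted-quad prefix."""
    value, prefix_len = hex_key.split("/")
    addr = ipaddress.ip_address(int(value, 16))
    return f"{addr}/{prefix_len}"

# Entries from the table_router dump above:
print(decode_lpm("c0a80100/24"))  # 192.168.1.0/24
print(decode_lpm("c0a80200/24"))  # 192.168.2.0/24
print(decode_lpm("0a000000/31"))  # 10.0.0.0/31
```
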

From the forwarding entries in the bridge and the router, we can see that the VLAN configuration and the BGP-generated routes from earlier in the experiment have been pushed down as corresponding table entries into the P4-modeled ASIC. For the details of the router and bridge P4 data-plane implementations, see the P4 source below.

https://github.com/Mellanox/SAI-P4-BM/blob/master/p4-switch/sai-p4-bm/p4src

References

https://github.com/Azure/SONiC/wiki/SONiC-P4-Software-Switch

https://github.com/Azure/SONiC/wiki/Architecture?spm=a2c6h.12873639.0.0.980036b0oeCGJJ

https://github.com/Mellanox/SAI-P4-BM

https://hub.docker.com/r/alihasanahmedkhan/docker-sonic-p4

https://zhuanlan.zhihu.com/p/99887221

4 Comments

  1. 谭立状 Oct 5, 2022

    Very nice.

  2. Zhang Oct 8, 2022

    Hi, when I ran ./start.sh during the experiment, I got a "connect: Network is unreachable" message. How can this be resolved? I would really appreciate any help!

    • Zhang Oct 8, 2022

      Sorry, my earlier description was wrong: the error actually appears when running ./test.sh. Just correcting that.

      • martrix Dec 7, 2022

        test.sh simply enters the host containers and runs ping. Check whether the containers are running properly, and try entering a container manually and pinging from inside.
