
1. Background

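A 10Gb Ethernet card is installed in the Xavier's PCIe slot. When the data rate exceeds 1 Gb/s, a large number of packets are dropped. netstat confirms that the packets are dropped at the interface. The system appears to identify the card as 10Gb, yet it bottlenecks at about 1 Gb/s.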

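SDK version: JetPack 4.1
Software: the benchmark I/O routines supplied by the radio manufacturer
Hardware: an Intel X520-DA2 card; the ethtool output for both of its interfaces reports firmware version 0x61c10001 and driver version ixgbe 4.6.4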

2. Node debugging

ifconfig output:

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 128.112.3.3 netmask 255.255.0.0 broadcast 128.112.255.255
        inet6 fe80::f97f:4b79:ec64:cea7 prefixlen 64 scopeid 0x20
        ether 00:04:4b:cb:9b:a5 txqueuelen 1000 (Ethernet)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 59 bytes 6200 (6.2 KB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
        device interrupt 40

eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 9000
        inet 192.168.30.1 netmask 255.255.255.0 broadcast 192.168.30.255
        inet6 fe80::1e3:fc5f:89f3:c358 prefixlen 64 scopeid 0x20
        ether 90:e2:ba:f2:1c:18 txqueuelen 1000 (Ethernet)
        RX packets 13 bytes 780 (780.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 59 bytes 6266 (6.2 KB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 9000
        inet 192.168.40.1 netmask 255.255.255.0 broadcast 192.168.40.255
        inet6 fe80::79dd:b1f:cdcb:4036 prefixlen 64 scopeid 0x20
        ether 90:e2:ba:f2:1c:19 txqueuelen 1000 (Ethernet)
        RX packets 13 bytes 780 (780.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 56 bytes 6032 (6.0 KB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

l4tbr0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
        inet 192.168.55.1 netmask 255.255.255.0 broadcast 192.168.55.255
        inet6 fe80::1 prefixlen 128 scopeid 0x20
        inet6 fe80::5ce0:5eff:fe90:c37 prefixlen 64 scopeid 0x20
        ether 52:ea:3b:35:a5:d6 txqueuelen 1000 (Ethernet)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 6 bytes 534 (534.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0
        inet6 ::1 prefixlen 128 scopeid 0x10
        loop txqueuelen 1 (Local Loopback)
        RX packets 627 bytes 39815 (39.8 KB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 627 bytes 39815 (39.8 KB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

rndis0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
        ether a2:ea:33:5c:f6:29 txqueuelen 1000 (Ethernet)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

usb0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
        ether 52:ea:3b:35:a5:d6 txqueuelen 1000 (Ethernet)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

route output:

Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
128.112.0.0     0.0.0.0         255.255.0.0     U     102    0        0 eth0
link-local      0.0.0.0         255.255.0.0     U     1000   0        0 l4tbr0
192.168.30.0    0.0.0.0         255.255.255.0   U     100    0        0 eth1
192.168.40.0    0.0.0.0         255.255.255.0   U     101    0        0 eth2
192.168.55.0    0.0.0.0         255.255.255.0   U     0      0        0 l4tbr0

ethtool eth1 output:

Settings for eth1:
        Supported ports: [ FIBRE ]
        Supported link modes: 10000baseT/Full
        Supported pause frame use: Symmetric
        Supports auto-negotiation: No
        Advertised link modes: 10000baseT/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: No
        Speed: 10000Mb/s
        Duplex: Full
        Port: Direct Attach Copper
        PHYAD: 0
        Transceiver: external
        Auto-negotiation: off
        Current message level: 0x00000007 (7)
                               drv probe link
        Link detected: yes

3. Disabling the USB device mode bridge

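The "192.168.55.0" bridge can be removed to simplify the setup. This bridge is actually part of the USB gadget mode sample code. Looking in "/opt/nvidia/l4t-usb-device-mode/" shows how the USB port is used to emulate a mass storage device and an Ethernet card. It can be disabled without harm, and on most systems it probably should be. The following command shows the two files that activate it at boot: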

ls -l `find /etc/systemd -type l` | grep opt

Remove these two symbolic links:

sudo rm /etc/systemd/system/multi-user.target.wants/nv-l4t-usb-device-mode.service
sudo rm /etc/systemd/system/nv-l4t-usb-device-mode.service

Then reboot; the USB device mode bridge will no longer start.

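With USB gadget mode removed, rerun the performance test above for a longer time and with more traffic, then capture the ifconfig output for eth1 and eth2 again; more traffic gives a better indication. Also try the same test through a network switch instead of a direct connection.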
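One way to make the repeated measurement concrete is to sample the kernel's drop counter over a fixed window while the benchmark is running. The sketch below assumes that /sys/class/net/<iface>/statistics/rx_dropped tracks the same drops that ifconfig reports; the function names and the SYSFS_NET override are illustrative and not part of any existing tool.

```shell
#!/bin/sh
# Root of the per-interface sysfs tree. It is overridable only so the
# functions can be exercised against a fake directory (an assumption
# made for testing, not something the original setup used).
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# read_counter IFACE NAME: print one statistics counter, e.g. rx_dropped.
read_counter() {
    cat "$SYSFS_NET/$1/statistics/$2"
}

# drop_delta IFACE SECONDS: print how many received packets were dropped
# on IFACE during the sampling window.
drop_delta() {
    before=$(read_counter "$1" rx_dropped)
    sleep "$2"
    after=$(read_counter "$1" rx_dropped)
    echo $((after - before))
}

# Example, to be run while the benchmark is generating traffic:
#   drop_delta eth1 10
```

A window of ten seconds or more at the full data rate should give a steadier figure than a single ifconfig snapshot.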

After disabling the USB device mode bridge as described above, there was still no improvement:

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 128.112.3.3 netmask 255.255.0.0 broadcast 128.112.255.255
        inet6 fe80::f97f:4b79:ec64:cea7 prefixlen 64 scopeid 0x20
        ether 00:04:4b:cb:9b:a5 txqueuelen 1000 (Ethernet)
        RX packets 955 bytes 81191 (81.1 KB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 835 bytes 79910 (79.9 KB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
        device interrupt 40

eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 9000
        inet 192.168.30.1 netmask 255.255.255.0 broadcast 192.168.30.255
        inet6 fe80::1e3:fc5f:89f3:c358 prefixlen 64 scopeid 0x20
        ether 90:e2:ba:f2:1c:18 txqueuelen 1000 (Ethernet)
        RX packets 3189675 bytes 25139723250 (25.1 GB)
        RX errors 0 dropped 139847 overruns 0 frame 0
        TX packets 63443 bytes 5196484 (5.1 MB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 9000
        inet 192.168.40.1 netmask 255.255.255.0 broadcast 192.168.40.255
        inet6 fe80::79dd:b1f:cdcb:4036 prefixlen 64 scopeid 0x20
        ether 90:e2:ba:f2:1c:19 txqueuelen 1000 (Ethernet)
        RX packets 3286221 bytes 25114496876 (25.1 GB)
        RX errors 0 dropped 143452 overruns 0 frame 0
        TX packets 163941 bytes 11025129 (11.0 MB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0
        inet6 ::1 prefixlen 128 scopeid 0x10
        loop txqueuelen 1 (Local Loopback)
        RX packets 10208 bytes 627832 (627.8 KB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 10208 bytes 627832 (627.8 KB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
4. Analysis

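eth0: operating normally.
eth1: a large number of RX packets are dropped. The cause could be on the Ethernet side or at the endpoint. It is probably not a hardware problem: there are no overruns, frame errors, or collisions, so a collision is unlikely, and it is definitely not a bad reaction to interference from another network device.
eth2: same as eth1.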
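To put the drop counts in proportion, the ratio of dropped packets to RX packets can be computed from the ifconfig figures above. This is a small sketch; drop_pct is a hypothetical helper name, and it assumes the simple ratio dropped / RX packets is the figure of interest.

```shell
# drop_pct DROPPED RX_PACKETS: print DROPPED as a percentage of RX_PACKETS,
# rounded to two decimal places.
drop_pct() {
    awk -v d="$1" -v p="$2" 'BEGIN { printf "%.2f\n", 100 * d / p }'
}

drop_pct 139847 3189675   # eth1, prints 4.38
drop_pct 143452 3286221   # eth2, prints 4.37
```

Both ports lose roughly the same share of packets (a little over 4%), which is consistent with the observation that eth2 behaves the same as eth1.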

5. Checking the 10GbE interface interrupts

ifconfig eth0

device interrupt 40

On the other hand, I also saw two other entries:

42: … 2490000.ether_qos.rx0
43: … 2490000.ether_qos.tx0

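The interrupt listed by "ifconfig" is probably specific to that hardware, and tx/rx probably correspond to the data source and sink. 2490000 would be the address of the controller hardware.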

sudo find /sys -name '*2490000*'

This finds everything related to the controller. It turned up ether_qos:

/sys/kernel/iommu_groups/4/devices/2490000.ether_qos
/sys/bus/platform/devices/2490000.ether_qos

Running sudo find /sys -name eth0 gives further confirmation:

/sys/devices/2490000.ether_qos/net/eth0
/sys/class/net/eth0

sudo -s
cd /sys/class/net
ls -l eth1
# The name of the file pointed to should have some sort of identifier for the driver,
# e.g., in the case of my eth0 I see the controller address concatenated with "ether_qos".
# So I look for "ether_qos" in "/proc/interrupts":
egrep ether_qos /proc/interrupts

40: 8696 0 0 0 0 0 0 0 GICv2 226 Level ether_qos.common_irq
42: 3737 0 0 0 0 0 0 0 GICv2 222 Level 2490000.ether_qos.rx0
43: 2114 0 0 0 0 0 0 0 GICv2 218 Level 2490000.ether_qos.tx0

These queries show that the interrupts are also normal.
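The same kind of check can be summarized per interrupt line, which also shows how many CPU cores actually serviced each one. The following is a sketch only: irq_spread is a hypothetical helper, and it assumes the usual /proc/interrupts layout in which the per-CPU counts are the run of purely numeric fields right after the IRQ label.

```shell
# irq_spread PATTERN: read /proc/interrupts-style text on stdin and, for
# each line matching PATTERN, print the IRQ label, the total count, the
# number of CPUs that serviced at least one interrupt, and the name.
irq_spread() {
    awk -v pat="$1" '
        $0 ~ pat {
            total = 0; busy = 0
            # Per-CPU counts: numeric fields that follow the "NN:" label.
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) {
                total += $i
                if ($i > 0) busy++
            }
            printf "%s total=%d cpus_used=%d %s\n", $1, total, busy, $NF
        }'
}

# Example for the built-in controller:
#   irq_spread ether_qos < /proc/interrupts
```

For the 10GbE ports the pattern would have to match whatever names the card's driver registers. Interrupt names based on the interface name, such as eth1, are a common convention, but that is an assumption to verify against the actual /proc/interrupts listing.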

6. Trying an Ubuntu upgrade

After upgrading the operating system to Ubuntu 18, all of the problems above were resolved. The reason is not yet clear.
