InfiniBand
- Infiniband
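- The difference between InfiniBand and an ordinary network card - iT 邦幫忙
- NVIDIA releases 400G InfiniBand network adapters, formerly Mellanox | XFastest News
- Home 10-gigabit networking guide 1 - why not start with the simplest 100G network - Zhihu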
- users@lists.openhpc.community | Infiniband over slurm
- Infiniband - HackMD
- InfiniBand Command Examples
- openibd and opensm
- Red Hat Enterprise Linux 8 Configuring InfiniBand and RDMA networks
- Commands for InfiniBand Diagnostics - UFM-SDN Appliance UM v4.4.0 - NVIDIA Networking Docs
- Diagnostic Tools - UFM-SDN Appliance UM v4.4.0 - NVIDIA Networking Docs
- InfiniBand Fabric Utilities - MLNX_OFED v5.0-2.1.8.0 - NVIDIA Networking Docs
- GitHub - linux-rdma/perftest: Infiniband Verbs Performance Tests
- ib_send_bw (perftest) get ERROR: Couldn't allocate MR failed to create mr · Issue #23 · zrlio/softiwarp · GitHub
- Driver
- Firmware
- performance
- switch
InfiniBand refers to two distinct things:
- The physical link-layer protocol for InfiniBand networks
- The InfiniBand Verbs API, an implementation of the remote direct memory access (RDMA) technology

- [Day25] Device Plugin - RDMA
RDMA provides direct access between the main memory of two computers without involving an operating system, cache, or storage. With RDMA, data transfers achieve high throughput and low latency with low CPU utilization.
Troubleshoot
- ibqueryerrors(8) - Linux man page
- perfquery(8): query InfiniBand port counters - Linux man page
- Overview of Error Counters - OpenFabrics Alliance Wiki
To troubleshoot InfiniBand network issues on Mellanox hardware, follow these steps:
- Verify the Mellanox InfiniBand network configuration:
  - Run `ibstatus` to check the status of the local InfiniBand devices and ports.
  - Use `ibhosts` to list the hosts in the fabric together with their GUIDs (Globally Unique Identifiers).
  - Run `ibnetdiscover` to discover the topology of the fabric and identify any connectivity issues.
- Check the link status and performance:
  - Use `ibstatus` to check the link status of the InfiniBand adapters.
  - Run `ibcheckstate -l` to check the state of the links in the fabric.
  - Use `iblinkinfo` to gather information about the state and quality of the links in the fabric.
  - Run `ibdiagnet` to perform a comprehensive diagnostic test on the fabric, including link and cable testing.
- Analyze performance and latency:
  - Use `ibperf` to measure the throughput and latency of the fabric.
  - Run `ib_read_bw` and `ib_write_bw` to measure the read and write bandwidth respectively.
  - Use `ib_read_lat` and `ib_write_lat` to measure the read and write latency respectively.
  - Run `perfquery` to obtain performance counters from the InfiniBand adapters.
- Diagnose errors and issues:
  - Use `ibcheckerrors` to check for InfiniBand-specific errors and correctable errors.
  - Run `ibdiagnet` to perform a comprehensive diagnostic test and identify potential issues within the fabric.
  - Use `ibstat` and `iblinkinfo` to check for link errors or problems with the InfiniBand adapters.

`ibcheckstate` and `ibcheckerrors` are older scripts that recent `infiniband-diags` releases may no longer ship. If they are missing, `iblinkinfo` and `ibqueryerrors` cover the same checks.
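The checks above can be collected into a single pass. The following is a minimal sketch, assuming the tools come from the `infiniband-diags` package and that some of them may be absent on a given host. The helper names `run_if_present` and `ib_sweep` are hypothetical, and the fabric-wide queries usually need root.

```bash
# Run a diagnostic only if its binary is installed; otherwise report a skip.
# run_if_present is a hypothetical helper name, not part of any IB package.
run_if_present() {
    tool="$1"
    if command -v "$tool" >/dev/null 2>&1; then
        echo "== $* =="
        "$@"
    else
        echo "skip: $tool not found"
        return 127
    fi
}

# One pass over tools named in the steps above (run as root on a real fabric).
# A missing tool or a failing check does not stop the remaining checks.
ib_sweep() {
    for tool in ibstatus ibhosts ibnetdiscover iblinkinfo perfquery; do
        run_if_present "$tool" || true
    done
}
```

After sourcing the file, `sudo bash -c '. ./ib_sweep.sh; ib_sweep'` would run the sweep; the file name here is only an example.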
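``` bash
# state of the local HCAs and ports
ibstat
# list all nodes and all switches seen in the fabric
sudo ibnodes
sudo ibswitches
# dump the fabric topology
sudo ibnetdiscover
# show state, width, and speed of every link
sudo iblinkinfo
# aggregated counters for all ports of the node at LID 14
sudo perfquery -a 14
# ports with errors, with port information and data counters
sudo ibqueryerrors --report-port --data
# data counters only
sudo ibqueryerrors --counters
# current subnet manager
sudo sminfo
```

``` bash
# list RDMA devices and show the details of mlx5_0
ibv_devices
ibv_devinfo -d mlx5_0
# status of the local devices and ports
ibstatus
ibstat
ibstat mlx5_0
# error and data counters across the fabric
sudo ibqueryerrors
sudo ibqueryerrors --counters
sudo ibnetdiscover
sudo ibnodes
sudo perfquery -a 14
# node info of LID 14, then port info of port 1 on LID 14
sudo smpquery nodeinfo 14
sudo smpquery portinfo 14 1
# needs a destination and a port number as arguments
ibportstate
sudo ibswitches
```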
command not found
Test bandwidth: the first command runs on the server and the second on the client. On the client, specify the server.
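As an illustration of the server/client pairing, here is a minimal sketch using `ib_write_bw` from perftest. The device name `mlx5_0` is taken from the commands earlier in these notes; the server address `192.0.2.10` is a placeholder, and `perftest_cmd` is a hypothetical helper that only prints the command to run on each side.

```bash
# Assumed values: adjust the device name and the server address to your hosts.
ib_device=mlx5_0
server_host=192.0.2.10   # placeholder address, not a real host

# Print the command for each role. The server side takes no host argument;
# the client side names the server as its last argument.
perftest_cmd() {
    role="$1"
    case "$role" in
        server) echo "ib_write_bw -d ${ib_device}" ;;
        client) echo "ib_write_bw -d ${ib_device} ${server_host}" ;;
        *) echo "usage: perftest_cmd server|client" >&2; return 1 ;;
    esac
}

perftest_cmd server   # start this on the first machine
perftest_cmd client   # then start this on the second machine
```

The server must be started first, since the client connects to it as soon as it is launched. The other perftest tools (`ib_read_bw`, `ib_write_lat`, and so on) follow the same pattern.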
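Check the current NIC driver version and firmware version
``` bash
interface_name=ib0
# "version:" also matches firmware-version and expansion-rom-version, so anchor the pattern
driver_version=$(ethtool -i "${interface_name}" | grep '^version:' | awk '{print $2}')
firmware_version=$(ethtool -i "${interface_name}" | grep '^firmware-version:' | awk '{print $2}')
echo "driver: ${driver_version}, firmware: ${firmware_version}"
```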
Install the NIC driver
``` bash
# install the nic driver
os=LINUX
vendor=MLNX_OFED
install_driver_version=5.4-3.6.8.1
install_driver_version_full=${vendor}_${os}-${install_driver_version}
# read ID and VERSION_ID directly; grepping "ID=" can match VERSION_ID first on some distributions
distribution_name=$(. /etc/os-release && echo "${ID}")
distribution_version=$(. /etc/os-release && echo "${VERSION_ID}")
distribution=${distribution_name}${distribution_version}
platform=$(uname -m)
package=${install_driver_version_full}-${distribution}-${platform}
wget "https://content.mellanox.com/ofed/${vendor}-${install_driver_version}/${package}.tgz"
tar zxvf "${package}.tgz" && cd "${package}/"
sudo ./mlnxofedinstall --force --with-nfsrdma --add-kernel-support
# To load the new driver, run:
sudo /etc/init.d/openibd restart
```

Install the NIC firmware
``` bash
# install the nic firmware
pci_bus_id=81:00.0
install_firmware_version=16_25_1020
ordering_part_numbers=MCX516A-CCA
firmware_name=fw-ConnectX5-rel-${install_firmware_version}-${ordering_part_numbers}_Ax-UEFI-14.18.19-FlexBoot-3.5.701.bin
# download and unpack the firmware image (-N has no effect together with -O, so it is dropped)
wget -q -O "${firmware_name}.zip" "http://www.mellanox.com/downloads/firmware/${firmware_name}.zip"
unzip -o "${firmware_name}.zip"
# burn the image to the adapter
sudo mstflint -d "${pci_bus_id}" -i "${firmware_name}" -y b
# reset fw
sudo mstfwreset -d "${pci_bus_id}" -y r
```
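After the reset it is worth confirming that the adapter reports the version that was burned. The following is a minimal sketch, assuming `mstflint -d <device> q` prints a line of the form `FW Version: 16.25.1020`; the helper names `fw_dotted` and `fw_version_from_query` are hypothetical.

```bash
# Convert the underscore form used in the firmware file name (16_25_1020)
# to the dotted form that version queries report (16.25.1020).
fw_dotted() {
    echo "$1" | tr '_' '.'
}

# Read query output on stdin and print the firmware version.
# Assumes a line shaped like "FW Version:            16.25.1020".
fw_version_from_query() {
    awk -F ':' '/^FW Version/ {gsub(/[ \t]/, "", $2); print $2; exit}'
}

# Usage on a real host (requires the adapter and root):
#   running=$(sudo mstflint -d 81:00.0 q | fw_version_from_query)
#   [ "$running" = "$(fw_dotted 16_25_1020)" ] && echo "firmware matches"
```

Keeping the parsing in a function that reads stdin means it can be checked against saved query output without touching the hardware.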
MTU
The following steps change the MTU of the InfiniBand network. Documentation: https://linux.die.net/man/8/opensm, https://docs.nvidia.com/networking/display/MLNXOFEDv461000/OpenSM
Modify the OpenSM partitions configuration file. This only needs to be done on the node that runs `opensm`.
Change the value of `mtu` from 4 (2048 bytes) to 5 (4096 bytes).
``` bash
$ sudo vim /etc/rdma/partitions.conf
# mtu =
# 1 = 256
# 2 = 512
# 3 = 1024
# 4 = 2048
# 5 = 4096
#
# rate =
# 2 = 2.5 GBit/s (SDR 1x)
# 3 = 10 GBit/s (SDR 4x/QDR 1x)
# 4 = 30 GBit/s (SDR 12x)
# 5 = 5 GBit/s (DDR 1x)
# 6 = 20 GBit/s (DDR 4x)
# 7 = 40 GBit/s (QDR 4x)
# 8 = 60 GBit/s (DDR 12x)
# 9 = 80 GBit/s (QDR 8x)
# 10 = 120 GBit/s (QDR 12x)
# If ExtendedLinkSpeeds are supported, then these rate values are valid too
# 11 = 14 GBit/s (FDR 1x)
# 12 = 56 GBit/s (FDR 4x)
# 13 = 112 GBit/s (FDR 8x)
# 14 = 168 GBit/s (FDR 12x)
# 15 = 25 GBit/s (EDR 1x)
# 16 = 100 GBit/s (EDR 4x)
# 17 = 200 GBit/s (EDR 8x)
# 18 = 300 GBit/s (EDR 12x)
Default=0x7fff, rate=3, mtu=5, scope=2, defmember=full:
        ALL, ALL_SWITCHES=full;
Default=0x7fff, ipoib, rate=3, mtu=5, scope=2:
# restart the opensm
$ sudo systemctl restart opensm
# ensure the opensm is working
$ sudo systemctl status opensm
# ensure the mtu becomes 4096 (an IPoIB interface in datagram mode shows 4092,
# the 4096-byte IB MTU minus the 4-byte IPoIB header)
$ ip a
```
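The `mtu` codes in `partitions.conf` follow a simple doubling rule, which makes the expected interface MTU easy to compute before checking `ip a`. The following is a minimal sketch; the function names are hypothetical, and the 4-byte deduction applies to IPoIB in datagram mode.

```bash
# Map an OpenSM partition mtu code (1..5) to the IB MTU in bytes: 256 * 2^(code-1).
mtu_code_to_bytes() {
    code="$1"
    case "$code" in
        1|2|3|4|5) echo $((256 << (code - 1))) ;;
        *) echo "unknown mtu code: $code" >&2; return 1 ;;
    esac
}

# IPoIB in datagram mode reserves 4 bytes for its encapsulation header.
ipoib_datagram_mtu() {
    ib_bytes=$(mtu_code_to_bytes "$1") || return 1
    echo $((ib_bytes - 4))
}

mtu_code_to_bytes 5      # prints 4096
ipoib_datagram_mtu 5     # prints 4092
```

On a host with an IPoIB interface, the second value can be compared with `cat /sys/class/net/ib0/mtu`, where `ib0` is the interface name used earlier in these notes.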