Understanding IP over InfiniBand (IPoIB) and Configuring Connection Modes

Infiniband

Cadet
Joined
Jun 8, 2023
Messages
1
InfiniBand (IB) communication does not use IP by default. However, IP over InfiniBand (IPoIB) provides an IP network emulation layer over InfiniBand Remote Direct Memory Access (RDMA) networks. This allows existing, unmodified applications to transmit data over InfiniBand networks, but with lower performance than when using RDMA natively.

Internet Wide Area RDMA Protocol (iWARP) and RDMA over Converged Ethernet (RoCE) networks are already IP-based, so IPoIB devices cannot be created on top of iWARP or RoCE devices. Mellanox devices from ConnectX-4 onward use Enhanced IPoIB mode by default, which supports Datagram mode only. These devices do not support Connected mode.

An IPoIB device is configured in either Datagram or Connected mode. The difference is the type of queue pair the IPoIB layer attempts to open with the machine at the other end of the communication:

In Datagram mode, the system opens an unreliable, disconnected queue pair (the Unreliable Datagram transport). This mode does not support packets larger than the maximum transmission unit (MTU) of the InfiniBand link layer. The IPoIB layer adds a 4-byte IPoIB header on top of the IP packet being transmitted, so the IPoIB MTU must be 4 bytes smaller than the InfiniBand link layer MTU. Because 2048 is a common InfiniBand link layer MTU, the common IPoIB device MTU in Datagram mode is 2044.
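To make the MTU arithmetic above concrete, here is a minimal Python sketch. The function and constant names are made up for illustration; only the 4-byte header size and the 2048 example come from the text, and 4096 is included as another link MTU you may see on some fabrics.

```python
# Size of the IPoIB encapsulation header, as described in the post.
IPOIB_HEADER_BYTES = 4


def datagram_ipoib_mtu(ib_link_mtu: int) -> int:
    """Return the largest IPoIB MTU usable in Datagram mode.

    The IPoIB header travels inside the InfiniBand link-layer packet,
    so it is subtracted from the link-layer MTU.
    """
    if ib_link_mtu <= IPOIB_HEADER_BYTES:
        raise ValueError("link MTU must be larger than the IPoIB header")
    return ib_link_mtu - IPOIB_HEADER_BYTES


# Illustrative link MTUs; 2048 is the value used in the post.
for link_mtu in (2048, 4096):
    print(link_mtu, "->", datagram_ipoib_mtu(link_mtu))
```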

In Connected mode, the system opens a reliable, connected queue pair. This mode allows messages larger than the InfiniBand link layer MTU, and the host adapter handles packet segmentation and reassembly. As a result, the InfiniBand adapter itself imposes no limit on the size of IPoIB messages in Connected mode. However, IP packets are still limited by the 16-bit length field in the IP header, which must also cover the TCP/IP headers. The maximum IPoIB MTU in Connected mode is therefore 65520 bytes.

Connected mode has higher performance but consumes more kernel memory.
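If you want to check which mode an interface is currently using, the Linux IPoIB driver exposes it through sysfs. The sketch below is only an illustration under a few assumptions: a Linux host, an IPoIB interface named ib0 (a placeholder, yours may differ), and a driver that provides the mode file. The function name and the sysfs_root parameter are my own and exist mainly so the code can be tried against a test directory.

```python
from pathlib import Path


def read_ipoib_settings(ifname: str, sysfs_root: str = "/sys/class/net") -> dict:
    """Read the IPoIB mode and MTU of an interface from sysfs.

    Assumes <sysfs_root>/<ifname>/mode holds "datagram" or "connected"
    and <sysfs_root>/<ifname>/mtu holds the MTU as a decimal number.
    """
    base = Path(sysfs_root) / ifname
    mode = (base / "mode").read_text().strip()
    mtu = int((base / "mtu").read_text().strip())
    return {"interface": ifname, "mode": mode, "mtu": mtu}


if __name__ == "__main__":
    try:
        # "ib0" is a placeholder interface name.
        print(read_ipoib_settings("ib0"))
    except OSError as exc:
        print(f"could not read IPoIB settings: {exc}")
```

On a machine without an IPoIB interface this just reports that the files could not be read.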

Even if the system is configured to use Connected mode, it still sends multicast traffic in Datagram mode, because InfiniBand switches and the fabric cannot pass multicast traffic in Connected mode. Additionally, the system falls back to Datagram mode when communicating with any host that is not configured for Connected mode.
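The fallback rules in this paragraph can be summarized as a small decision function. This is a toy model of the behavior described here, not the driver's actual logic, and the function name and string values are invented for the example.

```python
def effective_mode(local_mode: str, peer_mode: str, is_multicast: bool) -> str:
    """Return the mode a given transmission would use, per the rules above.

    - Multicast always goes out in Datagram mode.
    - Unicast uses Connected mode only when both ends are configured for it.
    """
    if is_multicast:
        return "datagram"
    if local_mode == "connected" and peer_mode == "connected":
        return "connected"
    return "datagram"


# A Connected-mode host talking to different kinds of destinations.
print(effective_mode("connected", "connected", is_multicast=False))
print(effective_mode("connected", "datagram", is_multicast=False))
print(effective_mode("connected", "connected", is_multicast=True))
```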

When running an application that sends multicast data in packets as large as the interface MTU, you must either configure the interface in Datagram mode or configure the application to cap its send size so that each packet fits within a Datagram-mode packet.
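As a rough illustration of capping the send size in the application, the sketch below splits a buffer into chunks that fit in a Datagram-mode packet. It assumes the multicast traffic is UDP over IPv4 with no IP options (20-byte IP header, 8-byte UDP header) and uses the 2044-byte Datagram MTU from earlier in the post. The helper names are hypothetical.

```python
from typing import List

IPV4_HEADER_BYTES = 20  # IPv4 header without options (assumption)
UDP_HEADER_BYTES = 8    # fixed UDP header size


def max_udp_payload(ipoib_mtu: int) -> int:
    """Largest UDP payload that fits in one IP packet of the given MTU."""
    return ipoib_mtu - IPV4_HEADER_BYTES - UDP_HEADER_BYTES


def split_payload(data: bytes, ipoib_mtu: int = 2044) -> List[bytes]:
    """Split data into chunks no larger than the per-packet payload limit."""
    limit = max_udp_payload(ipoib_mtu)
    if limit <= 0:
        raise ValueError("MTU too small to carry any UDP payload")
    return [data[i:i + limit] for i in range(0, len(data), limit)]


chunks = split_payload(b"x" * 5000)
print(max_udp_payload(2044), [len(c) for c in chunks])
```

Each chunk would then be passed to a separate send call on the multicast socket, so no single datagram exceeds what Datagram mode can carry.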
 
Last edited by a moderator:

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
While there's some moderately interesting information there, it's unsolicited and contains a link to a site that's likely to have some kind of sales-pitch.

With moderator hat on, we'll be watching this user carefully for their next posts.

Click links with caution.