13.3. DPDK NetDevice¶
Data Plane Development Kit (DPDK) is a library hosted by The Linux Foundation to accelerate packet processing workloads (https://www.dpdk.org/).
The DpdkNetDevice class provides the implementation of a network device which uses DPDK’s fast packet processing abilities and bypasses the kernel. This class is included in the src/fd-net-device model. The DpdkNetDevice class inherits the FdNetDevice class and overrides the functions which are required by ns-3 to interact with the DPDK environment.
The DpdkNetDevice for ns-3 [Patel2019] was developed by Harsh Patel, Hrishikesh Hiraskar and Mohit P. Tahiliani. They were supported by Intel Technology India Pvt. Ltd., Bangalore for this work.
[Patel2019] Harsh Patel, Hrishikesh Hiraskar, Mohit P. Tahiliani, “Extending Network Emulation Support in ns-3 using DPDK”, Proceedings of the 2019 Workshop on ns-3, ACM, Pages 17-24, (https://dl.acm.org/doi/abs/10.1145/3321349.3321358)
13.3.1. Model Description¶
DpdkNetDevice is a network device which provides network emulation capabilities, i.e., it allows simulated nodes to interact with real hosts and vice versa. The main feature of the DpdkNetDevice is that it uses the Environment Abstraction Layer (EAL) provided by DPDK to perform fast packet processing. EAL hides the device-specific attributes from the applications and provides an interface through which the applications can interact directly with the Network Interface Card (NIC). This allows ns-3 to send and receive packets directly to and from the NIC without kernel involvement.
13.3.1.1. Design¶
DpdkNetDevice is designed to act as an interface between ns-3 and the DPDK environment. There are 3 main phases in the life cycle of DpdkNetDevice:
Initialization
Packet Transfer - Read and Write
Termination
13.3.1.1.1. Initialization¶
The DpdkNetDeviceHelper model is responsible for the initialization of DpdkNetDevice. After this, the EAL is initialized, a memory pool is allocated, access to the Ethernet port is obtained and the port is initialized, reception (Rx) and transmission (Tx) queues are set up on the port, and Rx and Tx buffers are set up. Finally, the LaunchCore method is called, which launches the HandleRx method to handle reading of packets in bursts.
13.3.1.1.2. Packet Transfer¶
DPDK handles packets in the form of mbufs, a data structure it provides, while ns-3 handles packets in the form of raw buffers. The packet transfer functions take care of converting between DPDK mbufs and ns-3 buffers. The functions are read and write.
Read: The HandleRx method takes care of reading the packets from the NIC and transferring them to the ns-3 Internet Stack. This function is called by the LaunchCore method, which is launched during initialization. It continuously polls the NIC using the DPDK API for packets to read. It reads the mbuf packets in bursts from the NIC Rx ring and places them into the Rx buffer. It then converts each mbuf packet in the Rx buffer to an ns-3 raw buffer and forwards the packet to the ns-3 Internet Stack.
Write: The Write method handles transmission of packets. ns-3 provides each packet in the form of a buffer, which is converted to a packet mbuf and then placed in the Tx buffer. These packets are transferred to the NIC Tx ring when the Tx buffer is full, from where they are transmitted by the NIC. However, there might not be enough packets to fill the Tx buffer, which leaves stale packet mbufs in the buffer. In such cases, the Write function schedules a manual flush of these stale packet mbufs to the NIC Tx ring, which occurs after a timeout period. The default value of this timeout is 2 ms.
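The flush-on-full and flush-on-timeout behaviour of the write path can be illustrated with a small, self-contained sketch. It does not use DPDK or ns-3: Mbuf, TxBufferSketch, Tick and the two vectors standing in for the Tx buffer and the NIC Tx ring are all hypothetical names introduced for illustration. Arming the timeout when the first packet enters an empty buffer is an assumption of this sketch, not a statement about the actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a DPDK packet mbuf; only the payload is modelled.
struct Mbuf
{
    std::vector<std::uint8_t> data;
};

// Toy model of the Tx path: packets accumulate in a Tx buffer that is flushed
// to a simulated NIC Tx ring when it is full or when a timeout expires.
class TxBufferSketch
{
  public:
    TxBufferSketch(std::size_t capacity, std::uint64_t timeoutUs)
        : m_capacity(capacity),
          m_timeoutUs(timeoutUs)
    {
        assert(capacity > 0);
    }

    // Convert a raw buffer to an mbuf and queue it; flush if the buffer is full.
    void Write(const std::uint8_t* buf, std::size_t len, std::uint64_t nowUs)
    {
        if (m_pending.empty())
        {
            // Assumption: the timeout is armed by the first packet after a flush.
            m_deadlineUs = nowUs + m_timeoutUs;
        }
        m_pending.push_back(Mbuf{std::vector<std::uint8_t>(buf, buf + len)});
        if (m_pending.size() >= m_capacity)
        {
            Flush();
        }
    }

    // Driven by the caller's clock; flushes stale mbufs once the deadline passes.
    void Tick(std::uint64_t nowUs)
    {
        if (!m_pending.empty() && nowUs >= m_deadlineUs)
        {
            Flush();
        }
    }

    std::size_t Pending() const { return m_pending.size(); }
    std::size_t Sent() const { return m_txRing.size(); }

  private:
    // Move every buffered mbuf to the simulated NIC Tx ring.
    void Flush()
    {
        for (auto& m : m_pending)
        {
            m_txRing.push_back(std::move(m));
        }
        m_pending.clear();
    }

    std::size_t m_capacity;
    std::uint64_t m_timeoutUs;
    std::uint64_t m_deadlineUs = 0;
    std::vector<Mbuf> m_pending; // stands in for the Tx buffer
    std::vector<Mbuf> m_txRing;  // stands in for the NIC Tx ring
};
```

With a capacity of 4 and a timeout of 2000 us, two packets written at time 0 stay buffered until Tick is called at or after 2000 us, whereas four packets written back to back are flushed immediately by the fourth Write.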
13.3.1.1.3. Termination¶
When ns-3 is done using DpdkNetDevice, the DpdkNetDevice will stop polling for Rx, free the allocated mbuf packets and then the mbuf pool. Lastly, it will stop the Ethernet device and close the port.
13.3.1.2. Scope and Limitations¶
The current implementation supports only one NIC bound to DPDK, with a single Rx queue and a single Tx queue on the NIC. This can be extended to support multiple NICs and multiple Rx/Tx queues simultaneously. Currently there is no support for jumbo frames, which can be added. Offloading and scheduling features can also be added. Flow control and support for qdisc can be added to provide a more extensive model for network testing.
13.3.2. DPDK Installation¶
This section contains information on downloading the DPDK source code and setting up DPDK for DpdkNetDevice to work.
13.3.2.1. Is my NIC supported by DPDK?¶
Check Supported Devices.
13.3.2.2. Not supported? Use Virtual Machine instead¶
Install Oracle VM VirtualBox. Create a new VM and install Ubuntu on it. Open the settings and create a network adapter with the following configuration:
Attached to: Bridged Adapter
Name: The host network device you want to use
- In Advanced
Adapter Type: Intel PRO/1000 MT Server (82545EM) or any other DPDK supported NIC
Promiscuous Mode: Allow All
Select Cable Connected
The remaining steps are the same as those described below.
DPDK can be installed in 2 ways:
Install DPDK on Ubuntu
Compile DPDK from source
13.3.2.3. Install DPDK on Ubuntu¶
To install DPDK on Ubuntu, run the following command:
apt-get install dpdk dpdk-dev libdpdk-dev dpdk-igb-uio-dkms
Ubuntu 20.04 packages DPDK v19.11 LTS, which has been tested with this module. DpdkNetDevice will only be enabled if this version is available.
13.3.2.4. Compile from Source¶
To compile DPDK from source, you need to perform the following 4 steps:
13.3.2.4.1. 1. Download the source¶
Visit the DPDK Downloads page to download the latest stable source. (This module has been tested with version 19.11 LTS and DpdkNetDevice will only be enabled if this version is available.)
13.3.2.4.3. 3. Install the source¶
Refer to Installation for detailed instructions.
For a 64-bit Linux machine with gcc, run:
make install T=x86_64-native-linuxapp-gcc DESTDIR=install
13.3.2.4.4. 4. Export DPDK Environment variables¶
Export the following environment variables:
RTE_SDK as your DPDK source folder.
RTE_TARGET as the build target directory.
For example:
export RTE_SDK=/home/username/dpdk/dpdk-stable-19.11.1
export RTE_TARGET=x86_64-native-linuxapp-gcc
(Note: If DPDK is moved, ns-3 needs to be reconfigured using ./ns3 configure [options])
It is advisable to export these variables in .bashrc or similar so that they are available in every session.
13.3.2.5. Load DPDK Drivers to kernel¶
Execute the following:
sudo modprobe uio_pci_generic
sudo modprobe uio
sudo modprobe vfio-pci
sudo modprobe igb_uio # for ubuntu package
# OR
sudo insmod $RTE_SDK/$RTE_TARGET/kmod/igb_uio.ko # for dpdk source
These commands must be run again every time you reboot your system.
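When DPDK is built from source, the location of igb_uio.ko in the insmod command above depends on RTE_SDK and RTE_TARGET being set. The following is a minimal sketch of how a small tool could assemble that path and notice unset variables before loading the module. IgbUioModulePath and EnvOrEmpty are hypothetical helper names introduced here and are not part of DPDK or ns-3.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Build the module path used by the insmod command:
//   $RTE_SDK/$RTE_TARGET/kmod/igb_uio.ko
std::string IgbUioModulePath(const std::string& rteSdk, const std::string& rteTarget)
{
    // Both variables must be non-empty for the path to be meaningful.
    assert(!rteSdk.empty() && !rteTarget.empty());
    return rteSdk + "/" + rteTarget + "/kmod/igb_uio.ko";
}

// Read an environment variable, returning an empty string if it is unset.
std::string EnvOrEmpty(const char* name)
{
    const char* value = std::getenv(name);
    return value ? std::string(value) : std::string();
}
```

A caller would typically read both variables with EnvOrEmpty("RTE_SDK") and EnvOrEmpty("RTE_TARGET"), report an error if either is empty, and only then pass them to IgbUioModulePath.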
13.3.2.6. Configure hugepages¶
Refer to System Requirements for detailed instructions.
To allocate hugepages at runtime, write a value such as 256 to the nr_hugepages file as root:
echo 256 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
To allocate hugepages at boot time, edit /etc/default/grub and add the following to GRUB_CMDLINE_LINUX_DEFAULT:
hugepages=256
We suggest a minimum of 256 hugepages to run our applications. (This is based on testing an application running at 1 Gbps on a 1 Gbps NIC.) With the 2048 kB page size used above, 256 hugepages reserve 512 MiB of memory. You can use any number of hugepages based on your system capacity and application requirements.
Then update the grub configurations using:
sudo update-grub
OR
sudo update-grub2
You will need to reboot your system in order to see these changes.
To check allocation of hugepages, run:
cat /proc/meminfo | grep HugePages
You will see the number of hugepages allocated, which should equal the number you used above.
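The same check can be done programmatically. The sketch below parses text in the format of /proc/meminfo and returns the HugePages_Total value, so that it can be compared against the requested count. ParseHugePagesTotal is a hypothetical helper introduced for illustration. It assumes the field appears at the start of a line as "HugePages_Total:" followed by an integer.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Return the value of the "HugePages_Total:" field from /proc/meminfo-style
// text, or -1 if the field is missing or has no numeric value.
long ParseHugePagesTotal(const std::string& meminfo)
{
    const std::string key = "HugePages_Total:";
    std::istringstream in(meminfo);
    std::string line;
    while (std::getline(in, line))
    {
        // Match only lines that begin with the key.
        if (line.compare(0, key.size(), key) == 0)
        {
            std::istringstream fields(line.substr(key.size()));
            long value = -1;
            if (fields >> value)
            {
                return value;
            }
            return -1;
        }
    }
    return -1;
}
```

To use it on a live system, read the contents of /proc/meminfo into a string (for example with std::ifstream) and pass that string to ParseHugePagesTotal.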
Once the hugepage memory is reserved (at either runtime or boot time), to make the memory available for DPDK use, perform the following steps:
sudo mkdir /mnt/huge
sudo mount -t hugetlbfs nodev /mnt/huge
The mount point can be made permanent across reboots by adding the following line to the /etc/fstab file:
nodev /mnt/huge hugetlbfs defaults 0 0
13.3.3. Usage¶
The status of DPDK support is shown in the output of ./ns3 configure. If it is found, a user should see:
DPDK NetDevice : enabled
The DpdkNetDeviceHelper class supports the configuration of DpdkNetDevice.
+----------------------+
| host 1 |
+----------------------+
| ns-3 simulation |
+----------------------+
| ns-3 Node |
| +----------------+ |
| | ns-3 TCP | |
| +----------------+ |
| | ns-3 IP | |
| +----------------+ |
| | DpdkNetDevice | |
| | 10.1.1.1 | |
| +----------------+ |
| | raw socket | |
|--+----------------+--|
| | eth0 | |
+-------+------+-------+
10.1.1.11
|
+-------------- ( Internet ) ----
Initialization of the DPDK driver requires initialization of the EAL. The EAL requires a PMD (Poll Mode Driver) library to use the NIC. DPDK supports multiple Poll Mode Drivers, and you can use one that works for your NIC. The PMD library can be set via DpdkNetDeviceHelper::SetPmdLibrary, as follows:
DpdkNetDeviceHelper* dpdk = new DpdkNetDeviceHelper();
dpdk->SetPmdLibrary("librte_pmd_e1000.so");
Also, the NIC should be bound to a DPDK driver in order to be used with the EAL. The default driver is uio_pci_generic, which supports most NICs. You can change it using DpdkNetDeviceHelper::SetDpdkDriver, as follows:
DpdkNetDeviceHelper* dpdk = new DpdkNetDeviceHelper();
dpdk->SetDpdkDriver("igb_uio");
13.3.3.1. Attributes¶
The DpdkNetDevice provides a number of attributes:
TxTimeout - The time to wait before transmitting a burst from the Tx buffer (in us). (default: 2000) This attribute is only used to flush the buffer in case it is not filled. It can be decreased for low data rate traffic. For high data rate traffic, it needs no change.
MaxRxBurst - Size of Rx burst. (default: 64) This attribute can be increased for higher data rates.
MaxTxBurst - Size of Tx burst. (default: 64) This attribute can be increased for higher data rates.
MempoolCacheSize - Size of mempool cache. (default: 256) This attribute can be increased for higher data rates.
NbRxDesc - Number of Rx descriptors. (default: 1024) This attribute can be increased for higher data rates.
NbTxDesc - Number of Tx descriptors. (default: 1024) This attribute can be increased for higher data rates.
Note: The default values work well with 1 Gbps traffic.
13.3.3.2. Output¶
As DpdkNetDevice inherits from FdNetDevice, all the output methods provided by FdNetDevice can be used directly.
13.3.3.3. Examples¶
The following examples are provided:
fd-emu-ping.cc: This example can be configured to use the DpdkNetDevice to send ICMP traffic bypassing the kernel over a real channel.
fd-emu-onoff.cc: This example can be configured to measure the throughput of the DpdkNetDevice by sending traffic from the simulated node to a real device using the ns3::OnOffApplication while leveraging DPDK’s fast packet processing abilities. This is achieved by saturating the channel with TCP/UDP traffic.