1. Preface
I have just started working with DPDK and found many of its concepts hard to understand, so this article writes down the key points in order to build a real understanding of the DPDK basics. That foundation makes it easier to write programs later. There is no point in rushing to keep up with a schedule: the more you rush, the slower you may end up, so take your time. This article aims at understanding. It draws on many other articles, looks up each unfamiliar term as it comes up, and does not go deep, only far enough to grasp the meaning. It is a compilation based on several articles; if anything infringes, please let me know. Thank you!
2. Overall understanding
History: as the number of CPU cores and the network bandwidth both increase, the performance required of network packet processing keeps rising, but today's operating systems handle network packets inefficiently. The inefficiency shows up in three ways:
1) Arriving packets are signaled by interrupts, and the CPU can only handle a limited number of interrupts. When the network carries a large number of small packets, the CPU cannot process them in time and congestion results. (In the past, the CPU was much faster than network devices, so interrupts were very effective.)
2) The operating system's protocol stack processes packets on a single core, so it cannot take advantage of today's multicore CPUs.
3) A packet travels from the NIC to kernel space and then to user space, and the data is copied several times along the way, which hurts performance.
DPDK stands for Data Plane Development Kit, a software development kit focused on the data plane. It was developed by Intel for its network chips and runs on Linux and FreeBSD. DPDK changes the traditional way of handling network packets by processing them directly in user space, as compared below:
Traditional vs. DPDK packet capture
3. Understanding important concepts
This section covers the main concepts in the DPDK documentation, and how to match each concept to the actual parameters of our own machine.
3.1 PPS: packet forwarding rate
PPS is the number of frames that can be sent in one second; on Ethernet these are Ethernet frames. The interface bandwidth we usually quote, 1 Gbit/s or 10 Gbit/s, is the highest rate the Ethernet interface can transmit, in bits per second. In practice, frames on the wire are separated by an inter-frame gap (12 bytes), and each frame is preceded by a preamble (7 bytes) and a start frame delimiter (1 byte).
Theoretical frame forwarding rate = BitRate / 8 / (preamble + inter-frame gap + start frame delimiter + packet length)
Structure in Ethernet frame transmission
Taking 10 Gbit/s (what is commonly called 10-gigabit fiber) as an example, calculate the forwarding rate for 64-byte packets.
Minimum frame size
Network rates use decimal units, so 10 Gbit/s is 10 × 10^9 bit/s:
10 × 10^9 / ((12 + 7 + 1 + 64) × 8) = 14,880,952.38 pps ≈ 14.88 Mpps (million packets per second)
That is, about 14.88 million packets can be sent per second. Note that the Data field is between 46 and 1500 bytes long, so the minimum frame length is 6 + 6 + 2 + 46 + 4 = 64 bytes.
Line rate: the maximum rate that the NIC or the network supports.
Summary data:
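The calculation above can be repeated for other frame sizes with a few lines of Python. This is a minimal sketch rather than anything taken from DPDK: the function names line_rate_pps and arrival_ns are made up for illustration, and the overhead constants are the byte counts quoted in the text.

```python
# Per-frame overhead on the wire, in bytes (values quoted in the text).
PREAMBLE = 7
START_FRAME_DELIMITER = 1
INTER_FRAME_GAP = 12


def line_rate_pps(bit_rate: float, frame_len: int) -> float:
    """Theoretical frames per second for a link rate (bit/s) and frame length (bytes)."""
    wire_bytes = PREAMBLE + START_FRAME_DELIMITER + INTER_FRAME_GAP + frame_len
    return bit_rate / 8 / wire_bytes


def arrival_ns(bit_rate: float, frame_len: int) -> float:
    """Time between consecutive frames at line rate, in nanoseconds."""
    return 1e9 / line_rate_pps(bit_rate, frame_len)


if __name__ == "__main__":
    # Print rate and inter-arrival time for common frame sizes at 10 Gbit/s.
    for frame_len in (64, 128, 256, 512, 1024, 1518):
        pps = line_rate_pps(10e9, frame_len)
        gap = arrival_ns(10e9, frame_len)
        print(f"{frame_len:5d} B  {pps / 1e6:8.3f} Mpps  {gap:8.1f} ns")
```

For 64-byte frames at 10 Gbit/s this gives about 14.88 Mpps, which leaves roughly 67.2 ns to process each packet.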
NIC speed limits
arrival: the time interval between consecutive packets.
rte: runtime environment.
eal: environment abstraction layer.
3.2 UIO: user-space I/O
UIO is a small kernel module that maps device memory into user space and registers interrupts. uio_pci_generic is a Linux kernel module that provides this function; load it with modprobe uio_pci_generic. However, it does not support virtual functions, so DPDK provides an alternative module, igb_uio, which is loaded with:
sudo modprobe uio
sudo insmod kmod/igb_uio.ko
VFIO is a framework that securely exposes device I/O, interrupts, DMA and so on to user space (userspace), so that device drivers can be implemented in user space. User space accesses the device directly, and devices assigned to virtual machines can achieve higher I/O performance. Reference: https://blog.csdn.net/wentyoo...
Load the vfio driver with:
sudo modprobe vfio-pci
Bind two 82599 Ethernet ports to VFIO:
./tools/dpdk_nic_bind.py -b vfio-pci 03:00.0 03:00.1
Bind the 82599 Ethernet ports to igb_uio:
./tools/dpdk_nic_bind.py -b igb_uio 03:00.0 03:00.1
For configuring the vfio driver mode, see http://www.cnblogs.com/vancas...
Both are NIC driver modules for user space. The difference, as commonly described, is that VFIO relies on the IOMMU and offers better performance and security, but it requires support from both the system and the BIOS. Check the binding status with the tool:
NIC binding status
Explanation: in the output above, drv= shows the driver the NIC is currently using, and unused= lists compatible drivers that are not in use. Bind command: ./dpdk-devbind.py --bind=ixgbe 01:00.0
Binding a NIC to a driver
Note that while a NIC is bound to a DPDK driver, it is not visible to ifconfig.
PMD (Poll Mode Driver): DPDK replaces interrupt-driven packet handling with this polling mode.
RSS (Receive Side Scaling) is a NIC driver technology that distributes received packets efficiently across multiple CPUs in a multiprocessor system.
1) The NIC parses the received packet and extracts the five-tuple: IP addresses, protocol, and ports.
2) The NIC computes a hash value from the five-tuple using the configured hash function; the hash can also be computed over a two-, three-, or four-tuple.
3) The low-order bits of the hash value (how many depends on the NIC) are used as the index into the RETA (redirection table).
4) The packet is dispatched to the CPU given by the value stored in that RETA entry.
DPDK supports setting a static hash value and configuring the RETA. However, RSS in DPDK is port-based and distributes packets across the receive queues of the port. For example, if we configure three receive queues (0, 1, 2) and enable RSS, it looks like this:
Applications running on different CPUs read packets from different receive queues, which achieves the packet distribution. In DPDK, RSS is enabled through the mq_mode field of rte_eth_conf: rx_mode.mq_mode = ETH_MQ_RX_RSS. Once RSS is enabled, the rte_pktmbuf of each packet carries the hash value computed by RSS, accessible as pktmbuf.hash.rss. Later processing stages, such as fast forwarding or flow identification, can use this value directly instead of recomputing the hash.
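The step from hash value to receive queue can be sketched in Python. This is only an illustration of the lookup, not NIC or DPDK code: the table size of 128 entries and the round-robin fill are assumptions, and build_reta and select_queue are hypothetical helper names.

```python
from typing import List

RETA_SIZE = 128   # assumed table size; real NICs differ
NUM_QUEUES = 3    # receive queues 0, 1, 2, as in the example in the text


def build_reta(reta_size: int, num_queues: int) -> List[int]:
    """Fill the redirection table round-robin with queue ids (an assumed layout)."""
    if reta_size & (reta_size - 1):
        raise ValueError("reta_size must be a power of two")
    return [i % num_queues for i in range(reta_size)]


def select_queue(hash_value: int, reta: List[int]) -> int:
    """Use the low-order bits of the hash as the RETA index and return the queue id."""
    index = hash_value & (len(reta) - 1)
    return reta[index]


if __name__ == "__main__":
    reta = build_reta(RETA_SIZE, NUM_QUEUES)
    for h in (0x00000001, 0x00000082, 0xDEADBEEF):
        print(f"hash {h:#010x} -> queue {select_queue(h, reta)}")
```

Because the queue depends only on the hash, every packet of a flow with the same tuple lands on the same queue, and rewriting RETA entries changes the distribution without touching the hash function.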
3.7 Symmetric RSS
In network applications, if RSS distributes both directions of the same connection to the same CPU, it is called symmetric RSS. DPDK's default hash does not do this. For example, when we need to parse HTTP messages, ordinary RSS may send the request and the response to different CPUs, so the two directions cannot be matched. For DPDK to support this, the hash algorithm needs to be replaced.
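The symmetry property can be demonstrated with a small software model of the Toeplitz hash, which is the hash commonly used for RSS. Note that this sketch does not replace the algorithm as the text suggests; it swaps in a different, commonly cited technique, which is to keep Toeplitz but use a key that repeats every 16 bits (0x6D5A here). The key choice, the IPv4/TCP input layout, and the helper names are assumptions for illustration and do not come from the original article.

```python
import ipaddress
import struct


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """32-bit Toeplitz hash: XOR the 32-bit key window at every set input bit."""
    if len(key) < len(data) + 4:
        raise ValueError("key too short for this input")
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    result = 0
    for bit in range(len(data) * 8):
        if data[bit // 8] & (0x80 >> (bit % 8)):
            # Key window that starts `bit` bits from the left of the key.
            result ^= (key_int >> (key_bits - 32 - bit)) & 0xFFFFFFFF
    return result


def flow_bytes(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> bytes:
    """Hash input for IPv4 with ports: src IP, dst IP, src port, dst port."""
    return (ipaddress.IPv4Address(src_ip).packed
            + ipaddress.IPv4Address(dst_ip).packed
            + struct.pack("!HH", src_port, dst_port))


# Assumed symmetric key: a 16-bit pattern repeated to 40 bytes.
SYMMETRIC_KEY = bytes([0x6D, 0x5A]) * 20

if __name__ == "__main__":
    request = flow_bytes("10.0.0.1", "192.168.1.20", 40000, 80)
    response = flow_bytes("192.168.1.20", "10.0.0.1", 80, 40000)
    print(hex(toeplitz_hash(SYMMETRIC_KEY, request)))
    print(hex(toeplitz_hash(SYMMETRIC_KEY, response)))
```

Swapping source and destination moves every input bit by 32 positions (addresses) or 16 positions (ports). With a key that repeats every 16 bits, the key window at the new position is identical, so both directions produce the same hash and therefore the same queue.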
3.8 NUMA architecture
NUMA (Non-Uniform Memory Access, also written Non-Uniform Memory Architecture) systems give each processor its own local memory. Access to local memory is fast; access to memory attached to another processor has to go over the bus and is slower.
Classic computer architecture
3.9 Hugepages: large-page memory
The operating system allocates memory page by page, and the page size is usually 4 kB. With a fixed page size, the more memory there is, the more pages there are and the slower multi-level page-table lookups become. The TLB speeds up memory access, but it holds only a small number of page entries, so the number of pages has to be reduced by increasing the page size to 2 MB, 1 GB, and so on. DPDK mainly uses two hugepage sizes, 2M and 1G. Which of them are supported depends on the CPU and can be read from the CPU flags: if the flags include pse, 2M hugepages are supported; if they include pdpe1gb, 1G hugepages are supported.
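Both points can be checked with a short Python sketch: how many pages a given amount of memory needs at each page size, and whether the CPU flags advertise hugepage support. It assumes a Linux system where the flags appear on a "flags" line of /proc/cpuinfo; hugepage_support and pages_needed are hypothetical helper names.

```python
def hugepage_support(cpuinfo_text: str) -> dict:
    """Report 2M/1G hugepage support from the 'flags' lines of /proc/cpuinfo text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            flags.update(line.split(":", 1)[1].split())
    return {"2M": "pse" in flags, "1G": "pdpe1gb" in flags}


def pages_needed(mem_bytes: int, page_size: int) -> int:
    """Number of pages needed to cover mem_bytes (ceiling division)."""
    return -(-mem_bytes // page_size)


if __name__ == "__main__":
    one_gib = 1 << 30
    for name, size in (("4 kB", 4 << 10), ("2 MB", 2 << 20), ("1 GB", 1 << 30)):
        print(f"{name} pages to map 1 GiB: {pages_needed(one_gib, size)}")
    try:
        with open("/proc/cpuinfo") as f:
            print(hugepage_support(f.read()))
    except OSError:
        print("/proc/cpuinfo is not available on this system")
```

Mapping 1 GiB takes 262144 pages at 4 kB but only 512 pages at 2 MB, which is why hugepages put far less pressure on the TLB.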
CPU hugepage support
Check the memory page information
4. Important module division
The following are the important kernel module divisions .
Important module division