Friday, 27 September 2013

Network driver descriptors: ath9k as reference

DMA is one of the most important things for any network driver developer to understand. In this post, we will explore a few details of DMA in the Tx path, using ath9k as the reference driver. The picture should be similar for ath5k, madwifi or any other Atheros driver, including Fusion and Aquila.

Before going into the details, let me introduce the components involved. Mainly we have the host (on which the network driver runs) and the MAC (part of the firmware in the wireless card). The MAC consists of several sub-components, including the Queue Control Unit (QCU) and the DCF Control Unit (DCU). In the Tx path, frame transmission begins at a QCU, and the frame is later passed to the corresponding DCU for transmission over the air. There is one QCU for each category. The categories include Best Effort, Background, Video, Voice, Beacon, UAPSD and CAB.

The MAC is responsible for transferring frames between host memory and the card. All transfers happen through a structure called a descriptor. The host creates and populates the descriptors, then hands them to the MAC for further processing. Please note that the MAC needs physical addresses, not virtual addresses. Hence the host needs to map the virtual address of a descriptor to a physical address before passing it to the MAC.

Descriptor:

Descriptors are device specific. When they are passed to the MAC, they are maintained as a linked list, so the descriptor structure must contain a pointer to the next descriptor. The MAC processes the descriptors in list order and raises an interrupt (TXEOL) at the end of the list.

Also, the tx descriptor should contain the physical address of the actual buffer to be transmitted.
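To make the linked-list idea concrete, here is a minimal user-space sketch. It is not ath9k code: struct demo_desc, the demo_* helpers and the bus addresses are all made up for illustration, and a contiguous array stands in for DMA memory. The link field holds the "physical" address of the next descriptor, and 0 marks the end of the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified descriptor: only the fields discussed above. */
struct demo_desc {
    uint32_t link;   /* "physical" address of the next descriptor, 0 = end */
    uint32_t data0;  /* "physical" address of the frame buffer */
    uint32_t len;    /* length of the frame buffer in bytes */
};

#define DEMO_NDESC     4
#define DEMO_DESC_BASE 0x10000000u  /* made-up bus address of demo_pool[0] */

static struct demo_desc demo_pool[DEMO_NDESC];

/* Bus address of descriptor i (the pool is contiguous in this model). */
static uint32_t demo_desc_paddr(size_t i)
{
    return DEMO_DESC_BASE + (uint32_t)(i * sizeof(struct demo_desc));
}

/* Reverse lookup: which descriptor lives at this bus address? */
static struct demo_desc *demo_desc_from_paddr(uint32_t paddr)
{
    size_t i = (paddr - DEMO_DESC_BASE) / sizeof(struct demo_desc);

    assert(paddr >= DEMO_DESC_BASE && i < DEMO_NDESC);
    return &demo_pool[i];
}

/* Host side: chain the first n descriptors and return the head address. */
static uint32_t demo_build_chain(size_t n)
{
    assert(n >= 1 && n <= DEMO_NDESC);
    for (size_t i = 0; i < n; i++) {
        demo_pool[i].data0 = 0x20000000u + (uint32_t)(i * 2048); /* fake buffer */
        demo_pool[i].len = 1500;
        demo_pool[i].link = (i + 1 < n) ? demo_desc_paddr(i + 1) : 0;
    }
    return demo_desc_paddr(0);
}

/*
 * Device side, conceptually: follow the link pointers until one is 0.
 * A real MAC would raise TXEOL at that point; here we only count the
 * descriptors that were visited.
 */
static size_t demo_mac_walk(uint32_t head)
{
    size_t count = 0;

    for (uint32_t cur = head; cur != 0; ) {
        const struct demo_desc *d = demo_desc_from_paddr(cur);

        count++;
        cur = d->link;
    }
    return count;
}
```

Calling demo_build_chain(3) and passing the returned head to demo_mac_walk visits three descriptors and then stops at the zero link.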

Descriptors are useful not only for passing frames to the MAC, but also for fetching the Tx and Rx status after transmission and reception respectively. This information is fetched through a call into the HAL layer. For example, please see the definition of ath_tx_edma_tasklet in the ath9k driver, which fetches the Tx status from the HAL layer using the function ath9k_hw_txprocdesc.

As an example, please see below the descriptor used for Atheros based cards. It contains 24 32-bit words, followed by 8 words of padding, for 128 bytes in total. This definition is from the ath9k driver. From the definition we can see that a single descriptor can point to multiple buffers (up to four). However, as of now, the current implementation passes only one buffer.

/* Transmit Control Descriptor */
struct ar9003_txc {
        u32 info;   /* descriptor information */
        u32 link;   /* link pointer */
        u32 data0;  /* data pointer to 1st buffer */
        u32 ctl3;   /* DMA control 3  */
        u32 data1;  /* data pointer to 2nd buffer */
        u32 ctl5;   /* DMA control 5  */
        u32 data2;  /* data pointer to 3rd buffer */
        u32 ctl7;   /* DMA control 7  */
        u32 data3;  /* data pointer to 4th buffer */
        u32 ctl9;   /* DMA control 9  */
        u32 ctl10;  /* DMA control 10 */
        u32 ctl11;  /* DMA control 11 */
        u32 ctl12;  /* DMA control 12 */
        u32 ctl13;  /* DMA control 13 */
        u32 ctl14;  /* DMA control 14 */
        u32 ctl15;  /* DMA control 15 */
        u32 ctl16;  /* DMA control 16 */
        u32 ctl17;  /* DMA control 17 */
        u32 ctl18;  /* DMA control 18 */
        u32 ctl19;  /* DMA control 19 */
        u32 ctl20;  /* DMA control 20 */
        u32 ctl21;  /* DMA control 21 */
        u32 ctl22;  /* DMA control 22 */
        u32 ctl23;  /* DMA control 23 */
        u32 pad[8]; /* pad to cache line (128 bytes/32 dwords) */
} __packed __aligned(4);
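The word counts above can be checked at compile time. The following is a user-space model, not the driver's header: the struct name ar9003_txc_model is made up, u32 is mapped to uint32_t, and ctl10 to ctl23 are collapsed into one array for brevity. Since every member is a 32-bit word, the compiler adds no padding of its own, so the kernel's __packed attribute is not needed in this model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;

/* Same word layout as struct ar9003_txc, with ctl10..ctl23 as an array. */
struct ar9003_txc_model {
    u32 info;          /* word 0 */
    u32 link;          /* word 1 */
    u32 data0;         /* word 2 */
    u32 ctl3;          /* word 3 */
    u32 data1;         /* word 4 */
    u32 ctl5;          /* word 5 */
    u32 data2;         /* word 6 */
    u32 ctl7;          /* word 7 */
    u32 data3;         /* word 8 */
    u32 ctl9;          /* word 9 */
    u32 ctl10_23[14];  /* words 10..23 */
    u32 pad[8];        /* words 24..31 */
};

/* 24 descriptor words + 8 pad words = 32 words = 128 bytes. */
_Static_assert(sizeof(struct ar9003_txc_model) == 128,
               "descriptor model must be 128 bytes");
_Static_assert(offsetof(struct ar9003_txc_model, pad) == 24 * sizeof(u32),
               "padding must start after 24 words");

/* Number of words the hardware actually interprets (everything before pad). */
static size_t txc_model_used_words(void)
{
    return offsetof(struct ar9003_txc_model, pad) / sizeof(u32);
}
```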


In ath9k and other Atheros based drivers, there is an abstraction between the actual frame (skb) and the descriptor called "struct ath_buf". This is an intermediary structure holding the important data that must be accessible before and after transmission. It encapsulates data such as the physical and virtual addresses of the descriptor, the physical address of the frame (skb->data), and details such as a pointer to the station information.

Creation of Descriptors:

Descriptors are created once, and their contents must stay consistent between host and device accesses. Hence it is better to use a consistent (coherent) DMA mapping. The corresponding calls are pci_alloc_consistent, dma_alloc_coherent and dmam_alloc_coherent.

As an example, please refer to the definition of the function ath_descdma_setup in any of the Atheros based drivers. In ath9k, the memory is allocated using dmam_alloc_coherent. In some other drivers, the memory is allocated using pci_alloc_consistent. Please observe that multiple descriptors are allocated in a single call.
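The idea of allocating one block and carving it into descriptors can be sketched in user space as follows. This is only a model under stated assumptions: calloc stands in for the coherent DMA allocation, the bus address is a made-up number passed in by the caller, and the struct and field names (demo_descdma, demo_buf, bf_desc and so on) are illustrative and not guaranteed to match the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_DESC_LEN 128u  /* one descriptor, as in the struct above */

/* Per-descriptor bookkeeping, loosely playing the role of struct ath_buf. */
struct demo_buf {
    void *bf_desc;      /* virtual address of the descriptor (host view) */
    uint32_t bf_daddr;  /* bus address of the descriptor (device view) */
};

struct demo_descdma {
    void *dd_desc;           /* virtual base of the whole descriptor block */
    uint32_t dd_desc_paddr;  /* bus base of the whole descriptor block */
    struct demo_buf *bufs;   /* one entry per descriptor */
    size_t nbuf;
};

/*
 * Allocate nbuf descriptors as ONE contiguous block and record both
 * addresses for each of them. Returns 0 on success, -1 on failure.
 */
static int demo_descdma_setup(struct demo_descdma *dd, size_t nbuf,
                              uint32_t fake_bus_base)
{
    if (nbuf == 0)
        return -1;

    dd->dd_desc = calloc(nbuf, DEMO_DESC_LEN);  /* stand-in for coherent DMA */
    dd->bufs = calloc(nbuf, sizeof(*dd->bufs));
    if (!dd->dd_desc || !dd->bufs) {
        free(dd->dd_desc);
        free(dd->bufs);
        dd->dd_desc = NULL;
        dd->bufs = NULL;
        return -1;
    }

    dd->dd_desc_paddr = fake_bus_base;
    dd->nbuf = nbuf;
    for (size_t i = 0; i < nbuf; i++) {
        dd->bufs[i].bf_desc = (uint8_t *)dd->dd_desc + i * DEMO_DESC_LEN;
        dd->bufs[i].bf_daddr = fake_bus_base + (uint32_t)(i * DEMO_DESC_LEN);
    }
    return 0;
}

static void demo_descdma_cleanup(struct demo_descdma *dd)
{
    free(dd->dd_desc);
    free(dd->bufs);
    dd->dd_desc = NULL;
    dd->bufs = NULL;
    dd->nbuf = 0;
}
```

The point of the sketch is that the virtual and bus addresses of descriptor i are both "base + i * descriptor size", so the host can keep the pair side by side and hand only the bus address to the device.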

Populate the descriptor:

When a frame is handed to the driver for transmission, its physical address (obtained by DMA mapping) must be written into a descriptor, and the address of that descriptor must be passed to the hardware.

Please see the definition of the function ath_tx_setup_buffer in the ath9k driver. First an ath_buf is dequeued from the list of free buffers, and then the frame (skb) is DMA mapped using dma_map_single.

Please note that we are using dma_map_single for skbs and dmam_alloc_coherent for descriptors. dma_map_single is a streaming DMA routine. Generally, once the skb is mapped, the host does not modify its contents (except in some special cases like UAPSD). Hence streaming DMA is fine for mapping skbs.

Please also observe that the physical address is saved in one of the fields of ath_buf (bf_buf_addr) for later use. The actual values of the descriptor are populated in HAL related functions. One such function is ar9003_set_txdesc. It is invoked through a function pointer from ath9k_hw_set_txdesc, which is invoked from ath_tx_fill_desc (in some drivers the corresponding function might be ath_hal_filltxdesc), which in turn is invoked from ath_tx_send_normal. You can see that the physical address of the frame (bf_buf_addr) is used here and written into the corresponding field of the descriptor.
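As a rough illustration of the "populate" step, the sketch below writes a buffer address and length into the first words of a descriptor. It is not the real ar9003_set_txdesc: struct demo_txc, demo_set_txdesc and in particular the DEMO_BUFLEN_* bit layout are assumptions chosen for the example, so please check the hardware headers for the real field positions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for illustration: buffer length in bits 27:16 of ctl3. */
#define DEMO_BUFLEN_SHIFT 16
#define DEMO_BUFLEN_MASK  0x0fff0000u

/* First four words of a Tx descriptor, as in the struct shown earlier. */
struct demo_txc {
    uint32_t info;
    uint32_t link;   /* bus address of the next descriptor, 0 if none */
    uint32_t data0;  /* bus address of the first (and only) buffer */
    uint32_t ctl3;   /* carries the length of the first buffer */
};

/*
 * Fill one descriptor for a single-buffer frame.
 * buf_addr is the DMA-mapped address of the frame (bf_buf_addr in the text).
 * Returns 0 on success, -1 if the length does not fit in the length field.
 */
static int demo_set_txdesc(struct demo_txc *d, uint32_t buf_addr,
                           uint32_t buf_len, uint32_t link)
{
    if (buf_len > (DEMO_BUFLEN_MASK >> DEMO_BUFLEN_SHIFT))
        return -1;

    d->link = link;
    d->data0 = buf_addr;
    d->ctl3 = (buf_len << DEMO_BUFLEN_SHIFT) & DEMO_BUFLEN_MASK;
    return 0;
}

/* Read the length back, e.g. for debugging. */
static uint32_t demo_txdesc_buflen(const struct demo_txc *d)
{
    return (d->ctl3 & DEMO_BUFLEN_MASK) >> DEMO_BUFLEN_SHIFT;
}
```

Rejecting an oversized length instead of silently masking it is a design choice of this sketch; a truncated length would make the device transmit a partial frame.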

Pass the descriptor to the hardware:

Once the descriptor is populated, it is passed to the hardware. In ath9k and other Atheros drivers, the corresponding function is ath_tx_txqaddbuf. There are two different paths here.

With enhanced DMA, the descriptor is given directly to the corresponding queue. The corresponding function call in ath9k is ath9k_hw_puttxbuf. Please observe that the queue number and the physical address of the descriptor (bf->bf_daddr) are passed to this function.

If enhanced DMA is not supported and the queue is empty, the descriptor is likewise passed directly to the corresponding queue. If the queue is not empty, the descriptor is appended to the queue. In the ath9k driver, you can see ath9k_hw_set_desc_link being invoked to append frames to the Tx queue. The frames in the queue are processed in FIFO order.
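The two paths can be summarised in a small user-space model. Everything here is hypothetical (struct demo_txq, demo_txqaddbuf, the counters): writing hw_txdp stands in for ath9k_hw_puttxbuf, and patching the previous descriptor's link stands in for ath9k_hw_set_desc_link. It only illustrates the decision, not the driver's locking, FIFO handling or register access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_desc {
    uint32_t link;  /* bus address of the next descriptor, 0 = end of list */
};

struct demo_txq {
    int edma;                 /* non-zero if enhanced DMA is supported */
    uint32_t hw_txdp;         /* last address handed directly to the queue */
    unsigned puttxbuf_calls;  /* how often the direct path was taken */
    struct demo_desc *tail;   /* last descriptor on a legacy queue, or NULL */
};

/* Queue one descriptor whose bus address is daddr. */
static void demo_txqaddbuf(struct demo_txq *q, struct demo_desc *d,
                           uint32_t daddr)
{
    d->link = 0;  /* the new descriptor is the end of the list */

    if (q->edma || q->tail == NULL) {
        /* Enhanced DMA, or empty legacy queue: hand it over directly. */
        q->hw_txdp = daddr;
        q->puttxbuf_calls++;
    } else {
        /* Non-empty legacy queue: link it behind the current tail. */
        q->tail->link = daddr;
    }

    if (!q->edma)
        q->tail = d;  /* remember the tail for the next append */
}
```

On a legacy queue, the first call takes the direct path and later calls only patch the link of the previous tail; on an enhanced DMA queue, every call takes the direct path.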




7 comments:

  1. excellent work Venkat Ch, i really appreciate the quality of detailed description you have given.

  2. Hello Venkat Ch, i am in wifi device driver development(ath9k rit now) field and just now i have started learning. For ease i was tracing the call flow and was trying to figure out how internally this stuff works. rit now i m facing a lil bit difficulty in understanding the code, it is really very confusing and difficult to trace it, if possible then can u plz help in out by giving the description like above and make me understand how functional flow goes from user space to kernel space. i have taken "iw" command as reference and tried tracing functional flow but i am not understanding much... so plz help

    Replies
    1. Hi Gaurav, I am a new comer for wifi device driver development. As you mentioned, you can trace the functional flow using iw command. I am curious about what kind of development tool you are using to do this? Thanks so much in advance!

  3. Hi Gaurav,

    Thank you for the comment.

    iw uses Netlink sockets to communicate with the driver. At the driver side corresponding components are nl80211/cfg80211. Please read about Netlink sockets and in parallel grep for corresponding snippets in nl80211/cfg80211. It will help to get started.

    Thanks & Regards
    Venkat

    Replies
    1. Hi Venkat,

      Thanks for replying, actually i tried searching for some kind of documentation related to nl80211/cfg80211 but i didnt get much information, if you have any useful site or some softcopy documents please share with me if you dont mind

    2. Hi Gaurav,
      Sorry. Unfortunately I am not aware of any documentation.

  4. hi
    i am working over frame aggregation in the 80211n,i have a question in the mac80211 and ath9k driver : how AP select next station for forwarding their packets when multi stations has packet in the AP ?
    if all packet for all station after commint to the driver queuing after scheduling packet or scheduling station for forward packet?

    thanks
