Net DIM - Generic Network Dynamic Interrupt Moderation
======================================================

Author:
	Tal Gilboa <talgi@mellanox.com>


Contents
=========

- Assumptions
- Introduction
- The Net DIM Algorithm
- Registering a Network Device to DIM
- Example

Part 0: Assumptions
======================

This document assumes the reader has basic knowledge of network drivers
and of interrupt moderation in general.


Part I: Introduction
======================

Dynamic Interrupt Moderation (DIM) (in networking) refers to changing the
interrupt moderation configuration of a channel in order to optimize packet
processing. The mechanism includes an algorithm which decides if and how to
change moderation parameters for a channel, usually by analysing runtime data
sampled from the system. Net DIM is such a mechanism. In each iteration of the
algorithm, it analyses a given data sample, compares it to the previous sample
and, if required, decides to change some of the interrupt moderation
configuration fields. The data sample is composed of data bandwidth, the number
of packets and the number of events. The time between samples is also measured.
Net DIM compares the current and the previous data and returns an adjusted
interrupt moderation configuration object. In some cases, the algorithm might
decide not to change anything. The configuration fields are the minimum
duration (microseconds) allowed between events and the maximum number of wanted
packets per event. The Net DIM algorithm prioritizes increasing bandwidth over
reducing the interrupt rate.


Part II: The Net DIM Algorithm
===============================

Each iteration of the Net DIM algorithm follows these steps:
1. Calculates a new data sample.
2. Compares it to the previous sample.
3. Makes a decision - suggests interrupt moderation configuration fields.
4. Schedules a work function, which applies the suggested configuration.

The first two steps are straightforward: both the new and the previous data are
supplied by the driver registered to Net DIM. The previous data is the new data
supplied to the previous iteration. The comparison step checks the difference
between the new and previous data and classifies the result of the most recent
step taken by the algorithm. A step is considered "better" if bandwidth
increases and "worse" if bandwidth decreases. If there is no change in
bandwidth, the packet rate is compared in a similar fashion - increase ==
"better" and decrease == "worse". If there is no change in the packet rate
either, the interrupt rate is compared. Here the algorithm tries to optimize
for a lower interrupt rate, so an increase in the interrupt rate is considered
"worse" and a decrease is considered "better". Step #2 has an optimization for
avoiding false results: it only considers a difference between samples as valid
if it is greater than a certain percentage. Also, since Net DIM does not
measure anything by itself, it assumes the data provided by the driver is
valid.

Step #3 decides on the suggested configuration based on the result from step #2
and the internal state of the algorithm. The states reflect the "direction" of
the algorithm: is it going left (reducing moderation), right (increasing
moderation) or standing still. Another optimization is that if a decision
to stay still is made multiple times, the interval between iterations of the
algorithm increases in order to reduce calculation overhead. Also, after
"parking" on the leftmost or rightmost decision, the algorithm may decide to
verify this decision by taking a step in the other direction. This is done in
order to avoid getting stuck in a "deep sleep" scenario. Once a decision is
made, an interrupt moderation configuration is selected from the predefined
profiles.
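
The comparison order described for step #2 (bandwidth first, then packet rate,
then interrupt rate, each subject to a significance threshold) can be sketched
in plain C as follows. This is a minimal illustration, not the Net DIM
implementation: the 10% threshold, the per-millisecond units and every type and
function name below are assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical names throughout; this is not the kernel's implementation. */
enum cmp_result { CMP_SAME, CMP_BETTER, CMP_WORSE };

/* Rates derived from two consecutive raw samples (assumed units). */
struct rates {
	uint32_t bytes_per_ms;	/* bandwidth */
	uint32_t pkts_per_ms;	/* packet rate */
	uint32_t events_per_ms;	/* interrupt rate */
};

/* Assumed threshold: differences of 10% or less are treated as noise. */
#define SIGNIFICANT_PCT 10

/* Nonzero if cur differs from prev by more than SIGNIFICANT_PCT percent. */
int significant(uint32_t cur, uint32_t prev)
{
	uint64_t diff = cur > prev ? cur - prev : prev - cur;

	if (prev == 0)
		return cur != 0;
	return diff * 100 > (uint64_t)prev * SIGNIFICANT_PCT;
}

/* Classify the current rates against the previous ones. */
enum cmp_result compare_rates(const struct rates *cur,
			      const struct rates *prev)
{
	/* Bandwidth has the highest priority: more is better. */
	if (significant(cur->bytes_per_ms, prev->bytes_per_ms))
		return cur->bytes_per_ms > prev->bytes_per_ms ?
		       CMP_BETTER : CMP_WORSE;

	/* Then the packet rate: more is better. */
	if (significant(cur->pkts_per_ms, prev->pkts_per_ms))
		return cur->pkts_per_ms > prev->pkts_per_ms ?
		       CMP_BETTER : CMP_WORSE;

	/* Finally the interrupt rate: fewer interrupts is better. */
	if (significant(cur->events_per_ms, prev->events_per_ms))
		return cur->events_per_ms < prev->events_per_ms ?
		       CMP_BETTER : CMP_WORSE;

	return CMP_SAME;
}
```

With these assumed values, a 5% change in bandwidth falls below the threshold
and the comparison moves on to the packet rate, while a 20% change in bandwidth
decides the result on its own.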

The last step is to notify the registered driver that it should apply the
suggested configuration. This is done by scheduling a work function, defined by
the Net DIM API and provided by the registered driver.

As you can see, Net DIM itself does not actively interact with the system. It
would have trouble making the correct decisions if the wrong data is supplied to
it, and it would be useless if the work function did not apply the suggested
configuration. This does, however, allow the registered driver some room for
manoeuvre, as it may provide partial data or ignore the algorithm's suggestion
under some conditions.


Part III: Registering a Network Device to DIM
==============================================

The Net DIM API exposes the main function net_dim(struct net_dim *dim,
struct net_dim_sample end_sample). This function is the entry point to the Net
DIM algorithm and has to be called every time the driver would like to check if
it should change interrupt moderation parameters. The driver should provide two
data structures: struct net_dim and struct net_dim_sample. Struct net_dim
describes the state of DIM for a specific object (RX queue, TX queue,
other queues, etc.). This includes the currently selected profile, previous data
samples, the callback function provided by the driver and more.
Struct net_dim_sample describes a data sample, which will be compared to the
data sample stored in struct net_dim in order to decide on the algorithm's next
step. The sample should include bytes, packets and interrupts, measured by
the driver.

In order to use Net DIM from a networking driver, the driver needs to call the
main net_dim() function. The recommended method is to call net_dim() on each
interrupt. Since Net DIM has built-in moderation and might decide to skip
iterations under certain conditions, there is no need to moderate the net_dim()
calls as well. As mentioned above, the driver needs to provide an object of type
struct net_dim to the net_dim() function call. It is advised for each entity
using Net DIM to hold a struct net_dim as part of its data structure and use it
as the main Net DIM API object. The struct net_dim_sample should hold the latest
byte, packet and interrupt counts. There is no need to perform any calculations;
just include the raw data.

The net_dim() call itself does not return anything. Instead, Net DIM relies on
the driver to provide a callback function, which is called when the algorithm
decides to make a change in the interrupt moderation parameters. This callback
will be scheduled and run in a separate thread in order not to add overhead to
the data flow. After the work is done, the Net DIM algorithm needs to be set to
the proper state in order to move to the next iteration.


Part IV: Example
=================

The following code demonstrates how to register a driver to Net DIM. The actual
usage is not complete but it should make the outline of the usage clear.

my_driver.c:

#include <linux/net_dim.h>

/* Callback for net DIM to schedule on a decision to change moderation */
void my_driver_do_dim_work(struct work_struct *work)
{
	/* Get struct net_dim from struct work_struct */
	struct net_dim *dim = container_of(work, struct net_dim,
					   work);
	/* Do interrupt moderation related stuff */
	...

	/* Signal net DIM work is done and it should move to next iteration */
	dim->state = NET_DIM_START_MEASURE;
}

/* My driver's interrupt handler */
int my_driver_handle_interrupt(struct my_driver_entity *my_entity, ...)
{
	...
	/* A struct to hold current measured data */
	struct net_dim_sample dim_sample;
	...
	/* Initialize data sample struct with current data */
	net_dim_sample(my_entity->events,
		       my_entity->packets,
		       my_entity->bytes,
		       &dim_sample);
	/* Call net DIM */
	net_dim(&my_entity->dim, dim_sample);
	...
}

/* My entity's initialization function (my_entity was already allocated) */
int my_driver_init_my_entity(struct my_driver_entity *my_entity, ...)
{
	...
	/* Initialize struct work_struct with my driver's callback function */
	INIT_WORK(&my_entity->dim.work, my_driver_do_dim_work);
	...
}
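
The callback pattern in the example above (recover the DIM object from the work
item, apply the configuration, then reset the state) can be tried outside the
kernel with stand-in types. The sketch below is only an illustration under
stated assumptions: every mock_* name is hypothetical, the work runs inline
rather than on a workqueue, and recovering the owning entity from the embedded
DIM object is one possible driver design, not something the Net DIM API
mandates.

```c
#include <stddef.h>

/* Userspace stand-ins for kernel types; all names here are hypothetical. */
struct mock_work {
	void (*func)(struct mock_work *work);
};

enum mock_dim_state { MOCK_DIM_START_MEASURE, MOCK_DIM_APPLY_NEW_PROFILE };

struct mock_dim {
	enum mock_dim_state state;
	int profile_ix;			/* profile chosen by the algorithm */
	struct mock_work work;		/* embedded work item */
};

struct mock_entity {
	int applied_profile_ix;		/* what the "hardware" currently uses */
	struct mock_dim dim;		/* embedded DIM object */
};

/* Same idea as container_of(): member pointer -> enclosing struct. */
#define mock_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Work callback: recover the owning structures, apply, then re-arm. */
void mock_do_dim_work(struct mock_work *work)
{
	struct mock_dim *dim = mock_container_of(work, struct mock_dim, work);
	struct mock_entity *ent =
		mock_container_of(dim, struct mock_entity, dim);

	/* Stands in for programming the new moderation values. */
	ent->applied_profile_ix = dim->profile_ix;

	/* Signal that the work is done and measuring may start again. */
	dim->state = MOCK_DIM_START_MEASURE;
}

/* Runs the work inline; a real driver would have it scheduled instead. */
int mock_apply_profile(struct mock_entity *ent, int new_profile_ix)
{
	ent->dim.work.func = mock_do_dim_work;
	ent->dim.profile_ix = new_profile_ix;
	ent->dim.state = MOCK_DIM_APPLY_NEW_PROFILE;
	ent->dim.work.func(&ent->dim.work);
	return ent->applied_profile_ix;
}
```

Embedding the work item in the DIM object, and the DIM object in the entity,
means the callback needs no extra context pointer: both owners can be reached
from the single work pointer it receives.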