This document is a starting point for defining the TSO and GSO features. The whole thing is starting to get a bit messy, so I wanted to make sure we have notes somewhere that describe what does and doesn't work.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
parent
802ab55adc
commit
f7a6272bf3
@@ -0,0 +1,130 @@
Segmentation Offloads in the Linux Networking Stack

Introduction
============

This document describes a set of techniques in the Linux networking stack
to take advantage of segmentation offload capabilities of various NICs.

The following technologies are described:
 * TCP Segmentation Offload - TSO
 * UDP Fragmentation Offload - UFO
 * IPIP, SIT, GRE, and UDP Tunnel Offloads
 * Generic Segmentation Offload - GSO
 * Generic Receive Offload - GRO
 * Partial Generic Segmentation Offload - GSO_PARTIAL

TCP Segmentation Offload
========================

TCP segmentation allows a device to segment a single frame into multiple
frames with a data payload size specified in skb_shinfo()->gso_size.
When TCP segmentation is requested, the bit for either SKB_GSO_TCPV4 or
SKB_GSO_TCPV6 should be set in skb_shinfo()->gso_type and
skb_shinfo()->gso_size should be set to a non-zero value.

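As a rough illustration of what gso_size means for the resulting frames,
the sketch below computes the payload size of each segment. It is a
userspace model, not kernel code: the function name and its arguments are
hypothetical, and only the role of gso_size is taken from the text above.

```python
from typing import List


def segment_payload_sizes(payload_len: int, gso_size: int) -> List[int]:
    """Payload size of each frame produced from one large TCP frame.

    payload_len is the number of TCP data bytes in the large frame and
    gso_size models skb_shinfo()->gso_size (hypothetical stand-ins).
    """
    if gso_size <= 0:
        # Segmentation is only meaningful with a non-zero gso_size.
        raise ValueError("gso_size must be a positive value")
    if payload_len < 0:
        raise ValueError("payload_len must not be negative")
    full, rest = divmod(payload_len, gso_size)
    # Every segment carries gso_size bytes except a possibly shorter tail.
    return [gso_size] * full + ([rest] if rest else [])
```

For example, 7000 bytes of data with a gso_size of 1448 would yield four
full segments and one shorter final segment.
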
TCP segmentation depends on support for partial checksum offload. For
this reason TSO is normally disabled if Tx checksum offload is disabled
for a given device.

In order to support TCP segmentation offload it is necessary to populate
the network and transport header offsets of the skbuff so that the device
drivers will be able to determine the offsets of the IP or IPv6 header and
the TCP header. In addition, as CHECKSUM_PARTIAL is required, csum_start
should also point to the TCP header of the packet.

For IPv4 segmentation we support one of two behaviors for the IP ID.
The default behavior is to increment the IP ID with every segment. If the
GSO type SKB_GSO_TCP_FIXEDID is specified then we will not increment the IP
ID and all segments will use the same IP ID. If a device has
NETIF_F_TSO_MANGLEID set then the IP ID can be ignored when performing TSO
and we will either increment the IP ID for all frames, or leave it at a
static value, based on driver preference.

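The two IP ID behaviors can be sketched as follows. This is a userspace
model under stated assumptions: the function and the fixed_id flag are
hypothetical, with fixed_id standing in for SKB_GSO_TCP_FIXEDID. The
wraparound reflects the 16-bit width of the IPv4 ID field.

```python
from typing import List


def segment_ip_ids(first_id: int, num_segments: int,
                   fixed_id: bool = False) -> List[int]:
    """IPv4 ID carried by each segment of one segmented frame.

    fixed_id models SKB_GSO_TCP_FIXEDID (a hypothetical stand-in): when
    set, every segment reuses the ID of the original frame.
    """
    if fixed_id:
        return [first_id & 0xFFFF] * num_segments
    # Default behavior: increment per segment, wrapping at 16 bits.
    return [(first_id + n) & 0xFFFF for n in range(num_segments)]
```

A device with NETIF_F_TSO_MANGLEID set could produce either of the two
sequences, since the IP ID is allowed to be ignored in that case.
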
UDP Fragmentation Offload
=========================

UDP fragmentation offload allows a device to fragment an oversized UDP
datagram into multiple IPv4 fragments. Many of the requirements for UDP
fragmentation offload are the same as for TSO. However, the IPv4 ID for
fragments should not increment, as a single IPv4 datagram is fragmented.

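To make the contrast with TSO concrete, the sketch below lays out the
IPv4 fragments of one oversized UDP datagram. It is a userspace model
with hypothetical names, and it assumes a 20-byte IPv4 header without
options. Every fragment shares the same IPv4 ID, and every fragment
except the last carries a multiple of 8 data bytes, as IPv4 fragment
offsets require.

```python
from typing import List, Tuple

IPV4_HEADER_LEN = 20  # assumption: IPv4 header without options


def ipv4_fragments(datagram_len: int, mtu: int,
                   ip_id: int) -> List[Tuple[int, int, int, bool]]:
    """Lay out the fragments of one UDP datagram.

    datagram_len counts the UDP header plus payload. Each entry is
    (ip_id, byte_offset, length, more_fragments).
    """
    # Fragment data must be a multiple of 8 bytes except in the last one.
    max_data = (mtu - IPV4_HEADER_LEN) // 8 * 8
    if max_data <= 0:
        raise ValueError("mtu too small to carry any fragment data")
    frags = []
    offset = 0
    while offset < datagram_len:
        length = min(max_data, datagram_len - offset)
        more = offset + length < datagram_len
        # The ID stays the same: all fragments belong to one datagram.
        frags.append((ip_id, offset, length, more))
        offset += length
    return frags
```

With a 1500-byte MTU, a 4008-byte datagram (8-byte UDP header plus 4000
bytes of payload) would be carried in three fragments.
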
IPIP, SIT, GRE, UDP Tunnel, and Remote Checksum Offloads
========================================================

In addition to the offloads described above it is possible for a frame to
contain additional headers such as an outer tunnel. To account for such
cases an additional set of segmentation offload types was introduced,
including SKB_GSO_IPIP, SKB_GSO_SIT, SKB_GSO_GRE, and SKB_GSO_UDP_TUNNEL.
These extra segmentation types identify cases where there is more than one
set of headers. For example, in the case of IPIP and SIT the network and
transport headers are moved from the standard list of headers to the
"inner" header offsets.

Currently only two levels of headers are supported. The convention is to
refer to the tunnel headers as the outer headers, while the encapsulated
data is normally referred to as the inner headers. Below is the list of
calls to access the given headers:

IPIP/SIT Tunnel:
                Outer                   Inner
MAC             skb_mac_header
Network         skb_network_header      skb_inner_network_header
Transport       skb_transport_header

UDP/GRE Tunnel:
                Outer                   Inner
MAC             skb_mac_header          skb_inner_mac_header
Network         skb_network_header      skb_inner_network_header
Transport       skb_transport_header    skb_inner_transport_header

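As an illustration of the outer and inner offsets for a UDP tunnel, the
sketch below computes where each header starts in an encapsulated frame.
It is a userspace model, not kernel code. The frame layout is an assumed
VXLAN-style example (untagged Ethernet, IPv4 without options, an 8-byte
tunnel header, TCP without options), and the dictionary keys simply
mirror the accessor names in the table with the skb_ prefix and _header
suffix dropped.

```python
from typing import Dict, List, Tuple


def header_offsets(layers: List[Tuple[str, int]]) -> Dict[str, int]:
    """Map each header name to its byte offset from the start of the frame.

    layers lists (name, header_length) pairs in wire order.
    """
    offsets = {}
    position = 0
    for name, length in layers:
        offsets[name] = position
        position += length
    return offsets


# Assumed example layout for a VXLAN-style UDP tunnel carrying TCP/IPv4.
UDP_TUNNEL_FRAME = [
    ("mac", 14),              # outer Ethernet
    ("network", 20),          # outer IPv4
    ("transport", 8),         # outer UDP
    ("tunnel", 8),            # tunnel header (8 bytes for VXLAN)
    ("inner_mac", 14),        # inner Ethernet
    ("inner_network", 20),    # inner IPv4
    ("inner_transport", 20),  # inner TCP
]
```

Under these assumptions the inner MAC header starts at byte 50, the inner
network header at byte 64, and the inner transport header at byte 84.
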
In addition to the above tunnel types there are also SKB_GSO_GRE_CSUM and
SKB_GSO_UDP_TUNNEL_CSUM. These two additional tunnel types indicate that
the outer header also requests a non-zero checksum to be included in the
outer header.

Finally there is SKB_GSO_TUNNEL_REMCSUM, which indicates that a given
tunnel header has requested a remote checksum offload. In this case the
inner headers will be left with a partial checksum and only the outer
header checksum will be computed.

Generic Segmentation Offload
============================

Generic segmentation offload is a pure software offload that is meant to
deal with cases where device drivers cannot perform the offloads described
above. In GSO, a given skbuff has its data broken out over multiple
skbuffs that have been resized to match the MSS provided via
skb_shinfo()->gso_size.

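The splitting step can be sketched in userspace as follows. This is a
simplified model with hypothetical names, not the kernel implementation:
headers are treated as an opaque byte string that is copied to each
segment, and only the TCP sequence number is tracked. Length and checksum
updates, which real segmentation also needs, are left out.

```python
from typing import List, Tuple


def gso_segment(headers: bytes, payload: bytes, gso_size: int,
                seq: int) -> List[Tuple[bytes, int, bytes]]:
    """Split one large frame into (headers, tcp_seq, data) segments.

    gso_size models skb_shinfo()->gso_size and seq is the TCP sequence
    number of the first payload byte (both hypothetical stand-ins).
    """
    if gso_size <= 0:
        raise ValueError("gso_size must be a positive value")
    segments = []
    for offset in range(0, len(payload), gso_size):
        data = payload[offset:offset + gso_size]
        # Each segment reuses the headers; the sequence number advances
        # by the number of payload bytes that precede this segment.
        segments.append((headers, (seq + offset) & 0xFFFFFFFF, data))
    return segments
```

Concatenating the data of the returned segments gives back the original
payload, which is the property the receive side relies on.
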
Before enabling any hardware segmentation offload, a corresponding
software offload is required in GSO. Otherwise a frame could be re-routed
to a device that lacks the offload and then could not be transmitted.

Generic Receive Offload
=======================

Generic receive offload is the complement to GSO. Ideally any frame
assembled by GRO should be segmented to create an identical sequence of
frames using GSO, and any sequence of frames segmented by GSO should be
able to be reassembled back to the original by GRO. The only exception to
this is the IPv4 ID in the case that the DF bit is set for a given IP
header. If the IPv4 ID values are not sequentially incrementing, they will
be rewritten so that they are when a frame assembled via GRO is segmented
via GSO.

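The round trip and its IPv4 ID exception can be modeled in a few lines.
This is a userspace sketch with hypothetical names: each frame is reduced
to an (ip_id, data) pair, the merge step assumes the frames are already
in order, and it takes the size of the first frame as the segment size.

```python
from typing import List, Tuple

Frame = Tuple[int, bytes]  # (IPv4 ID, payload bytes)


def gro_merge(frames: List[Frame]) -> Tuple[int, int, bytes]:
    """Coalesce in-order frames into (first_ip_id, gso_size, payload)."""
    if not frames:
        raise ValueError("nothing to merge")
    first_id = frames[0][0]
    gso_size = len(frames[0][1])
    payload = b"".join(data for _, data in frames)
    return first_id, gso_size, payload


def gso_resegment(first_id: int, gso_size: int,
                  payload: bytes) -> List[Frame]:
    """Segment a merged frame again, using sequentially incrementing IDs."""
    return [((first_id + n) & 0xFFFF, payload[off:off + gso_size])
            for n, off in enumerate(range(0, len(payload), gso_size))]
```

Frames that arrived with incrementing IDs come back unchanged, while
frames that all carried the same ID come back with identical data but
incrementing IDs.
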
Partial Generic Segmentation Offload
====================================

Partial generic segmentation offload is a hybrid between TSO and GSO. It
takes advantage of certain traits of TCP and tunnels so that, instead of
having to rewrite the packet headers for each segment, only the inner-most
transport header and possibly the outer-most network header need to be
updated. This allows devices that do not support tunnel offloads, or
tunnel offloads with checksum, to still make use of segmentation.

With the partial offload, all headers excluding the inner transport header
are updated such that they contain the values that would be correct if the
header were simply duplicated. The one exception is the outer IPv4 ID
field. It is up to the device drivers to guarantee that the IPv4 ID field
is incremented in the case that a given header does not have the DF bit
set.
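
The division of work described above can be sketched from the device
side. This is a userspace model with hypothetical names, not driver code:
all headers are assumed to be replicated unchanged, so only the two
fields that vary per segment are tracked. Leaving the outer IPv4 ID
static when DF is set is an assumption of this sketch; the text only
requires the increment when DF is not set.

```python
from typing import List, Tuple


def replicate_segments(outer_ip_id: int, outer_df: bool, tcp_seq: int,
                       payload: bytes,
                       gso_size: int) -> List[Tuple[int, int, bytes]]:
    """Return (outer_ip_id, inner_tcp_seq, data) for each segment."""
    if gso_size <= 0:
        raise ValueError("gso_size must be a positive value")
    segments = []
    for n, off in enumerate(range(0, len(payload), gso_size)):
        if outer_df:
            # Assumption: with DF set the ID may be left untouched.
            ip_id = outer_ip_id & 0xFFFF
        else:
            # Without DF the ID has to increment for every segment.
            ip_id = (outer_ip_id + n) & 0xFFFF
        # The inner transport header is the one header updated each time.
        seq = (tcp_seq + off) & 0xFFFFFFFF
        segments.append((ip_id, seq, payload[off:off + gso_size]))
    return segments
```
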