Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

skbuff: rewrite the doc for data-only skbs

The comment about shinfo->dataref split is really unhelpful,
at least to me. Rewrite it and render it to skb documentation.

Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

+35 -10
+1
Documentation/networking/index.rst
···
    sctp
    secid
    seg6-sysctl
+   skbuff
    smc-sysctl
    statistics
    strparser
+6
Documentation/networking/skbuff.rst
···
 get copied, but caller gets a new metadata struct (struct sk_buff).
 &skb_shared_info.refcount indicates the number of skbs pointing at the same
 packet data (i.e. clones).
+
+dataref and headerless skbs
+---------------------------
+
+.. kernel-doc:: include/linux/skbuff.h
+   :doc: dataref and headerless skbs
+28 -10
include/linux/skbuff.h
···
 	skb_frag_t frags[MAX_SKB_FRAGS];
 };
 
-/* We divide dataref into two halves. The higher 16 bits hold references
- * to the payload part of skb->data. The lower 16 bits hold references to
- * the entire skb->data. A clone of a headerless skb holds the length of
- * the header in skb->hdr_len.
+/**
+ * DOC: dataref and headerless skbs
  *
- * All users must obey the rule that the skb->data reference count must be
- * greater than or equal to the payload reference count.
+ * Transport layers send out clones of payload skbs they hold for
+ * retransmissions. To allow lower layers of the stack to prepend their headers
+ * we split &skb_shared_info.dataref into two halves.
+ * The lower 16 bits count the overall number of references.
+ * The higher 16 bits indicate how many of the references are payload-only.
+ * skb_header_cloned() checks if skb is allowed to add / write the headers.
  *
- * Holding a reference to the payload part means that the user does not
- * care about modifications to the header part of skb->data.
+ * The creator of the skb (e.g. TCP) marks its skb as &sk_buff.nohdr
+ * (via __skb_header_release()). Any clone created from marked skb will get
+ * &sk_buff.hdr_len populated with the available headroom.
+ * If there's the only clone in existence it's able to modify the headroom
+ * at will. The sequence of calls inside the transport layer is::
+ *
+ *  <alloc skb>
+ *  skb_reserve()
+ *  __skb_header_release()
+ *  skb_clone()
+ *  // send the clone down the stack
+ *
+ * This is not a very generic construct and it depends on the transport layers
+ * doing the right thing. In practice there's usually only one payload-only skb.
+ * Having multiple payload-only skbs with different lengths of hdr_len is not
+ * possible. The payload-only skbs should never leave their owner.
  */
 #define SKB_DATAREF_SHIFT 16
 #define SKB_DATAREF_MASK ((1 << SKB_DATAREF_SHIFT) - 1)
···
 }
 
 /**
- * __skb_header_release - release reference to header
- * @skb: buffer to operate on
+ * __skb_header_release() - allow clones to use the headroom
+ * @skb: buffer to operate on
+ *
+ * See "DOC: dataref and headerless skbs".
  */
 static inline void __skb_header_release(struct sk_buff *skb)
 {
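
Reader's note, not part of the patch: the dataref arithmetic described in the new DOC comment can be tried out in a small user-space model. This is only a sketch under stated assumptions. The `model_*` structs and functions are hypothetical stand-ins rather than kernel APIs, plain `int` replaces the kernel's atomic counter, and the "references that still care about headers" calculation is an assumption based on the low-half/high-half split the comment describes. Only `SKB_DATAREF_SHIFT` and `SKB_DATAREF_MASK` are taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>

#define SKB_DATAREF_SHIFT 16
#define SKB_DATAREF_MASK ((1 << SKB_DATAREF_SHIFT) - 1)

/* Hypothetical stand-ins for skb_shared_info and sk_buff (not the kernel structs). */
struct model_shinfo {
	int dataref;	/* low 16 bits: all references; high 16 bits: payload-only ones */
};

struct model_skb {
	struct model_shinfo *shinfo;
	bool cloned;
	bool nohdr;
};

/* Fresh buffer: one reference overall, none of them payload-only. */
static void model_init(struct model_skb *skb, struct model_shinfo *shinfo)
{
	shinfo->dataref = 1;
	skb->shinfo = shinfo;
	skb->cloned = false;
	skb->nohdr = false;
}

/* The owner marks itself payload-only: 1 overall reference, 1 payload-only. */
static void model_header_release(struct model_skb *skb)
{
	skb->nohdr = true;
	skb->shinfo->dataref = 1 + (1 << SKB_DATAREF_SHIFT);
}

/* A clone shares the data and takes one more overall reference. */
static void model_clone(struct model_skb *clone, struct model_skb *orig)
{
	clone->shinfo = orig->shinfo;
	clone->nohdr = false;
	clone->cloned = true;
	orig->cloned = true;
	orig->shinfo->dataref += 1;
}

/* True if another holder may still care about the header area. */
static bool model_header_cloned(const struct model_skb *skb)
{
	int dataref, header_users;

	if (!skb->cloned)
		return false;

	dataref = skb->shinfo->dataref;
	/* overall references minus payload-only references */
	header_users = (dataref & SKB_DATAREF_MASK) - (dataref >> SKB_DATAREF_SHIFT);
	return header_users != 1;
}

/* Walks through the call sequence quoted in the DOC comment. */
static void model_demo(void)
{
	struct model_shinfo shinfo;
	struct model_skb owner, clone, second;

	model_init(&owner, &shinfo);		/* <alloc skb> */
	model_header_release(&owner);		/* __skb_header_release() */
	model_clone(&clone, &owner);		/* skb_clone() */

	/* Single clone of a payload-only owner: headroom is writable. */
	assert(!model_header_cloned(&clone));

	/* A second clone also cares about headers, so neither may write. */
	model_clone(&second, &owner);
	assert(model_header_cloned(&clone));
}
```

In this model the owner's payload-only reference is counted in both halves, so subtracting the high half from the low half leaves only the holders that may touch the headers. That matches the comment's point that a sole clone may modify the headroom at will.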