[PATCH net] xfrm: esp: avoid in-place decrypt on shared skb frags

Posted by HexRabbit 4 hours ago
From: Kuan-Ting Chen <h3xrabbit@gmail.com>

MSG_SPLICE_PAGES can attach pages from a pipe directly to an skb. TCP
marks such skbs with SKBFL_SHARED_FRAG after skb_splice_from_iter(),
so later paths that may modify packet data can first make a private
copy. The IPv4/IPv6 datagram append paths did not set this flag when
splicing pages into UDP skbs.

That leaves an ESP-in-UDP packet made from shared pipe pages looking
like an ordinary uncloned nonlinear skb. ESP input then takes the no-COW
fast path for uncloned skbs without a frag_list and decrypts in place
over data that is not owned privately by the skb.

Mark IPv4/IPv6 datagram splice frags with SKBFL_SHARED_FRAG, matching
TCP. Also make ESP input fall back to skb_cow_data() when the flag is
present, so ESP does not decrypt externally backed frags in place.
Private nonlinear skb frags still use the existing fast path.

This intentionally does not change ESP output. In esp_output_head(),
the path that appends the ESP trailer to existing skb tailroom without
calling skb_cow_data() is not reachable for nonlinear skbs:
skb_tailroom() returns zero when skb->data_len is nonzero, while ESP
tailen is positive. Thus ESP output will either use the separate
destination-frag path or fall back to skb_cow_data().

Signed-off-by: Kuan-Ting Chen <h3xrabbit@gmail.com>
---
 net/ipv4/esp4.c       | 3 ++-
 net/ipv4/ip_output.c  | 2 ++
 net/ipv6/esp6.c       | 3 ++-
 net/ipv6/ip6_output.c | 2 ++
 4 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index 6dfc0bcde..6a5febbdb 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -873,7 +873,8 @@ static int esp_input(struct xfrm_state *x, struct sk_buff *skb)
 			nfrags = 1;
 
 			goto skip_cow;
-		} else if (!skb_has_frag_list(skb)) {
+		} else if (!skb_has_frag_list(skb) &&
+			   !skb_has_shared_frag(skb)) {
 			nfrags = skb_shinfo(skb)->nr_frags;
 			nfrags++;
 
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index e4790cc7b..5bcd73cbd 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1233,6 +1233,8 @@ static int __ip_append_data(struct sock *sk,
 			if (err < 0)
 				goto error;
 			copy = err;
+			if (!(flags & MSG_NO_SHARED_FRAGS))
+				skb_shinfo(skb)->flags |= SKBFL_SHARED_FRAG;
 			wmem_alloc_delta += copy;
 		} else if (!zc) {
 			int i = skb_shinfo(skb)->nr_frags;
diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c
index 9f7531373..9c06c5a14 100644
--- a/net/ipv6/esp6.c
+++ b/net/ipv6/esp6.c
@@ -915,7 +915,8 @@ static int esp6_input(struct xfrm_state *x, struct sk_buff *skb)
 			nfrags = 1;
 
 			goto skip_cow;
-		} else if (!skb_has_frag_list(skb)) {
+		} else if (!skb_has_frag_list(skb) &&
+			   !skb_has_shared_frag(skb)) {
 			nfrags = skb_shinfo(skb)->nr_frags;
 			nfrags++;
 
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 7e92909ab..1f2a33fbe 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1794,6 +1794,8 @@ static int __ip6_append_data(struct sock *sk,
 			if (err < 0)
 				goto error;
 			copy = err;
+			if (!(flags & MSG_NO_SHARED_FRAGS))
+				skb_shinfo(skb)->flags |= SKBFL_SHARED_FRAG;
 			wmem_alloc_delta += copy;
 		} else if (!zc) {
 			int i = skb_shinfo(skb)->nr_frags;
-- 
2.43.0
Re: [PATCH net] xfrm: esp: avoid in-place decrypt on shared skb frags
Posted by Steffen Klassert 4 hours ago
We have another patch that addresses this issue in a different way,
so Cc the author of the other patch.

On Mon, May 04, 2026 at 03:34:03PM +0800, HexRabbit wrote:
> From: Kuan-Ting Chen <h3xrabbit@gmail.com>
> 
> MSG_SPLICE_PAGES can attach pages from a pipe directly to an skb. TCP
> marks such skbs with SKBFL_SHARED_FRAG after skb_splice_from_iter(),
> so later paths that may modify packet data can first make a private
> copy. The IPv4/IPv6 datagram append paths did not set this flag when
> splicing pages into UDP skbs.
> 
> That leaves an ESP-in-UDP packet made from shared pipe pages looking
> like an ordinary uncloned nonlinear skb. ESP input then takes the no-COW
> fast path for uncloned skbs without a frag_list and decrypts in place
> over data that is not owned privately by the skb.
> 
> Mark IPv4/IPv6 datagram splice frags with SKBFL_SHARED_FRAG, matching
> TCP. Also make ESP input fall back to skb_cow_data() when the flag is
> present, so ESP does not decrypt externally backed frags in place.
> Private nonlinear skb frags still use the existing fast path.
> 
> This intentionally does not change ESP output. In esp_output_head(),
> the path that appends the ESP trailer to existing skb tailroom without
> calling skb_cow_data() is not reachable for nonlinear skbs:
> skb_tailroom() returns zero when skb->data_len is nonzero, while ESP
> tailen is positive. Thus ESP output will either use the separate
> destination-frag path or fall back to skb_cow_data().
> 
> Signed-off-by: Kuan-Ting Chen <h3xrabbit@gmail.com>
> ---
>  net/ipv4/esp4.c       | 3 ++-
>  net/ipv4/ip_output.c  | 2 ++
>  net/ipv6/esp6.c       | 3 ++-
>  net/ipv6/ip6_output.c | 2 ++
>  4 files changed, 8 insertions(+), 2 deletions(-)

This looks ok to me. From the IPsec point of view, I'm
fine with this patch, but it also touches generic
networking code. So I'd like to hear an opinion of one
of the networking maintainers before proceeding.
Re: [PATCH net] xfrm: esp: avoid in-place decrypt on shared skb frags
Posted by Eric Dumazet 3 hours ago
On Mon, May 4, 2026 at 12:53 AM Steffen Klassert
<steffen.klassert@secunet.com> wrote:
>
> We have another patch that addresses this issue in a different way,
> so Cc the author of the other patch.
>
> On Mon, May 04, 2026 at 03:34:03PM +0800, HexRabbit wrote:
> > From: Kuan-Ting Chen <h3xrabbit@gmail.com>
> >
> > MSG_SPLICE_PAGES can attach pages from a pipe directly to an skb. TCP
> > marks such skbs with SKBFL_SHARED_FRAG after skb_splice_from_iter(),
> > so later paths that may modify packet data can first make a private
> > copy. The IPv4/IPv6 datagram append paths did not set this flag when
> > splicing pages into UDP skbs.
> >
> > That leaves an ESP-in-UDP packet made from shared pipe pages looking
> > like an ordinary uncloned nonlinear skb. ESP input then takes the no-COW
> > fast path for uncloned skbs without a frag_list and decrypts in place
> > over data that is not owned privately by the skb.
> >
> > Mark IPv4/IPv6 datagram splice frags with SKBFL_SHARED_FRAG, matching
> > TCP. Also make ESP input fall back to skb_cow_data() when the flag is
> > present, so ESP does not decrypt externally backed frags in place.
> > Private nonlinear skb frags still use the existing fast path.
> >
> > This intentionally does not change ESP output. In esp_output_head(),
> > the path that appends the ESP trailer to existing skb tailroom without
> > calling skb_cow_data() is not reachable for nonlinear skbs:
> > skb_tailroom() returns zero when skb->data_len is nonzero, while ESP
> > tailen is positive. Thus ESP output will either use the separate
> > destination-frag path or fall back to skb_cow_data().
> >
> > Signed-off-by: Kuan-Ting Chen <h3xrabbit@gmail.com>
> > ---
> >  net/ipv4/esp4.c       | 3 ++-
> >  net/ipv4/ip_output.c  | 2 ++
> >  net/ipv6/esp6.c       | 3 ++-
> >  net/ipv6/ip6_output.c | 2 ++
> >  4 files changed, 8 insertions(+), 2 deletions(-)
>
> This looks ok to me. From the IPsec point of view, I'm
> fine with this patch, but it also touches generic
> networking code. So I'd like to hear an opinion of one
> of the networking maintainers before proceeding.

I have not seen a Fixes: tag.

Do we need to split this patch into two parts?
Re: [PATCH net] xfrm: esp: avoid in-place decrypt on shared skb frags
Posted by Steffen Klassert 3 hours ago
On Mon, May 04, 2026 at 12:56:50AM -0700, Eric Dumazet wrote:
> On Mon, May 4, 2026 at 12:53 AM Steffen Klassert
> <steffen.klassert@secunet.com> wrote:
> >
> > We have another patch that addresses this issue in a different way,
> > so Cc the author of the other patch.
> >
> > On Mon, May 04, 2026 at 03:34:03PM +0800, HexRabbit wrote:
> > > From: Kuan-Ting Chen <h3xrabbit@gmail.com>
> > >
> > > MSG_SPLICE_PAGES can attach pages from a pipe directly to an skb. TCP
> > > marks such skbs with SKBFL_SHARED_FRAG after skb_splice_from_iter(),
> > > so later paths that may modify packet data can first make a private
> > > copy. The IPv4/IPv6 datagram append paths did not set this flag when
> > > splicing pages into UDP skbs.
> > >
> > > That leaves an ESP-in-UDP packet made from shared pipe pages looking
> > > like an ordinary uncloned nonlinear skb. ESP input then takes the no-COW
> > > fast path for uncloned skbs without a frag_list and decrypts in place
> > > over data that is not owned privately by the skb.
> > >
> > > Mark IPv4/IPv6 datagram splice frags with SKBFL_SHARED_FRAG, matching
> > > TCP. Also make ESP input fall back to skb_cow_data() when the flag is
> > > present, so ESP does not decrypt externally backed frags in place.
> > > Private nonlinear skb frags still use the existing fast path.
> > >
> > > This intentionally does not change ESP output. In esp_output_head(),
> > > the path that appends the ESP trailer to existing skb tailroom without
> > > calling skb_cow_data() is not reachable for nonlinear skbs:
> > > skb_tailroom() returns zero when skb->data_len is nonzero, while ESP
> > > tailen is positive. Thus ESP output will either use the separate
> > > destination-frag path or fall back to skb_cow_data().
> > >
> > > Signed-off-by: Kuan-Ting Chen <h3xrabbit@gmail.com>
> > > ---
> > >  net/ipv4/esp4.c       | 3 ++-
> > >  net/ipv4/ip_output.c  | 2 ++
> > >  net/ipv6/esp6.c       | 3 ++-
> > >  net/ipv6/ip6_output.c | 2 ++
> > >  4 files changed, 8 insertions(+), 2 deletions(-)
> >
> > This looks ok to me. From the IPsec point of view, I'm
> > fine with this patch, but it also touches generic
> > networking code. So I'd like to hear an opinion of one
> > of the networking maintainers before proceeding.
> 
> I have not seen a Fixes: tag.

Right, we need a v2 with a Fixes tag, and maybe also
'Cc: stable@vger.kernel.org'

> Do we need to split this patch into two parts?

I don't think we need to split it, we can merge it
either to the net or the ipsec tree. Both should
be OK.
Re: [PATCH net] xfrm: esp: avoid in-place decrypt on shared skb frags
Posted by Hyunwoo Kim 3 hours ago
On Mon, May 04, 2026 at 10:06:18AM +0200, Steffen Klassert wrote:
> On Mon, May 04, 2026 at 12:56:50AM -0700, Eric Dumazet wrote:
> > On Mon, May 4, 2026 at 12:53 AM Steffen Klassert
> > <steffen.klassert@secunet.com> wrote:
> > >
> > > We have another patch that addresses this issue in a different way,
> > > so Cc the author of the other patch.
> > >
> > > On Mon, May 04, 2026 at 03:34:03PM +0800, HexRabbit wrote:
> > > > From: Kuan-Ting Chen <h3xrabbit@gmail.com>
> > > >
> > > > MSG_SPLICE_PAGES can attach pages from a pipe directly to an skb. TCP
> > > > marks such skbs with SKBFL_SHARED_FRAG after skb_splice_from_iter(),
> > > > so later paths that may modify packet data can first make a private
> > > > copy. The IPv4/IPv6 datagram append paths did not set this flag when
> > > > splicing pages into UDP skbs.
> > > >
> > > > That leaves an ESP-in-UDP packet made from shared pipe pages looking
> > > > like an ordinary uncloned nonlinear skb. ESP input then takes the no-COW
> > > > fast path for uncloned skbs without a frag_list and decrypts in place
> > > > over data that is not owned privately by the skb.
> > > >
> > > > Mark IPv4/IPv6 datagram splice frags with SKBFL_SHARED_FRAG, matching
> > > > TCP. Also make ESP input fall back to skb_cow_data() when the flag is
> > > > present, so ESP does not decrypt externally backed frags in place.
> > > > Private nonlinear skb frags still use the existing fast path.
> > > >
> > > > This intentionally does not change ESP output. In esp_output_head(),
> > > > the path that appends the ESP trailer to existing skb tailroom without
> > > > calling skb_cow_data() is not reachable for nonlinear skbs:
> > > > skb_tailroom() returns zero when skb->data_len is nonzero, while ESP
> > > > tailen is positive. Thus ESP output will either use the separate
> > > > destination-frag path or fall back to skb_cow_data().
> > > >
> > > > Signed-off-by: Kuan-Ting Chen <h3xrabbit@gmail.com>
> > > > ---
> > > >  net/ipv4/esp4.c       | 3 ++-
> > > >  net/ipv4/ip_output.c  | 2 ++
> > > >  net/ipv6/esp6.c       | 3 ++-
> > > >  net/ipv6/ip6_output.c | 2 ++
> > > >  4 files changed, 8 insertions(+), 2 deletions(-)
> > >
> > > This looks ok to me. From the IPsec point of view, I'm
> > > fine with this patch, but it also touches generic
> > > networking code. So I'd like to hear an opinion of one
> > > of the networking maintainers before proceeding.
> > 
> > I have not seen a Fixes: tag.
> 
> Right, we need a v2 with a Fixes tag, and maybe also
> 'Cc: stable@vger.kernel.org'
> 
> > Do we need to split this patch into two parts?
> 
> I don't think we need to split it, we can merge it
> either to the net or the ipsec tree. Both should
> be OK.

Also, please add:
Reported-by: Hyunwoo Kim <imv4bel@gmail.com>


Best regards,
Hyunwoo Kim