crypto/Kconfig | 12 + crypto/Makefile | 1 + crypto/crypto_pool.c | 323 +++ include/crypto/pool.h | 33 + include/linux/sockptr.h | 23 + include/linux/tcp.h | 24 + include/net/dropreason.h | 25 + include/net/seg6_hmac.h | 7 - include/net/tcp.h | 193 +- include/net/tcp_ao.h | 283 +++ include/uapi/linux/snmp.h | 5 + include/uapi/linux/tcp.h | 62 + net/ipv4/Kconfig | 15 +- net/ipv4/Makefile | 1 + net/ipv4/proc.c | 5 + net/ipv4/tcp.c | 191 +- net/ipv4/tcp_ao.c | 1939 +++++++++++++++++ net/ipv4/tcp_input.c | 94 +- net/ipv4/tcp_ipv4.c | 385 +++- net/ipv4/tcp_minisocks.c | 37 +- net/ipv4/tcp_output.c | 188 +- net/ipv6/Kconfig | 2 +- net/ipv6/Makefile | 1 + net/ipv6/seg6.c | 3 - net/ipv6/seg6_hmac.c | 204 +- net/ipv6/tcp_ao.c | 151 ++ net/ipv6/tcp_ipv6.c | 327 ++- tools/testing/selftests/Makefile | 1 + tools/testing/selftests/net/tcp_ao/.gitignore | 2 + tools/testing/selftests/net/tcp_ao/Makefile | 50 + .../selftests/net/tcp_ao/bench-lookups.c | 403 ++++ .../selftests/net/tcp_ao/connect-deny.c | 217 ++ tools/testing/selftests/net/tcp_ao/connect.c | 81 + .../selftests/net/tcp_ao/icmps-accept.c | 1 + .../selftests/net/tcp_ao/icmps-discard.c | 447 ++++ .../testing/selftests/net/tcp_ao/lib/aolib.h | 333 +++ .../selftests/net/tcp_ao/lib/netlink.c | 341 +++ tools/testing/selftests/net/tcp_ao/lib/proc.c | 267 +++ .../testing/selftests/net/tcp_ao/lib/setup.c | 297 +++ tools/testing/selftests/net/tcp_ao/lib/sock.c | 294 +++ .../testing/selftests/net/tcp_ao/lib/utils.c | 30 + .../selftests/net/tcp_ao/setsockopt-closed.c | 191 ++ .../selftests/net/tcp_ao/unsigned-md5.c | 483 ++++ 43 files changed, 7516 insertions(+), 456 deletions(-) create mode 100644 crypto/crypto_pool.c create mode 100644 include/crypto/pool.h create mode 100644 include/net/tcp_ao.h create mode 100644 net/ipv4/tcp_ao.c create mode 100644 net/ipv6/tcp_ao.c create mode 100644 tools/testing/selftests/net/tcp_ao/.gitignore create mode 100644 tools/testing/selftests/net/tcp_ao/Makefile create mode 100644 tools/testing/selftests/net/tcp_ao/bench-lookups.c create mode 100644 tools/testing/selftests/net/tcp_ao/connect-deny.c create mode 100644 tools/testing/selftests/net/tcp_ao/connect.c create mode 120000 tools/testing/selftests/net/tcp_ao/icmps-accept.c create mode 100644 tools/testing/selftests/net/tcp_ao/icmps-discard.c create mode 100644 tools/testing/selftests/net/tcp_ao/lib/aolib.h create mode 100644 tools/testing/selftests/net/tcp_ao/lib/netlink.c create mode 100644 tools/testing/selftests/net/tcp_ao/lib/proc.c create mode 100644 tools/testing/selftests/net/tcp_ao/lib/setup.c create mode 100644 tools/testing/selftests/net/tcp_ao/lib/sock.c create mode 100644 tools/testing/selftests/net/tcp_ao/lib/utils.c create mode 100644 tools/testing/selftests/net/tcp_ao/setsockopt-closed.c create mode 100644 tools/testing/selftests/net/tcp_ao/unsigned-md5.c
This patchset implements the TCP-AO option as described in RFC5925. There is a request from industry to move away from TCP-MD5SIG and it seems the time is right to have a TCP-AO upstreamed. This TCP option is meant to replace the TCP MD5 option and address its shortcomings. Specifically, it provides more secure hashing, key rotation and support for long-lived connections (see the summary of TCP-AO advantages over TCP-MD5 in (1.3) of RFC5925). The patch series starts with six patches that are not specific to TCP-AO but implement a general crypto facility that we thought is useful to eliminate code duplication between TCP-MD5SIG and TCP-AO as well as other crypto users. These six patches are being submitted separately in a different patchset [1]. Including them here will show better the gain in code sharing. Next are 18 patches that implement the actual TCP-AO option, followed by patches implementing selftests. The patch set was written as a collaboration of three authors (in alphabetical order): Dmitry Safonov, Francesco Ruggeri and Salam Noureddine. Additional credits should be given to Prasad Koya, who was involved in early prototyping a few years back. There is also a separate submission done by Leonard Crestez whom we thank for his efforts getting an implementation of RFC5925 submitted for review upstream [2]. This is an independent implementation that makes different design decisions. For example, we chose a similar design to the TCP-MD5SIG implementation and used setsockopt()s to program per-socket keys, avoiding the extra complexity of managing a centralized key database in the kernel. A centralized database in the kernel has dubious benefits since it doesn’t eliminate per-socket setsockopts needed to specify which sockets need TCP-AO and what are the currently preferred keys. It also complicates traffic key caching and preventing deletion of in-use keys. In this implementation, a centralized database of keys can be thought of as living in user space and user applications would have to program those keys on matching sockets. On the server side, the user application programs keys (MKTS in TCP-AO nomenclature) on the listening socket for all peers that are expected to connect. Prefix matching on the peer address is supported. When a peer issues a successful connect, all the MKTs matching the IP address of the peer are copied to the newly created socket. On the active side, when a connect() is issued all MKTs that do not match the peer are deleted from the socket since they will never match the peer. This implementation uses three setsockopt()s for adding, deleting and modifying keys on a socket. All three setsockopt()s have extensive sanity checks that prevent inconsistencies in the keys on a given socket. A getsockopt() is provided to get key information from any given socket. Few things to note about this implementation: - Traffic keys are cached for established connections avoiding the cost of such calculation for each packet received or sent. - Great care has been taken to avoid deleting in-use MKTs as required by the RFC. - Any crypto algorithm supported by the Linux kernel can be used to calculate packet hashes. - Fastopen works with TCP-AO but hasn’t been tested extensively. - Tested for interop with other major networking vendors (on linux-4.19), including testing for key rotation and long lived connections. There are a couple of limitations that we’re aware of, including (but not limited to) the following: - setsockopt(TCP_REPAIR) not supported yet - IPv4-mapped-IPv6 addresses not tested - static key not implemented yet - CONFIG_TCP_AO depends on CONFIG_TCP_MD5SIG - A small window for a race condition exists between accept and key adding/deletion on a listening socket but can be easily overcome by using the getsockopt() to make sure the right keys are there on a newly accepted connection - Key matching by TCP port numbers, peer ranges, asterisks is unsupported as it’s unlikely to be useful [1]: https://lore.kernel.org/all/20220726201600.1715505-1-dima@arista.com/ [2]: https://lore.kernel.org/all/cover.1658815925.git.cdleonard@gmail.com/ Cc: Andy Lutomirski <luto@amacapital.net> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Bob Gilligan <gilligan@arista.com> Cc: David Ahern <dsahern@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Francesco Ruggeri <fruggeri@arista.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Ivan Delalande <colona@arista.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Leonard Crestez <cdleonard@gmail.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Salam Noureddine <noureddine@arista.com> Cc: Shuah Khan <shuah@kernel.org> Cc: netdev@vger.kernel.org Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Dmitry Safonov (31): crypto: Introduce crypto_pool crypto_pool: Add crypto_pool_reserve_scratch() net/tcp: Separate tcp_md5sig_info allocation into tcp_md5sig_info_add() net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destruction net/tcp: Use crypto_pool for TCP-MD5 net/ipv6: sr: Switch to using crypto_pool tcp: Add TCP-AO config and structures net/tcp: Introduce TCP_AO setsockopt()s net/tcp: Prevent TCP-MD5 with TCP-AO being set net/tcp: Calculate TCP-AO traffic keys net/tcp: Add TCP-AO sign to outgoing packets net/tcp: Add tcp_parse_auth_options() net/tcp: Add AO sign to RST packets net/tcp: Add TCP-AO sign to twsk net/tcp: Wire TCP-AO to request sockets net/tcp: Sign SYN-ACK segments with TCP-AO net/tcp: Verify inbound TCP-AO signed segments net/tcp: Add TCP-AO segments counters net/tcp: Add TCP-AO SNE support net/tcp: Add tcp_hash_fail() ratelimited logs net/tcp: Ignore specific ICMPs for TCP-AO connections net/tcp: Add option for TCP-AO to (not) hash header net/tcp: Add getsockopt(TCP_AO_GET) net/tcp: Allow asynchronous delete for TCP-AO keys (MKTs) selftests/net: Add TCP-AO library selftests/net: Verify that TCP-AO complies with ignoring ICMPs selftest/net: Add TCP-AO ICMPs accept test selftest/tcp-ao: Add a test for MKT matching selftest/tcp-ao: Add test for TCP-AO add setsockopt() command selftests/tcp-ao: Add TCP-AO + TCP-MD5 + no sign listen socket tests selftests/aolib: Add test/benchmark for removing MKTs crypto/Kconfig | 12 + crypto/Makefile | 1 + crypto/crypto_pool.c | 323 +++ include/crypto/pool.h | 33 + include/linux/sockptr.h | 23 + include/linux/tcp.h | 24 + include/net/dropreason.h | 25 + include/net/seg6_hmac.h | 7 - include/net/tcp.h | 193 +- include/net/tcp_ao.h | 283 +++ include/uapi/linux/snmp.h | 5 + include/uapi/linux/tcp.h | 62 + net/ipv4/Kconfig | 15 +- net/ipv4/Makefile | 1 + net/ipv4/proc.c | 5 + net/ipv4/tcp.c | 191 +- net/ipv4/tcp_ao.c | 1939 +++++++++++++++++ net/ipv4/tcp_input.c | 94 +- net/ipv4/tcp_ipv4.c | 385 +++- net/ipv4/tcp_minisocks.c | 37 +- net/ipv4/tcp_output.c | 188 +- net/ipv6/Kconfig | 2 +- net/ipv6/Makefile | 1 + net/ipv6/seg6.c | 3 - net/ipv6/seg6_hmac.c | 204 +- net/ipv6/tcp_ao.c | 151 ++ net/ipv6/tcp_ipv6.c | 327 ++- tools/testing/selftests/Makefile | 1 + tools/testing/selftests/net/tcp_ao/.gitignore | 2 + tools/testing/selftests/net/tcp_ao/Makefile | 50 + .../selftests/net/tcp_ao/bench-lookups.c | 403 ++++ .../selftests/net/tcp_ao/connect-deny.c | 217 ++ tools/testing/selftests/net/tcp_ao/connect.c | 81 + .../selftests/net/tcp_ao/icmps-accept.c | 1 + .../selftests/net/tcp_ao/icmps-discard.c | 447 ++++ .../testing/selftests/net/tcp_ao/lib/aolib.h | 333 +++ .../selftests/net/tcp_ao/lib/netlink.c | 341 +++ tools/testing/selftests/net/tcp_ao/lib/proc.c | 267 +++ .../testing/selftests/net/tcp_ao/lib/setup.c | 297 +++ tools/testing/selftests/net/tcp_ao/lib/sock.c | 294 +++ .../testing/selftests/net/tcp_ao/lib/utils.c | 30 + .../selftests/net/tcp_ao/setsockopt-closed.c | 191 ++ .../selftests/net/tcp_ao/unsigned-md5.c | 483 ++++ 43 files changed, 7516 insertions(+), 456 deletions(-) create mode 100644 crypto/crypto_pool.c create mode 100644 include/crypto/pool.h create mode 100644 include/net/tcp_ao.h create mode 100644 net/ipv4/tcp_ao.c create mode 100644 net/ipv6/tcp_ao.c create mode 100644 tools/testing/selftests/net/tcp_ao/.gitignore create mode 100644 tools/testing/selftests/net/tcp_ao/Makefile create mode 100644 tools/testing/selftests/net/tcp_ao/bench-lookups.c create mode 100644 tools/testing/selftests/net/tcp_ao/connect-deny.c create mode 100644 tools/testing/selftests/net/tcp_ao/connect.c create mode 120000 tools/testing/selftests/net/tcp_ao/icmps-accept.c create mode 100644 tools/testing/selftests/net/tcp_ao/icmps-discard.c create mode 100644 tools/testing/selftests/net/tcp_ao/lib/aolib.h create mode 100644 tools/testing/selftests/net/tcp_ao/lib/netlink.c create mode 100644 tools/testing/selftests/net/tcp_ao/lib/proc.c create mode 100644 tools/testing/selftests/net/tcp_ao/lib/setup.c create mode 100644 tools/testing/selftests/net/tcp_ao/lib/sock.c create mode 100644 tools/testing/selftests/net/tcp_ao/lib/utils.c create mode 100644 tools/testing/selftests/net/tcp_ao/setsockopt-closed.c create mode 100644 tools/testing/selftests/net/tcp_ao/unsigned-md5.c base-commit: e34cfee65ec891a319ce79797dda18083af33a76 -- 2.37.2
On 8/18/22 19:59, Dmitry Safonov wrote: > This patchset implements the TCP-AO option as described in RFC5925. There > is a request from industry to move away from TCP-MD5SIG and it seems the time > is right to have a TCP-AO upstreamed. This TCP option is meant to replace > the TCP MD5 option and address its shortcomings. Specifically, it provides > more secure hashing, key rotation and support for long-lived connections > (see the summary of TCP-AO advantages over TCP-MD5 in (1.3) of RFC5925). > The patch series starts with six patches that are not specific to TCP-AO > but implement a general crypto facility that we thought is useful > to eliminate code duplication between TCP-MD5SIG and TCP-AO as well as other > crypto users. These six patches are being submitted separately in > a different patchset [1]. Including them here will show better the gain > in code sharing. Next are 18 patches that implement the actual TCP-AO option, > followed by patches implementing selftests. > > The patch set was written as a collaboration of three authors (in alphabetical > order): Dmitry Safonov, Francesco Ruggeri and Salam Noureddine. Additional > credits should be given to Prasad Koya, who was involved in early prototyping > a few years back. There is also a separate submission done by Leonard Crestez > whom we thank for his efforts getting an implementation of RFC5925 submitted > for review upstream [2]. This is an independent implementation that makes > different design decisions. Is this based on something that Arista has had running for a while now or is a recent new development? > For example, we chose a similar design to the TCP-MD5SIG implementation and > used setsockopt()s to program per-socket keys, avoiding the extra complexity > of managing a centralized key database in the kernel. A centralized database > in the kernel has dubious benefits since it doesn’t eliminate per-socket > setsockopts needed to specify which sockets need TCP-AO and what are the > currently preferred keys. It also complicates traffic key caching and > preventing deletion of in-use keys. My implementation started with per-socket lists but switched to a global list because this way is much easier to manage from userspace. In practice userspace apps will want to ensure that all sockets use the same set of keys anyway. > In this implementation, a centralized database of keys can be thought of > as living in user space and user applications would have to program those > keys on matching sockets. On the server side, the user application programs > keys (MKTS in TCP-AO nomenclature) on the listening socket for all peers that > are expected to connect. Prefix matching on the peer address is supported. > When a peer issues a successful connect, all the MKTs matching the IP address > of the peer are copied to the newly created socket. On the active side, > when a connect() is issued all MKTs that do not match the peer are deleted > from the socket since they will never match the peer. This implementation > uses three setsockopt()s for adding, deleting and modifying keys on a socket. > All three setsockopt()s have extensive sanity checks that prevent > inconsistencies in the keys on a given socket. A getsockopt() is provided > to get key information from any given socket. My series doesn't try to prevent inconsistencies inside the key lists because it's not clear that the kernel should prevent userspace from shooting itself in the foot. Worst case is connection failure on misconfiguration which seems fine. The RFC doesn't specify in detail how key management is to be performed, for example if two valid keys are available it doesn't mention which one should be used. Some guidance is found in RFC8177 but again not very much. I implemented an ABI that can be used by userspace for RFC8177-style key management and asked for feedback but received very little. If you had come with a clear ABI proposal I would have tried to implement it. Here's a link to our older discussion: https://lore.kernel.org/netdev/e7f0449a-2bad-99ad-4737-016a0e6b8b84@gmail.com/ Seeing an entirely distinct unrelated implementation is very unexpected. What made you do this? -- Regards, Leonard
On Sun, Aug 21, 2022 at 1:34 PM Leonard Crestez <cdleonard@gmail.com> wrote: > > On 8/18/22 19:59, Dmitry Safonov wrote: > > This patchset implements the TCP-AO option as described in RFC5925. There > > is a request from industry to move away from TCP-MD5SIG and it seems the time > > is right to have a TCP-AO upstreamed. This TCP option is meant to replace > > the TCP MD5 option and address its shortcomings. ... > > > > The patch set was written as a collaboration of three authors (in alphabetical > > order): Dmitry Safonov, Francesco Ruggeri and Salam Noureddine. Additional > > credits should be given to Prasad Koya, who was involved in early prototyping > > a few years back. There is also a separate submission done by Leonard Crestez > > whom we thank for his efforts getting an implementation of RFC5925 submitted > > for review upstream [2]. This is an independent implementation that makes > > different design decisions. > > Is this based on something that Arista has had running for a while now > or is a recent new development? > This is based on prototype code we had worked on internally three years ago. The implementation effort was restarted to get it over the finish line. For business reasons we had to have our own implementation ready and not tied to unmerged upstream code. Please also note that our implementation is based on linux-4.19 and was only ported forward to mainline recently. So it wasn’t ready to be submitted upstream. > > For example, we chose a similar design to the TCP-MD5SIG implementation and > > used setsockopt()s to program per-socket keys, avoiding the extra complexity > > of managing a centralized key database in the kernel. A centralized database > > in the kernel has dubious benefits since it doesn’t eliminate per-socket > > setsockopts needed to specify which sockets need TCP-AO and what are the > > currently preferred keys. It also complicates traffic key caching and > > preventing deletion of in-use keys. > > My implementation started with per-socket lists but switched to a global > list because this way is much easier to manage from userspace. In > practice userspace apps will want to ensure that all sockets use the > same set of keys anyway. > We did consider a global list early on but we didn’t find it beneficial. We still believe that per-socket lists reduce complexity of the implementation, are more scalable and ensure predictable behavior. Our expectation is that TCP-AO will be only useful for a limited set of routing applications, rather than used transparently like IPSEC for non-routing apps. We would be happy to discuss this in more detail. > > In this implementation, a centralized database of keys can be thought of > > as living in user space and user applications would have to program those > > keys on matching sockets. On the server side, the user application programs > > keys (MKTS in TCP-AO nomenclature) on the listening socket for all peers that > > are expected to connect. Prefix matching on the peer address is supported. ... > > My series doesn't try to prevent inconsistencies inside the key lists > because it's not clear that the kernel should prevent userspace from > shooting itself in the foot. Worst case is connection failure on > misconfiguration which seems fine. > > The RFC doesn't specify in detail how key management is to be performed, > for example if two valid keys are available it doesn't mention which one > should be used. Some guidance is found in RFC8177 but again not very much. > > I implemented an ABI that can be used by userspace for RFC8177-style key > management and asked for feedback but received very little. If you had > come with a clear ABI proposal I would have tried to implement it. > > Here's a link to our older discussion: > > https://lore.kernel.org/netdev/e7f0449a-2bad-99ad-4737-016a0e6b8b84@gmail.com/ > > Seeing an entirely distinct unrelated implementation is very unexpected. > What made you do this? > > -- > Regards, > Leonard Our goal was not to have a competing TCP-AO upstream submission but to implement the RFC for our customers to use. Had there been an already upstreamed implementation we would have used it and implemented customer requirements on top of it. Just like we do with all other kernel features. This is not a bad situation, we believe it is good for the upstream community to have two fully functioning implementations to consider. Possibly a third collaborative submission might emerge that takes the best of both. A year ago, there wasn’t much available online about TCP-AO besides the RFC. We are excited with the current interest in TCP-AO and hope to see it upstreamed soon. Best, Salam
On 8/21/22 2:34 PM, Leonard Crestez wrote: > On 8/18/22 19:59, Dmitry Safonov wrote: >> This patchset implements the TCP-AO option as described in RFC5925. There >> is a request from industry to move away from TCP-MD5SIG and it seems >> the time >> is right to have a TCP-AO upstreamed. This TCP option is meant to replace >> the TCP MD5 option and address its shortcomings. Specifically, it >> provides >> more secure hashing, key rotation and support for long-lived connections >> (see the summary of TCP-AO advantages over TCP-MD5 in (1.3) of RFC5925). >> The patch series starts with six patches that are not specific to TCP-AO >> but implement a general crypto facility that we thought is useful >> to eliminate code duplication between TCP-MD5SIG and TCP-AO as well as >> other >> crypto users. These six patches are being submitted separately in >> a different patchset [1]. Including them here will show better the gain >> in code sharing. Next are 18 patches that implement the actual TCP-AO >> option, >> followed by patches implementing selftests. >> >> The patch set was written as a collaboration of three authors (in >> alphabetical >> order): Dmitry Safonov, Francesco Ruggeri and Salam Noureddine. >> Additional >> credits should be given to Prasad Koya, who was involved in early >> prototyping >> a few years back. There is also a separate submission done by Leonard >> Crestez >> whom we thank for his efforts getting an implementation of RFC5925 >> submitted >> for review upstream [2]. This is an independent implementation that makes >> different design decisions. > > Is this based on something that Arista has had running for a while now > or is a recent new development? > ... > Seeing an entirely distinct unrelated implementation is very unexpected. > What made you do this? > I am curious as well. You are well aware of Leonard's efforts which go back a long time, why go off and do a separate implementation?
Hi Leonard, David,
On 8/22/22 00:51, David Ahern wrote:
> On 8/21/22 2:34 PM, Leonard Crestez wrote:
>> On 8/18/22 19:59, Dmitry Safonov wrote:
>>> This patchset implements the TCP-AO option as described in RFC5925. There
>>> is a request from industry to move away from TCP-MD5SIG and it seems
>>> the time
>>> is right to have a TCP-AO upstreamed. This TCP option is meant to replace
>>> the TCP MD5 option and address its shortcomings. Specifically, it
>>> provides
>>> more secure hashing, key rotation and support for long-lived connections
>>> (see the summary of TCP-AO advantages over TCP-MD5 in (1.3) of RFC5925).
>>> The patch series starts with six patches that are not specific to TCP-AO
>>> but implement a general crypto facility that we thought is useful
>>> to eliminate code duplication between TCP-MD5SIG and TCP-AO as well as
>>> other
>>> crypto users. These six patches are being submitted separately in
>>> a different patchset [1]. Including them here will show better the gain
>>> in code sharing. Next are 18 patches that implement the actual TCP-AO
>>> option,
>>> followed by patches implementing selftests.
>>>
>>> The patch set was written as a collaboration of three authors (in
>>> alphabetical
>>> order): Dmitry Safonov, Francesco Ruggeri and Salam Noureddine.
>>> Additional
>>> credits should be given to Prasad Koya, who was involved in early
>>> prototyping
>>> a few years back. There is also a separate submission done by Leonard
>>> Crestez
>>> whom we thank for his efforts getting an implementation of RFC5925
>>> submitted
>>> for review upstream [2]. This is an independent implementation that makes
>>> different design decisions.
>>
>> Is this based on something that Arista has had running for a while now
>> or is a recent new development?
>>
>
> ...
>
>> Seeing an entirely distinct unrelated implementation is very unexpected.
>> What made you do this?
>>
>
> I am curious as well. You are well aware of Leonard's efforts which go
> back a long time, why go off and do a separate implementation?
When I started working on this, there was a prototype that was neither
good for upstream, nor for customers. At the moment Leonard submitted
his RFC, I was already giving feedback/reviews to local code and
patches. So, as I was aware of the details of TCP-AO, I started giving
Leonard feedback and reviews, based on what I've learned from RFC/code.
I thought whatever code will make it upstream, it can benefit from my
reviews. Some of my comments were based on a better code I saw locally,
or a way of improving it that I've suggested to both sides.
I'm quite happy that Leonard addressed some of my comments (i.e.
extendable syscalls), I see that it improved his patches.
On the other hand, some of the comments I've left moved to "known
limitations" with no foreseeable plan to fix them, while they were
addressed in local/Arista code.
And now a little bit more than a year later, it seems that the quality
of local patches has reached a point where they can be submitted for
an upstream review. So, please don't misunderstand me, it's not that
"drop your implementation, take ours" and it's not that we've
intentionally hidden that we're working on TCP-AO. It's that it is the
first moment we can make upstream aware of an alternative implementation.
Personally, I think it's best for opensource community:
- Arista's implementation is now public
- there are now at least 4 people that understand RFC5925 and the
code/details
- in a discussion, it will be possible to find what will be the best
from both implementations for Linux and come up with better code
At this particular moment, it seems neither of patch sets is ready to be
merged "as-is". But it seems that there's enough interest from both
sides and likely it guarantees that there will be enough effort to make
something merge-able, that will work for all interested parties.
As for my part, I'm interested in the best code upstream, regardless who
is the author. This includes:
- reusing the existing TCP-MD5 code, rather than copying'n'pasting for
TCP-AO with intent to refactor it some day later
- making setsockopt()s and other syscalls extendable
- cover functionality with selftests
- following RFC5925 in implementation, especially "required" and "must"
parts
I hope that clarifies how and why now there are two patch sets that
implement the same RFC/functionality.
Thanks,
Dmitry
On 8/22/22 23:35, Dmitry Safonov wrote:
> Hi Leonard, David,
>
> On 8/22/22 00:51, David Ahern wrote:
>> On 8/21/22 2:34 PM, Leonard Crestez wrote:
>>> On 8/18/22 19:59, Dmitry Safonov wrote:
>>>> This patchset implements the TCP-AO option as described in RFC5925. There
>>>> is a request from industry to move away from TCP-MD5SIG and it seems
>>>> the time
>>>> is right to have a TCP-AO upstreamed. This TCP option is meant to replace
>>>> the TCP MD5 option and address its shortcomings. Specifically, it
>>>> provides
>>>> more secure hashing, key rotation and support for long-lived connections
>>>> (see the summary of TCP-AO advantages over TCP-MD5 in (1.3) of RFC5925).
>>>> The patch series starts with six patches that are not specific to TCP-AO
>>>> but implement a general crypto facility that we thought is useful
>>>> to eliminate code duplication between TCP-MD5SIG and TCP-AO as well as
>>>> other
>>>> crypto users. These six patches are being submitted separately in
>>>> a different patchset [1]. Including them here will show better the gain
>>>> in code sharing. Next are 18 patches that implement the actual TCP-AO
>>>> option,
>>>> followed by patches implementing selftests.
>>>>
>>>> The patch set was written as a collaboration of three authors (in
>>>> alphabetical
>>>> order): Dmitry Safonov, Francesco Ruggeri and Salam Noureddine.
>>>> Additional
>>>> credits should be given to Prasad Koya, who was involved in early
>>>> prototyping
>>>> a few years back. There is also a separate submission done by Leonard
>>>> Crestez
>>>> whom we thank for his efforts getting an implementation of RFC5925
>>>> submitted
>>>> for review upstream [2]. This is an independent implementation that makes
>>>> different design decisions.
>>>
>>> Is this based on something that Arista has had running for a while now
>>> or is a recent new development?
>>>
>>
>> ...
>>
>>> Seeing an entirely distinct unrelated implementation is very unexpected.
>>> What made you do this?
>>>
>>
>> I am curious as well. You are well aware of Leonard's efforts which go
>> back a long time, why go off and do a separate implementation?
>
> When I started working on this, there was a prototype that was neither
> good for upstream, nor for customers. At the moment Leonard submitted
> his RFC, I was already giving feedback/reviews to local code and
> patches. So, as I was aware of the details of TCP-AO, I started giving
> Leonard feedback and reviews, based on what I've learned from RFC/code.
> I thought whatever code will make it upstream, it can benefit from my
> reviews. Some of my comments were based on a better code I saw locally,
> or a way of improving it that I've suggested to both sides.
>
> I'm quite happy that Leonard addressed some of my comments (i.e.
> extendable syscalls), I see that it improved his patches.
> On the other hand, some of the comments I've left moved to "known
> limitations" with no foreseeable plan to fix them, while they were
> addressed in local/Arista code.
>
> And now a little bit more than a year later, it seems that the quality
> of local patches has reached a point where they can be submitted for
> an upstream review. So, please don't misunderstand me, it's not that
> "drop your implementation, take ours" and it's not that we've
> intentionally hidden that we're working on TCP-AO. It's that it is the
> first moment we can make upstream aware of an alternative implementation.
>
> Personally, I think it's best for opensource community:
> - Arista's implementation is now public
> - there are now at least 4 people that understand RFC5925 and the
> code/details
> - in a discussion, it will be possible to find what will be the best
> from both implementations for Linux and come up with better code
>
> At this particular moment, it seems neither of patch sets is ready to be
> merged "as-is". But it seems that there's enough interest from both
> sides and likely it guarantees that there will be enough effort to make
> something merge-able, that will work for all interested parties.
>
> As for my part, I'm interested in the best code upstream, regardless who
> is the author. This includes:
> - reusing the existing TCP-MD5 code, rather than copying'n'pasting for
> TCP-AO with intent to refactor it some day later
I had a requirement to deploy on linux 5.4 so I very deliberately
avoided touching MD5. I'm not sure there very much duplication anyway.
> - making setsockopt()s and other syscalls extendable
> - cover functionality with selftests
My implementation is tested with a standalone python package, this is a
design choice which doesn't particularly matter.
> - following RFC5925 in implementation, especially "required" and "must"
> parts
I'm not convinced that "don't delete current key" needs to be literally
interpreted as a hard requirement for the linux ABI. Most TCP RFCs don't
specify any sort of API at all and it would be entirely valid to
implement BGP-TCP-AO as a single executable with no internally
documented boundaries.
> I hope that clarifies how and why now there are two patch sets that
> implement the same RFC/functionality.
As far as I can tell the biggest problem is that is quite difficult to
implement the userspace side of TCP-AO complete with key rollover. Our
two implementation both claim to support this but through different ABI
and both require active management from userspace.
I think it would make sense to push key validity times and the key
selection policy entirely in the kernel so that it can handle key
rotation/expiration by itself. This way userspace only has to configure
the keys and doesn't have to touch established connections at all.
My series has a "flags" field on the key struct where it can filter by
IP, prefix, ifindex and so on. It would be possible to add additional
flags for making the key only valid between certain times (by wall time).
The kernel could then make key selections itself:
- send rnextkeyid based on the key with the longest recv lifetime
- send keyid based on remote rnextkeyid.
- If not applicable (rnextkeyid not found locally, or for SYN) pick
based on longest send lifetime.
- If all keys expire then return an error on write()
- Solve other ambiguities in a predictable way: if valid times are
equal then pick the lowest numeric send_id or recv_id.
Explicit key selection from userspace could still be supported but it
would be optional and most apps wouldn't bother implementing their own
policy. The biggest advantage is that it would be much easier for
applications to adopt TCP-AO.
--
Regards,
Leonard
> I think it would make sense to push key validity times and the key selection
> policy entirely in the kernel so that it can handle key rotation/expiration
> by itself. This way userspace only has to configure the keys and doesn't
> have to touch established connections at all.
I know nothing aobut TCP-AO, nor much about kTLS. But doesn't kTLS
have the same issue? Is there anything which can be learnt from kTLS?
Maybe the same mechanisms can be used? No point inventing something
new if you can copy/refactor working code?
> My series has a "flags" field on the key struct where it can filter by IP,
> prefix, ifindex and so on. It would be possible to add additional flags for
> making the key only valid between certain times (by wall time).
What out for wall clock time, it jumps around in funny ways. Plus the
kernel has no idea what time zone the wall the wall clock is mounted
on is in.
Andrew
On 8/24/22 15:46, Andrew Lunn wrote: >> I think it would make sense to push key validity times and the key selection >> policy entirely in the kernel so that it can handle key rotation/expiration >> by itself. This way userspace only has to configure the keys and doesn't >> have to touch established connections at all. > > I know nothing aobut TCP-AO, nor much about kTLS. But doesn't kTLS > have the same issue? Is there anything which can be learnt from kTLS? > Maybe the same mechanisms can be used? No point inventing something > new if you can copy/refactor working code? > >> My series has a "flags" field on the key struct where it can filter by IP, >> prefix, ifindex and so on. It would be possible to add additional flags for >> making the key only valid between certain times (by wall time). > > What out for wall clock time, it jumps around in funny ways. Plus the > kernel has no idea what time zone the wall the wall clock is mounted > on is in. A close equivalent seems to exist in ipsec in the "xfrm_lifetime_cfg" struct, specifically the soft/hard expires timers. These are optional validity times for each xfrm_state which is equivalent to a "key". I'm not familiar with how those are used but ipsec usually relies on complex userspace daemons for managing xfrm states and policies and those daemons should be capable of adding and removing keys based on internal timers. Still, the linux kernel supports checking for key validity on it's own. -- Regards, Leonard
On Wed, 24 Aug 2022 14:46:32 +0200 Andrew Lunn wrote: > > I think it would make sense to push key validity times and the key selection > > policy entirely in the kernel so that it can handle key rotation/expiration > > by itself. This way userspace only has to configure the keys and doesn't > > have to touch established connections at all. > > I know nothing aobut TCP-AO, nor much about kTLS. But doesn't kTLS > have the same issue? Is there anything which can be learnt from kTLS? > Maybe the same mechanisms can be used? No point inventing something > new if you can copy/refactor working code? kTLS does not support key rotation FWIW. It's extremely rare. > > My series has a "flags" field on the key struct where it can filter by IP, > > prefix, ifindex and so on. It would be possible to add additional flags for > > making the key only valid between certain times (by wall time). > > What out for wall clock time, it jumps around in funny ways. Plus the > kernel has no idea what time zone the wall the wall clock is mounted > on is in. I'd do all of this over netlink, next-gen crypto on sockets is *the* reason I started the YAML work. Sadly my crypto config via netlink does not exist yet and is probably 2mo out ;(
On 8/23/22 16:30, Leonard Crestez wrote:
> On 8/22/22 23:35, Dmitry Safonov wrote:
>> Hi Leonard, David,
[..]
>> At this particular moment, it seems neither of patch sets is ready to be
>> merged "as-is". But it seems that there's enough interest from both
>> sides and likely it guarantees that there will be enough effort to make
>> something merge-able, that will work for all interested parties.
>>
>> As for my part, I'm interested in the best code upstream, regardless who
>> is the author. This includes:
>> - reusing the existing TCP-MD5 code, rather than copying'n'pasting for
>> TCP-AO with intent to refactor it some day later
>
> I had a requirement to deploy on linux 5.4 so I very deliberately
> avoided touching MD5. I'm not sure there very much duplication anyway.
Yeah, I know what you mean: we deployed it on v4.19. But for the code
upstream I personally prefer to see "reusing" rather than copying.
Lesser code is easier to maintain in future.
Upstream submissions in my view should be based on "what would be easier
to maintain in future", rather than on "what would be easier to backport
to my maintenance release".
>> - making setsockopt()s and other syscalls extendable
>> - cover functionality with selftests
>
> My implementation is tested with a standalone python package, this is a
> design choice which doesn't particularly matter.
>
>> - following RFC5925 in implementation, especially "required" and "must"
>> parts
>
> I'm not convinced that "don't delete current key" needs to be literally
> interpreted as a hard requirement for the linux ABI. Most TCP RFCs don't
> specify any sort of API at all and it would be entirely valid to
> implement BGP-TCP-AO as a single executable with no internally
> documented boundaries.
I agree that RFC requirements and "musts" can be implemented in
userspace, rather than in kernel. On the other hand, my opinion is that
if you have "must"/"must not"/"required" in RFC and it's not hard to
limit those in kernel, than you _should_ do it.
In this point of view, debugging "hey, setsockopt() for key removal
returned -EBUSY, what's going on?" is better than "hey, tcp connection
died on my side and I didn't have tcp dump running, what was that?".
>> I hope that clarifies how and why now there are two patch sets that
>> implement the same RFC/functionality.
>
> As far as I can tell the biggest problem is that is quite difficult to
> implement the userspace side of TCP-AO complete with key rollover. Our
> two implementation both claim to support this but through different ABI
> and both require active management from userspace.
>
> I think it would make sense to push key validity times and the key
> selection policy entirely in the kernel so that it can handle key
> rotation/expiration by itself. This way userspace only has to configure
> the keys and doesn't have to touch established connections at all.
Respectfully I disagree here. I think all such policies should be
implemented in userspace. The kernel has to have as lesser as possible,
but enough to provide a way to sign, verify, log messages on TCP segments.
All the logic that may change, all business decisions and key management
should be implemented in userspace, keeping the kernel part as easier in
"KISS" sense as possible.
> My series has a "flags" field on the key struct where it can filter by
> IP, prefix, ifindex and so on. It would be possible to add additional
> flags for making the key only valid between certain times (by wall time).
>
> The kernel could then make key selections itself:
> - send rnextkeyid based on the key with the longest recv lifetime
> - send keyid based on remote rnextkeyid.
> - If not applicable (rnextkeyid not found locally, or for SYN) pick
> based on longest send lifetime.
> - If all keys expire then return an error on write()
> - Solve other ambiguities in a predictable way: if valid times are
> equal then pick the lowest numeric send_id or recv_id.
>
> Explicit key selection from userspace could still be supported but it
> would be optional and most apps wouldn't bother implementing their own
> policy. The biggest advantage is that it would be much easier for
> applications to adopt TCP-AO.
Personally, I would think that all you mentioned better stay in
userspace app. The kernel should do as minimal and as much predictable
as possible job here, without 10 possible outcomes. If you want to share
the logic of key rotation/expiration, all timers and synchronization
between different BGP applications, that would be a proper job for a
shared library.
Thanks,
Dmitry
© 2016 - 2026 Red Hat, Inc.