-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

NEFARIOUSPLAN-CANONICAL-V1
{"body_md":"On May 5, 2026, commit `f4c50a4034e62ab75f1d5cdd191dd5f9c77fdff4` landed in the upstream Linux kernel. Author Kuan-Ting Chen, committer Steffen Klassert. Title: `xfrm: esp: avoid in-place decrypt on shared skb frags`. The first paragraph of the message is the diagnosis:\n\n> MSG_SPLICE_PAGES can attach pages from a pipe directly to an skb. TCP marks such skbs with SKBFL_SHARED_FRAG after skb_splice_from_iter(), so later paths that may modify packet data can first make a private copy. The IPv4/IPv6 datagram append paths did not set this flag when splicing pages into UDP skbs.\n>\n> That leaves an ESP-in-UDP packet made from shared pipe pages looking like an ordinary uncloned nonlinear skb. ESP input then takes the no-COW fast path for uncloned skbs without a frag_list and decrypts in place over data that is not owned privately by the skb.\n\nThe kernel had the trust boundary already. It had the metadata, it had the convention, and it had two receiver paths checking the metadata. The transport that wrote the convention down (TCP) set the flag. The transport that ESP runs over (UDP) did not. CVE-2026-43284 is the gap between those two facts. CVE-2026-43500 is the same gap in RxRPC, where the receiver gates on `skb_cloned(skb)`, and a paged-frag skb is not cloned.\n\n## What the receiver does, and why the fast path exists\n\n`esp_input()` decrypts ESP-encapsulated payloads. AEAD decryption needs a scatter-gather list for the ciphertext; the list is built directly from the skb's frags by `skb_to_sgvec()`. `aead_request_set_crypt(req, sg, sg, ...)` is then invoked with src and dst pointing at the same SGL, and the function decrypts the payload in place.\n\nIn place is correct when the kernel owns the pages. `skb_cow_data()` exists to enforce that, by copying any externally backed frags into kernel-private memory before crypto runs. Until 2017 every ESP receive path called it. Then commits `cac2661c53f3 (\"esp4: Avoid skb_cow_data whenever possible\")` and `03e2a30f6a27 (\"esp6: Avoid skb_cow_data whenever possible\")` introduced a fast path. The post-patch source today reads:\n\n```c\nstatic int esp_input(struct xfrm_state *x, struct sk_buff *skb)\n{\n    [...]\n    if (!skb_cloned(skb)) {\n        if (!skb_is_nonlinear(skb)) {\n            nfrags = 1;\n            goto skip_cow;\n        } else if (!skb_has_frag_list(skb) &&\n                   !skb_has_shared_frag(skb)) {\n            nfrags = skb_shinfo(skb)->nr_frags;\n            nfrags++;\n            goto skip_cow;\n        }\n    }\n    err = skb_cow_data(skb, 0, &trailer);\n```\n\nThe fast path triggers when an skb is uncloned and either fully linear or nonlinear without a `frag_list`. The `!skb_has_shared_frag(skb)` clause on the second branch is the patch. Pre-patch, the only condition was `!skb_has_frag_list(skb)`, and an attacker-controlled UDP packet whose frags were backed by foreign pages took `goto skip_cow` and decrypted in place over those foreign pages.\n\n## How the page gets planted\n\nV4bel's PoC at `V4bel/dirtyfrag` builds the malicious skb in three syscalls per write:\n\n```c\nuint8_t hdr[24];\n*(uint32_t *)(hdr + 0) = htonl(spi);\n*(uint32_t *)(hdr + 4) = htonl(SEQ_VAL);\nmemset(hdr + 8, 0xCC, 16);\n\nvmsplice(pfd[1], &(struct iovec){hdr, 24}, 1, 0);\nsplice(file_fd, &off, pfd[1], NULL, 16, SPLICE_F_MOVE);\nsplice(pfd[0], NULL, sk_send, NULL, 24 + 16, SPLICE_F_MOVE);\n```\n\n`file_fd` is a read-only handle on `/usr/bin/su`. The first `splice` moves a sixteen-byte chunk of su's page cache into a pipe. The second sends a UDP datagram whose linear head is the twenty-four-byte fake ESP header and whose first frag is su's page-cache page. The receiver is loopback. The path is `udp_rcv` to `xfrm4_udp_encap_rcv` to `xfrm_input` to `esp_input`. `esp_input` takes `skip_cow`. `crypto_aead_decrypt` runs over an SGL whose page is the in-memory image of `/usr/bin/su`. The AEAD template is `authencesn`, which performs a four-byte relocation of authenticated associated data into the ciphertext region as part of legitimate scratch. Those four bytes land in the page-cache page of su.\n\nThe exploit repeats for each four-byte window of an embedded shellcode. When done, `/usr/bin/su` looks unchanged on disk; its in-memory image is a setuid-root shellcode. The next process to exec it inherits root.\n\n## RxRPC has the same shape\n\nThe second CVE in the chain, CVE-2026-43500, lives in `rxkad_verify_packet_1()`:\n\n```c\nstatic int rxkad_verify_packet_1(struct rxrpc_call *call, struct sk_buff *skb,\n                                 rxrpc_seq_t seq, struct skcipher_request *req)\n{\n    sg_init_table(sg, ARRAY_SIZE(sg));\n    ret = skb_to_sgvec(skb, sg, sp->offset, 8);\n    memset(&iv, 0, sizeof(iv));\n    skcipher_request_set_crypt(req, sg, sg, 8, iv.x);\n    ret = crypto_skcipher_decrypt(req);\n```\n\nSame `skb_to_sgvec`. Same `sg, sg` in-place setup. Same in-place crypto. Eight bytes per write instead of four, because rxkad's packet check decrypts an eight-byte block.\n\nV4bel's submitted patch (Message-ID `<afKV2zGR6rrelPC7@v4bel>`, sent to netdev on 2026-04-30, not yet merged at time of writing) is two two-line changes, one in `rxrpc_input_call_event()` and one in `rxrpc_verify_response()`:\n\n```diff\n-\t\t    skb_cloned(skb)) {\n+\t\t    (skb_cloned(skb) || skb->data_len)) {\n \t\t\t/* Unshare the packet so that it can be\n \t\t\t * modified by in-place decryption.\n \t\t\t */\n```\n\nThe Fixes: tag is `d0d5c0cd1e71 (\"rxrpc: Use skb_unshare() rather than skb_cow_data()\")`. RxRPC ran the same optimization that ESP ran. It replaced `skb_cow_data()` with `skb_unshare()`, gated on `skb_cloned(skb)`. An uncloned skb carrying paged frags fell through.\n\n## Five Fixes: tags, none of them locally wrong\n\nThe ESP commit cites four:\n\n```\nFixes: cac2661c53f3 (\"esp4: Avoid skb_cow_data whenever possible\")\nFixes: 03e2a30f6a27 (\"esp6: Avoid skb_cow_data whenever possible\")\nFixes: 7da0dde68486 (\"ip, udp: Support MSG_SPLICE_PAGES\")\nFixes: 6d8192bd69bb (\"ip6, udp6: Support MSG_SPLICE_PAGES\")\n```\n\nThe RxRPC patch cites a fifth, the same shape as the first two: `Fixes: d0d5c0cd1e71 (\"rxrpc: Use skb_unshare() rather than skb_cow_data()\")`.\n\nThree of the five are commits whose subject lines describe removing the defensive copy from a receive path. Two of the five are commits that introduced MSG_SPLICE_PAGES support on UDP, the primitive that lets userspace inject pipe pages into a UDP skb. None of the five was wrong locally. The optimization commits were correct: most receive paths do not need to copy. The MSG_SPLICE_PAGES commits were correct on TCP, where the splice path already set `SKBFL_SHARED_FRAG` and TCP's receivers checked it. UDP's receivers either did not look (ESP's fast path), or looked at the wrong field (RxRPC's `skb_cloned`), and UDP itself did not propagate the flag.\n\nThe bug is the intersection. Each commit landed in a subsystem whose maintainers were responsible for that subsystem and whose reviewers understood that subsystem. None of them was reading every receive path that runs in-place crypto.\n\n## Two CVEs because the kernel ships two transports without the flag\n\nThe Dirty Frag chain is two CVEs because no single bug crosses every distro. From V4bel's writeup:\n\n| Environment | ESP variant | RxRPC variant |\n|---|---|---|\n| user namespaces allowed, esp4.ko present | works | not needed |\n| user namespaces blocked, rxrpc.ko present | does not reach SA install | works |\n| user namespaces blocked, no rxrpc.ko | blocked | not loaded |\n\nThe ESP variant needs unprivileged user namespaces to install an XFRM SA in the attacker's network namespace. Ubuntu disables unprivileged user namespaces by default; the ESP variant does not reach root there. RHEL allows unprivileged user namespaces but does not ship `rxrpc.ko`; the RxRPC variant does not load. A single bug roots one of those two distros and not the other. The chain is two bugs because no one bug crosses the major distros.\n\nV4bel's PoC `main()` runs the ESP variant first and falls back to RxRPC on failure. The fallback is part of the disclosure: the paper is one paper, the kernel patches it as two CVEs, the operator ships it as one binary.\n\n## Determinism, and why this class travels\n\nFrom the family-tree section of V4bel's writeup:\n\n```\nDirty Pipe (CVE-2022-0847, 2022)\n  splice() + pipe_buffer  ->  arbitrary page-cache write\nCopy Fail (CVE-2026-31431, 2026)\n  splice() + AF_ALG       ->  4-byte page-cache write\nDirty Frag (2026)\n  splice() + skb->frag    ->  4-byte (ESP) / 8-byte (RxRPC) page-cache write\n```\n\nNone of these is a race. None requires a timing window. None requires a kernel-version-specific offset database. The Copy Fail demo on `copy.fail` showed the same exploit binary rooting four different distributions in one take. Dirty Frag inherits the property: the same binary works against any kernel that loads `esp4` (with userns) or `rxrpc`, with no per-distribution tuning. For a class of kernel LPE that historically required race-window control, offset databases, and feedback loops, a deterministic primitive is the design upgrade. Patching individual receivers does not return the class to the prior economics.\n\n## SKBFL_SHARED_FRAG is the contract becoming the flag\n\nThe class is [borrowed-pages-as-scratch](/patterns/borrowed-pages-as-scratch): a subsystem performs scratch writes into a destination buffer under an internal contract that it owns the memory; another subsystem supplies that buffer with foreign-owned pages. The contract is documentation; the legitimate scratch becomes a write primitive across a trust boundary nobody guards. The first public instance the catalog named was [CVE-2026-31431, in AF_ALG's SGL](/posts/linux-copyfail-cve-2026-31431-the-bug-is-not-in-authencesn). Both Dirty Frag CVEs are the same shape against `skb->frag`.\n\nWhat is new in mainline is that the convention is now metadata. TCP was already setting `SKBFL_SHARED_FRAG` on splice-sourced frags. The patch sets it on UDP append. The patch checks it in `esp_input`'s `skip_cow` branch. RxRPC's submitted patch widens the unshare gate from `skb_cloned(skb)` to `(skb_cloned(skb) || skb->data_len)` so the cloned-only check stops missing paged-frag skbs.\n\n`SKBFL_SHARED_FRAG` is what the kernel calls the trust boundary now that there are two CVEs to make the boundary a flag instead of a convention.\n\nThe kernel had the metadata. One transport was setting it. The next splice-shaped CVE is the next receive path that runs in-place crypto on a frag the kernel does not own.\n\nPoC: [V4bel/dirtyfrag](https://github.com/V4bel/dirtyfrag) and [theori-io/copy-fail-CVE-2026-31431](https://github.com/theori-io/copy-fail-CVE-2026-31431)","closing_line":"The kernel had the metadata. One transport was setting it. The next splice-shaped CVE is the next receive path that runs in-place crypto on a frag the kernel does not own.","hook_md":"On May 5, 2026, commit `f4c50a4034e62ab75f1d5cdd191dd5f9c77fdff4` landed in the upstream Linux kernel. Author Kuan-Ting Chen, committer Steffen Klassert. Title: `xfrm: esp: avoid in-place decrypt on shared skb frags`. The first paragraph of the message is the diagnosis:\n\n> MSG_SPLICE_PAGES can attach pages from a pipe directly to an skb. TCP marks such skbs with SKBFL_SHARED_FRAG after skb_splice_from_iter(), so later paths that may modify packet data can first make a private copy. The IPv4/IPv6 datagram append paths did not set this flag when splicing pages into UDP skbs.\n>\n> That leaves an ESP-in-UDP packet made from shared pipe pages looking like an ordinary uncloned nonlinear skb. ESP input then takes the no-COW fast path for uncloned skbs without a frag_list and decrypts in place over data that is not owned privately by the skb.\n\nThe kernel had the trust boundary already. It had the metadata, it had the convention, and it had two receiver paths checking the metadata. The transport that wrote the convention down (TCP) set the flag. The transport that ESP runs over (UDP) did not. CVE-2026-43284 is the gap between those two facts. CVE-2026-43500 is the same gap in RxRPC, where the receiver gates on `skb_cloned(skb)`, and a paged-frag skb is not cloned.","post_id":192,"slug":"dirty-frag-tcp-set-the-flag-udp-did-not","title":"CVE-2026-43284 + CVE-2026-43500: The Flag Was on TCP. UDP Did Not Set It.","type":"initial","unreadable_sentence":"SKBFL_SHARED_FRAG is what the kernel calls the trust boundary now that there are two CVEs to make the boundary a flag instead of a convention."}
-----BEGIN PGP SIGNATURE-----

iHUEARYIAB0WIQRf0htP5+SjynlxywneZjl4jgkQJgUCaf9DsQAKCRDeZjl4jgkQ
JgfYAQCQW44yROHHnT3iueZqu/kevT+OQvnW/VDtrZornHE0VwD/RRmv5T+V1gA0
JBp0YaeDgMmEGUPd0sJyTncaPI81GQI=
=hiQq
-----END PGP SIGNATURE-----