-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 NEFARIOUSPLAN-CANONICAL-V1 {"body_md":"## The mock is more vulnerable than the real code\n\nThe PoC's `mock_airflow.py` builds its `OneLogin_Saml2_Auth` object from settings whose ACS URL is derived directly from the request:\n\n```python\ndef _init_saml_auth(req):\n    request_data = _prepare_request_VULNERABLE(req)\n    settings = {\n        \"sp\": {\n            \"entityId\": \"aws-auth-manager-saml-client\",\n            \"assertionConsumerService\": {\n                \"url\": f\"http://{request_data['http_host']}:{request_data['server_port']}/login_callback\",\n                \"binding\": \"urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST\",\n            },\n        },\n    }\n```\n\n`request_data['http_host']` is produced by `_prepare_request_VULNERABLE`, which reads `req.headers.get(\"Host\", req.host)`. In the mock, spoofing `Host` changes the ACS URL the `AuthnRequest` carries to the IdP. The IdP honors the `AuthnRequest`, sees an ACS URL pointing at `attacker.com`, and sends the eventual SAML response to `attacker.com`. The attack flow the README draws out (attacker spoofs `Host`, IdP redirects the SAML response to the attacker, attacker replays) is a faithful description of what the mock does.\n\nReal Airflow does not look like this. 
Here is `_init_saml_auth` from `providers/amazon/src/airflow/providers/amazon/aws/auth_manager/routes/login.py` in the pre-patch tree (commit `8edb3130`, parent of the fix):\n\n```python\ndef _init_saml_auth(request: Request) -> OneLogin_Saml2_Auth:\n    request_data = _prepare_request(request)\n    base_url = conf.get(section=\"api\", key=\"base_url\")\n    settings = {\n        \"debug\": True,\n        \"sp\": {\n            \"entityId\": \"aws-auth-manager-saml-client\",\n            \"assertionConsumerService\": {\n                \"url\": f\"{base_url.rstrip('/')}{AUTH_MANAGER_FASTAPI_APP_PREFIX}/login_callback\",\n                \"binding\": \"urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST\",\n            },\n        },\n    }\n    merged_settings = OneLogin_Saml2_IdPMetadataParser.merge_settings(_get_idp_data(), settings)\n    return OneLogin_Saml2_Auth(request_data, merged_settings)\n```\n\nThe ACS URL in settings is effectively hardcoded: it comes from `conf.get(\"api\", \"base_url\")`, the administrator's configuration. A spoofed `Host` header does not reach it. The `AuthnRequest` that leaves the Airflow server carries the configured base URL as its ACS URL. The IdP redirects the authenticated SAML response to that URL. Not to the attacker.\n\nAnyone following the PoC's attack-flow diagram against a real Airflow instance will see the IdP redirect their browser to the victim's real base URL, land on the victim's real `/login_callback`, and authenticate as themselves. Nothing spoofed. Nothing captured.\n\n## The Host header went somewhere else\n\n`_init_saml_auth` and `_prepare_request` both take the same `request` object. They derive the server's address from two different places. `_init_saml_auth` uses `conf.get(\"api\", \"base_url\")` for the ACS URL in settings. 
`_prepare_request` builds a parallel dict called `request_data`, and pre-patch, it did this:\n\n```python\ndef _prepare_request(request: Request) -> dict:\n    host = request.headers.get(\"host\", request.client.host if request.client else \"localhost\")\n    data: dict[str, Any] = {\n        \"https\": \"on\" if request.url.scheme == \"https\" else \"off\",\n        \"http_host\": host,\n        \"server_port\": request.url.port,\n        \"script_name\": request.url.path,\n        \"get_data\": request.query_params,\n        \"post_data\": {},\n    }\n```\n\n`request_data['http_host']` comes from the `Host` header, with a fallback to the client IP. This dict is passed as the first argument to `OneLogin_Saml2_Auth(request_data, merged_settings)`, alongside the settings object that carries the hardcoded ACS URL.\n\nTwo fields. Same call. Same concept: what is this server's hostname? One hardcoded to configuration. One derived from whatever the client sent. The only way to notice the inconsistency is to read both functions in the same sitting. The PR reviewers on the original AWS Auth Manager merge did not. Neither did any reviewer of any subsequent change to this file, for about a year.\n\n## What http_host actually drives\n\npython3-saml uses `request_data['http_host']` to compute the server's own URL. The relevant utilities are `OneLogin_Saml2_Utils.get_self_url_host(request_data)`, which prefixes `http_host` with the scheme, and `get_self_url_no_query(request_data)`, which appends `script_name` to that result to produce a URL string.\n\nCall sites for `get_self_url_no_query` include `OneLogin_Saml2_Response.is_valid()`, the function that validates incoming SAML responses. 
The relevant check:\n\n```python\ncurrent_url = OneLogin_Saml2_Utils.get_self_url_no_query(request_data)\nif not OneLogin_Saml2_Utils.normalize_url(url=destination).startswith(\n    OneLogin_Saml2_Utils.normalize_url(url=current_url)\n):\n    raise OneLogin_Saml2_ValidationError(\n        \"The response was received at %s instead of %s\" % (current_url, destination),\n        OneLogin_Saml2_ValidationError.WRONG_DESTINATION,\n    )\n```\n\n`destination` is the `Destination` attribute of the SAML response, set by the IdP and covered by the IdP's signature. It names the ACS URL the response was intended for. SAML specifies `Destination` precisely to prevent a response issued for one server from being accepted by another.\n\n`current_url` is the URL the library believes it is serving. `current_url` is built from `http_host`. `http_host` is the `Host` header.\n\nThe anti-replay control is a comparison between a signed field and a request-controlled field. This is the shape that the Host Header As Self pattern names: the server treats the `Host` header as authoritative for \"what URL am I,\" and any signed claim about \"intended for this URL\" collapses to \"intended for whatever URL the caller nominated.\"\n\n## The attack is cross-instance replay\n\nThe CVE description, read literally, is specific about what the bug enables:\n\n> the origin of the SAML authentication has been used as provided by the client and not verified against the actual instance URL. This allowed to gain access to different instances with potentially different access controls by reusing SAML response from other instances.\n\nReplay across instances. The PoC README labels this \"Scenario B\" and gives it one paragraph. \"Scenario A\" is phishing via spoofed `Host` at `/login` to capture a SAMLResponse at `attacker.com`. Scenario A does not work against real Airflow. 
Scenario B does.\n\nConsider a company running Airflow in two places: `airflow-dev.corp` and `airflow-prod.corp`, both federated to the same AWS IAM Identity Center, both running the AWS Auth Manager, each with separate RBAC. A user with legitimate low-privilege access to `airflow-dev` completes a SAML login. The SAML response their browser receives is signed by the IdP, with `Destination=\"https://airflow-dev.corp/auth/amazon-aws/login_callback\"`.\n\nThe attacker, who is the user, captures that response out of their own browser and POSTs it to `airflow-prod.corp/auth/amazon-aws/login_callback` with `Host: airflow-dev.corp`.\n\nOn prod, `_prepare_request` reads `Host: airflow-dev.corp`. `request_data['http_host']` becomes `airflow-dev.corp`. python3-saml computes `current_url` as `https://airflow-dev.corp/auth/amazon-aws/login_callback`. It compares that to the signed `Destination` on the response, `https://airflow-dev.corp/auth/amazon-aws/login_callback`. The `startswith` check passes. The IdP's signature is valid because the IdP did sign it. The response processes. The user's identity, verified by the IdP for airflow-dev, is admitted into airflow-prod.\n\nWhatever RBAC prod assigns to that user based on the IdP claims is now accessible. The CVE description calls this \"potentially different access controls.\" In an environment where dev is permissive and prod is restrictive for the same IdP groups, that delta is the attack. 
The CVSS vector's `PR:L` is load-bearing: the attacker needs a legitimate account somewhere, just not on the instance they end up inside.\n\n## The patch\n\nThe fix is four lines:\n\n```diff\n def _prepare_request(request: Request) -> dict:\n-    host = request.headers.get(\"host\", request.client.host if request.client else \"localhost\")\n+    parsed = urlparse(conf.get(\"api\", \"base_url\", fallback=\"http://localhost\"))\n+    host = parsed.hostname\n+\n     data: dict[str, Any] = {\n         \"https\": \"on\" if request.url.scheme == \"https\" else \"off\",\n         \"http_host\": host,\n```\n\n`http_host` now comes from `conf.get(\"api\", \"base_url\")`, parsed. The field agrees with the hardcoded ACS URL that `_init_saml_auth` has been producing since the module shipped. Both represent the same concept and now read from the same source.\n\nVincent Beck of the Apache Airflow maintainer pool wrote the fix and titled the PR \"Fix `host` in AWS auth manager.\" Sungwuk Jung, credited as the reporter, found it. MITRE assigned CWE-346, Origin Validation Error. The CVSS vector `AV:N/AC:L/PR:L/UI:N/S:U/C:L/I:L/A:N` scores 5.4. Versions 8.0.0 through 9.21.x of `apache-airflow-providers-amazon` ship the original behavior; 9.22.0 is the first version that does not.\n\n## Two fields, one concept, two sources\n\n`_init_saml_auth` and `_prepare_request` both describe \"the server's own identity\" to the same `OneLogin_Saml2_Auth` constructor call. They were written with different assumptions from the start. The settings branch assumed the server knows its own configured base URL. The `request_data` branch assumed the server should derive its URL from the inbound request. These assumptions contradict each other. The contradiction is invisible until you read both functions at the same time, which nobody did. The contradiction is the bug. 
The patch is one function adopting the assumption the other function had already committed to.\n\nThe PoC shipped a mock that collapsed this inconsistency into a stronger, simpler bug: it built the ACS URL from `http_host` directly, so a spoofed `Host` produced a visible, dramatic outcome in a single request. That is a teaching artifact about SAML host-header injection in general. It is not the bug as it existed in Apache Airflow. The bug that existed in Apache Airflow required reading `_init_saml_auth` and `_prepare_request` together and tracing `http_host` through the python3-saml library to the `Destination` validation. Most coverage of CVE-2026-25604 will reproduce the PoC's Scenario A diagram and call it done.\n\nPoC: [John-Jung/CVE-2026-25604-PoC](https://github.com/John-Jung/CVE-2026-25604-PoC)\n","closing_line":"The mock was a stronger version of the real bug. The real bug shipped for a year because it required reading two functions at once.","hook_md":"The PoC for CVE-2026-25604 runs a Flask mock. The mock builds the SAML `AssertionConsumerService` URL from the `Host` request header and shows an `AuthnRequest` that instructs AWS IAM Identity Center to redirect the SAML response to `attacker.com:9090`. That is not what Apache Airflow does. The real `_init_saml_auth` hardcodes the ACS URL to `conf.get(\"api\", \"base_url\")`. The IdP redirects the SAML response to the configured base URL, not to the attacker. The PoC demonstrates the wrong endpoint. The real vulnerability lives on `/login_callback`, and the PoC never touches it.","post_id":42,"slug":"airflow-aws-saml-poc-tests-wrong-endpoint","title":"CVE-2026-25604: The PoC Tests /login. The Bug Is on /login_callback.","type":"initial","unreadable_sentence":"The PoC demonstrates the wrong endpoint. 
The real vulnerability lives on the other one."} -----BEGIN PGP SIGNATURE----- iHUEARYIAB0WIQRf0htP5+SjynlxywneZjl4jgkQJgUCagIN0AAKCRDeZjl4jgkQ Jj3mAP9iRaFhl1+nbwL51j47JlnS1ki4Z3Se0R5lMkwaZb0TvQEA2KfQWRVDhbNM EPp7HudyZoHIbhFR6+v7ylgRyPd+XwY= =2G72 -----END PGP SIGNATURE-----