Skip to content

Questionable behavior with links + proxy headers #872

@alukach

Description

@alukach

Working on the STAC Auth Proxy, we've noticed some strange behavior with links generated by FastAPI when the application is deployed behind a proxy. Specifically, we were getting unexpected origins for the links.

Background

An example:

Image a containerized environment where the STAC API is served at http://stac.internal:8000 and we front the application with a reverse proxy (e.g. Traefik) at https://example.com. A request comes in to https://example.com/collections:

GET /collections
Headers:
  Host: example.com

The reverse proxy will take this request and forward it to the STAC API with proxy headers:

GET /collections
Headers:
  Host: stac.internal:8000
  Forwarded: proto=https;host=example.com

As written today, the code in stac_fastapi.api.middleware:ProxyHeaderMiddleware._get_forwarded_url_parts() will fail to retrieve the port_str from the Forwarded header. In such case, it will use the port that it got from the host header.

This means that links will be generated for https://example.com:8000 (ie the wrong port)!

header_host_parts = header_host.split(":")
if len(header_host_parts) == 2:
domain, port = header_host_parts
else:
domain = header_host_parts[0]
port = None
port_str = None # make sure it is defined in all paths since we access it later
if forwarded := self._get_header_value_by_name(scope, "forwarded"):
for proxy in forwarded.split(","):
if proto_expr := _PROTO_HEADER_REGEX.search(proxy):
proto = proto_expr.group("proto")
if host_expr := _HOST_HEADER_REGEX.search(proxy):
domain = host_expr.group("host")
port_str = host_expr.group("port") # None if not present in the match
else:
domain = self._get_header_value_by_name(scope, "x-forwarded-host", domain)
proto = self._get_header_value_by_name(scope, "x-forwarded-proto", proto)
port_str = self._get_header_value_by_name(scope, "x-forwarded-port", port)
with contextlib.suppress(ValueError): # ignore ports that are not valid integers
port = int(port_str) if port_str is not None else port

We worked around our immediate issues via developmentseed/stac-auth-proxy#115 which ensures that we always specify the port in the Forwarded header, however I do think it's worth considering altering the stac-fastapi behavior.

Recommended Actions

Infer Ports based on Protocol

If either Forwarded or X-Forwarded-* indicate the protocol but not the port of the proxy, we should utilize the default ports for that protocol (ie port = 443 if proto == 'https' else 80) rather than deferring to the Host header.

Prefer proxy headers first, then server, then host

If an incoming request has proxy headers, we should make full use of those for generating links (which we basically do already, aside from the ports issue I described above).

But if a request doesn't contain proxy headers, I believe that the server should take precedence over the host header.

Change Function Signature

We don't actually abide by our function signature, as we actually return Tuple[str, str, int]:

def _get_forwarded_url_parts(self, scope: Scope) -> Tuple[str, str, str]:

Review Tests

I believe the following test cases highlight our incorrect logic. We should correct and expand our test cases and then reactively update stac_fastapi.api.middleware:ProxyHeaderMiddleware._get_forwarded_url_parts() to meet these requirements.

Test Case 0

Important

I'm not certain of this one. It's unclear to me how server is generated in actual practice

If the protocol is https and no port is defined, we should assume port 443 (ie, this test should expect ("https", "testserver", 443)):

(
{"scheme": "https", "server": ["testserver", 80], "headers": []},
("https", "testserver", 80),
),

Test Case 2

Anyways, I think this should should ("http", "testserver", 80):

(
{
"scheme": "http",
"server": ["testserver", 80],
"headers": [(b"host", b"testserver")],
},
("http", "testserver", None),
),

Test Case 4

I think that if a request comes in with a bad port, we should infer based on the proto setting of the forwarded header (ie, this test should expect ("https", "test", 443)):

(
{
"scheme": "http",
"server": ["testserver", 80],
"headers": [(b"forwarded", b"proto=https;host=test:not-an-integer")],
},
("https", "test", 80),
),

Test Case 6

Again, if we're getting a proto proxy header, we should infer the port (ie, this test should expect ("https", "testserver", 443)):

(
{
"scheme": "http",
"server": ["testserver", 80],
"headers": [(b"x-forwarded-proto", b"https")],
},
("https", "testserver", 80),
),

Test Case 12

Again, infer port based on proto (ie this test should expect ("https", "testserver", 443)):

(
{
"scheme": "http",
"server": ["testserver", 80],
"headers": [
(
b"forwarded",
# proto is set, but no host
b'for="85.193.181.55";proto=https,for="85.193.181.55";proto=https',
)
],
},
("https", "testserver", 80),
),

(new) Test Case 13

Should use default port for https (443) based on forwarded proto, not the original host port (8080)

        (
            {
                "scheme": "http",
                "server": ["testserver", 8080],
                "headers": [
                    (b"host", b"example.com:8080"),
                    (
                        b"forwarded",
                        # Forwarded header with proto and host but no port
                        # Should use default port for https (443), not the original host port (8080)
                        b"proto=https;host=myproxy.com",
                    ),
                ],
            },
            ("https", "myproxy.com", 443),
        ),

(new) Test Case 14

Should use default port for http (80) based on forwarded proto, not the original host port (8080)

        (
            {
                "scheme": "http",
                "server": ["testserver", 8080],
                "headers": [
                    (b"host", b"example.com:8080"),
                    (
                        b"forwarded",
                        # Forwarded header with proto and host but no port
                        # Should use default port for http (80), not the original host port (8080)
                        b"proto=http;host=myproxy.com",
                    ),
                ],
            },
            ("http", "myproxy.com", 80),
        ),

(new) Test Case 15

Should use default port for https (443) based on x-forwarded-proto, not the original host port (8080)

        (
            {
                "scheme": "http",
                "server": ["testserver", 8080],
                "headers": [
                    (b"host", b"example.com:8080"),
                    (b"x-forwarded-proto", b"https"),
                    (b"x-forwarded-host", b"myproxy.com"),
                    # No x-forwarded-port header
                    # Should use default port for https (443), not the original host port (8080)
                ],
            },
            ("https", "myproxy.com", 443),
        ),

(new) Test Case 16

Should use default port for http (80) based on x-forwarded-proto, not the original host port (8080)

        (
            {
                "scheme": "http",
                "server": ["testserver", 8080],
                "headers": [
                    (b"host", b"example.com:8080"),
                    (b"x-forwarded-proto", b"http"),
                    (b"x-forwarded-host", b"myproxy.com"),
                    # No x-forwarded-port header
                    # Should use default port for http (80), not the original host port (8080)
                ],
            },
            ("http", "myproxy.com", 80),
        ),

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions