Introduction
In complex architectures involving multiple chained proxies, certain configurations can expose subtle race conditions. One common symptom is HTTP 503 errors caused by a mismatch in TCP keep-alive expectations between Envoy and its upstream services.
Scenario
Consider a chain of proxies:
Client → Envoy (Proxy A) → Envoy (Proxy B) → Upstream Service
If Proxy A's upstream idle timeout is longer than the idle timeout Proxy B applies to the connections it accepts, a subtle race condition can occur:
- Proxy A pulls a connection from its pool, believing it is active.
- Proxy B has closed this connection due to an idle timeout.
- Proxy A sends a request; the TCP connection resets.
- Proxy A reports a 503 to the client, even though the upstream might have been healthy if a new connection had been established immediately.
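A minimal sketch of such a mismatch, using hypothetical names and values: Proxy B closes idle downstream connections after 60 seconds via its HTTP connection manager, while Proxy A never sets an upstream idle timeout, so its pooled connections fall back to Envoy's one-hour default and routinely outlive their peer:

# Proxy B: fragment of its HttpConnectionManager configuration.
# Downstream connections idle for 60s are closed by Proxy B.
common_http_protocol_options:
  idle_timeout: 60s

# Proxy A: fragment of the cluster that points at Proxy B. No upstream
# idle_timeout is set, so the 1-hour default applies and a pooled
# connection can sit in Proxy A's pool long after Proxy B has closed it.
clusters:
- name: proxy_b_cluster                  # hypothetical cluster name
  connect_timeout: 1s
  type: STRICT_DNS
  load_assignment:
    cluster_name: proxy_b_cluster
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address: { address: proxy-b.internal, port_value: 443 }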
Mitigation Strategies
To reduce the risk of this 503 race condition:
- Align Keep-Alive Timeouts: Ensure all chained proxies have compatible idle timeout and max-requests-per-connection settings. In particular, Envoy's upstream HTTP idle_timeout should be shorter than the idle timeout of the proxy it connects to, so Proxy A abandons a connection before Proxy B can close it (see the first sketch after this list).
- Use TCP Health Checks: Enable active health checks so Envoy detects unreachable upstream hosts before routing new requests to them (second sketch).
- Disable Aggressive Connection Reuse: In cases where upstream stability is uncertain, reducing connection reuse or lowering idle timeouts can help.
- Enable Retry Policies: Configure Envoy retries for idempotent requests to transparently recover from transient 503s caused by stale connections (third sketch).
- Monitor TCP Resets: Observability into connection resets between proxies can help identify when mismatched keep-alives are the root cause. In Envoy, the per-cluster counter upstream_cx_destroy_remote_with_active_rq (connections closed by the remote peer while a request was in flight) is a useful signal.
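A sketch of the timeout alignment on Proxy A, using the typed_extension_protocol_options form of cluster HTTP options; the 50s and 1000 values are hypothetical and simply need to stay below whatever limits the next hop enforces:

clusters:
- name: proxy_b_cluster
  connect_timeout: 1s
  type: STRICT_DNS
  typed_extension_protocol_options:
    envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
      "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
      explicit_http_config:
        http_protocol_options: {}          # plain HTTP/1.1 to the next hop
      common_http_protocol_options:
        idle_timeout: 50s                  # below Proxy B's 60s downstream timeout
        max_requests_per_connection: 1000  # recycle connections periodically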
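For active health checking, a connect-only TCP health check on the same cluster might look like the following; the intervals and thresholds are hypothetical:

  health_checks:
  - timeout: 1s
    interval: 5s
    unhealthy_threshold: 2
    healthy_threshold: 2
    tcp_health_check: {}                   # empty payload: success = TCP connect

Note that this guards against hosts that have gone away entirely; it does not validate individual pooled connections, so it complements rather than replaces timeout alignment.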
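And a sketch of a route-level retry policy; retry_on: reset covers the case where a pooled connection turns out to be dead, and the counts and timeouts are hypothetical. Keep this limited to idempotent traffic if any endpoints have side effects:

  routes:
  - match: { prefix: "/" }
    route:
      cluster: proxy_b_cluster
      retry_policy:
        retry_on: reset,connect-failure
        num_retries: 2
        per_try_timeout: 2s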
Checking the Real HTTP Keep-Alive Timeout
Run tcpdump against the upstream proxy's address (51.75.162.65 in this example):
sudo tcpdump -i any -nn host 51.75.162.65 and port 443
Then look for the RST packet sent by the upstream proxy and subtract the timestamp of the last ACK from the timestamp of the RST; the difference is the upstream's real idle timeout. For example, if the RST consistently arrives 60 seconds after the last ACK, the upstream closes idle connections after 60 seconds, and your own upstream idle_timeout should be set below that.