cloudflare/pingora

Architecture

This page covers how a Pingora binary is organized at runtime: the Server that owns the process, the Services it hosts, the L4 listeners and connectors that feed them, and the HTTP proxy framework on top.

The original architecture write-up by James Munns lives at docs/user_guide/internals.md and is the best longer-form source. This page summarizes that document and points at the source.

The big picture

graph TD
    Server[Server<br/>process owner]
    SvcA[Service<br/>HttpProxy app]
    SvcB[Service<br/>HttpProxy app]
    SvcC[BackgroundService<br/>health checks, metrics]
    RtA[Tokio runtime A<br/>N worker threads]
    RtB[Tokio runtime B<br/>N worker threads]
    RtC[Tokio runtime C]
    L4[L4 listeners<br/>TCP / UDS / TLS]
    Conn[Connectors<br/>HTTP/1, HTTP/2, raw L4]

    Server --> SvcA
    Server --> SvcB
    Server --> SvcC
    SvcA --> RtA
    SvcB --> RtB
    SvcC --> RtC
    SvcA --> L4
    SvcA --> Conn
    SvcB --> L4
    SvcB --> Conn

A single Server (pingora-core/src/server/mod.rs) owns the process. It parses CLI options, reads YAML config, daemonizes if asked, sets up signal handlers, and spawns one tokio runtime per service. Each Service (pingora-core/src/services/mod.rs) listens on one or more endpoints and runs an "app" that implements ServerApp or HttpServerApp.

The two "shapes" of services in practice are:

Listener services: accept downstream connections. The HttpProxy from pingora-proxy is a listener app.
Background services: periodic work, no socket. Health checking and discovery for a load balancer typically run as a background service. See pingora-load-balancing/src/background.rs.
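The relationship between the server, its services, and their runtimes can be sketched with the standard library alone. This is a toy model, not Pingora's API: `MiniService`, `Listener`, `Background`, and `run_all` are hypothetical names, and an OS thread stands in for the per-service tokio runtime.

```rust
use std::thread;

/// Hypothetical miniature of a service: something the server can run.
/// Pingora's real service traits are async and look different.
trait MiniService: Send {
    /// Do one unit of work and return a summary line for the demo.
    fn run(&mut self) -> String;
}

/// A listener-shaped service: it has an endpoint and serves connections.
struct Listener {
    addr: String,
    queued_connections: u32,
}

/// A background-shaped service: periodic work, no socket.
struct Background {
    ticks: u32,
}

impl MiniService for Listener {
    fn run(&mut self) -> String {
        let served = self.queued_connections;
        self.queued_connections = 0;
        format!("{} served {} connections", self.addr, served)
    }
}

impl MiniService for Background {
    fn run(&mut self) -> String {
        format!("background ran {} ticks", self.ticks)
    }
}

/// The "server": gives every service its own thread (assumption: a thread
/// stands in for the dedicated runtime) and collects results in order.
fn run_all(services: Vec<Box<dyn MiniService>>) -> Vec<String> {
    let handles: Vec<_> = services
        .into_iter()
        .map(|mut svc| thread::spawn(move || svc.run()))
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

The point of the sketch is isolation: a slow listener cannot starve the background work, because neither shares an executor with the other.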

Server bootstrap

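Opt holds the CLI options (--conf, --daemon, --upgrade, etc.). Server::new takes an optional Opt and, when a config file is given, loads the YAML configuration via serde_yaml. Server::bootstrap prepares the process to serve, which during a graceful upgrade includes acquiring the listening FDs from the old process. Server::run_forever daemonizes if configured, starts a runtime per service, and blocks the main thread on signal handlers.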

Lifecycle states are tracked in the ExecutionPhase enum (pingora-core/src/server/mod.rs):

Phase	Meaning
`Setup`	`Server::new` returned, services not yet added
`Bootstrap`	Acquiring listening FDs (including from old process during graceful upgrade)
`BootstrapComplete`	FDs ready, services starting
`Running`	Serving traffic
`GracefulUpgradeTransferringFds`	New process is taking over; sending FDs over Unix socket
`GracefulTerminate`	`SIGTERM` received, draining sessions
`ShutdownStarted` / `ShutdownGracePeriod` / `ShutdownRuntimes` / `Terminated`	Wind-down stages
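As a reading aid, the table can be modelled as a small state machine. This is a simplified sketch: the variant names are copied from the table, but the `Request` type, the `next_phase` function, and the exact transition order are assumptions made for illustration, and the real enum in pingora-core may have more variants or carry data.

```rust
/// Simplified stand-in for the lifecycle phases in the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Phase {
    Setup,
    Bootstrap,
    BootstrapComplete,
    Running,
    GracefulUpgradeTransferringFds,
    GracefulTerminate,
    ShutdownStarted,
    ShutdownGracePeriod,
    ShutdownRuntimes,
    Terminated,
}

/// Hypothetical: what the running process has been asked to do.
#[derive(Debug, Clone, Copy)]
enum Request {
    /// Hand the listening FDs to a new process, then drain.
    Upgrade,
    /// Drain and exit.
    Terminate,
}

/// Hypothetical transition function. The request only matters while
/// `Running`; every other phase has a single successor in this model.
fn next_phase(phase: Phase, request: Request) -> Phase {
    match (phase, request) {
        (Phase::Setup, _) => Phase::Bootstrap,
        (Phase::Bootstrap, _) => Phase::BootstrapComplete,
        (Phase::BootstrapComplete, _) => Phase::Running,
        (Phase::Running, Request::Upgrade) => Phase::GracefulUpgradeTransferringFds,
        (Phase::Running, Request::Terminate) => Phase::GracefulTerminate,
        (Phase::GracefulUpgradeTransferringFds, _) => Phase::GracefulTerminate,
        (Phase::GracefulTerminate, _) => Phase::ShutdownStarted,
        (Phase::ShutdownStarted, _) => Phase::ShutdownGracePeriod,
        (Phase::ShutdownGracePeriod, _) => Phase::ShutdownRuntimes,
        (Phase::ShutdownRuntimes, _) | (Phase::Terminated, _) => Phase::Terminated,
    }
}
```

In this model an upgrade visits one more phase than a plain termination, because the FD hand-off happens before draining begins.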

Two timeouts gate shutdown: EXIT_TIMEOUT (300s) for in-flight sessions to drain, and CLOSE_TIMEOUT (5s) for the new process to take the listening sockets after FD transfer.
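A bounded drain of this kind can be sketched as a poll loop with a deadline. The constant values below come from the paragraph above; `drain_sessions`, the atomic session counter, and the polling interval are hypothetical and only illustrate the idea, they are not how pingora-core implements it.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;
use std::time::{Duration, Instant};

/// Values as stated in the text; the names mirror the constants it mentions.
const EXIT_TIMEOUT: Duration = Duration::from_secs(300);
const CLOSE_TIMEOUT: Duration = Duration::from_secs(5);

/// Hypothetical helper: wait until `in_flight` reaches zero or `timeout`
/// elapses. Returns true if every session finished in time.
fn drain_sessions(in_flight: &AtomicUsize, timeout: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if in_flight.load(Ordering::Acquire) == 0 {
            return true; // nothing left to wait for
        }
        if Instant::now() >= deadline {
            return false; // give up; the caller shuts down anyway
        }
        // Assumption: a short sleep is an acceptable polling interval here.
        thread::sleep(Duration::from_millis(5));
    }
}
```

A caller would pass `EXIT_TIMEOUT` when draining sessions; a `false` return means the remaining sessions are cut off rather than awaited indefinitely.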

Services may declare dependencies on other services (added in 0.8.0). The dependency graph is a daggy::Dag, resolved in Server::run_forever so that a service waits for the services it depends on to be ready before it starts accepting traffic.
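Resolving such a graph amounts to a topological sort. The sketch below uses Kahn's algorithm over standard-library maps instead of daggy, so it illustrates the ordering rule rather than Pingora's code; `start_order` and the shape of its input are hypothetical.

```rust
use std::collections::{BTreeMap, VecDeque};

/// Hypothetical helper: `deps` maps a service name to the services it
/// depends on. Returns an order in which every service appears after all
/// of its dependencies, or None if the graph contains a cycle.
fn start_order<'a>(deps: &BTreeMap<&'a str, Vec<&'a str>>) -> Option<Vec<String>> {
    // Number of not-yet-started dependencies per service.
    let mut unmet: BTreeMap<&str, usize> = BTreeMap::new();
    // Reverse edges: dependency -> services waiting on it.
    let mut waiters: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for (&svc, needs) in deps {
        *unmet.entry(svc).or_insert(0) += needs.len();
        for &dep in needs {
            unmet.entry(dep).or_insert(0);
            waiters.entry(dep).or_default().push(svc);
        }
    }

    // Services with no unmet dependencies can start immediately.
    let mut ready: VecDeque<&str> = unmet
        .iter()
        .filter(|(_, n)| **n == 0)
        .map(|(s, _)| *s)
        .collect();

    let mut order = Vec::new();
    while let Some(svc) = ready.pop_front() {
        order.push(svc.to_string());
        if let Some(blocked) = waiters.get(svc) {
            for &w in blocked {
                let n = unmet.get_mut(w).unwrap();
                *n -= 1;
                if *n == 0 {
                    ready.push_back(w);
                }
            }
        }
    }

    // Any service left unstarted is part of a cycle.
    if order.len() == unmet.len() { Some(order) } else { None }
}
```

With a proxy that depends on discovery and metrics services, both of those come out ahead of the proxy; a pair of services that depend on each other yields `None`.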

Anatomy of a request (proxy)

sequenceDiagram
    participant Client
    participant L4 as L4 listener<br/>(pingora-core)
    participant Proxy as HttpProxy<br/>(pingora-proxy)
    participant App as ProxyHttp impl<br/>(user code)
    participant Pool as Connection pool<br/>(pingora-pool)
    participant Upstream

    Client->>L4: TCP/TLS connect
    L4->>Proxy: handshake + ServerSession
    Proxy->>App: request_filter / upstream_peer
    App-->>Proxy: HttpPeer
    Proxy->>Pool: acquire conn for peer
    Pool-->>Proxy: existing or fresh stream
    Proxy->>Upstream: forward request headers + body
    Upstream-->>Proxy: response headers + body
    Proxy->>App: response_filter / response_body_filter
    Proxy->>Client: forward downstream
    Proxy->>Pool: release conn (if reusable)

Each phase corresponds to a callback on the ProxyHttp trait (pingora-proxy/src/proxy_trait.rs). The full life of a request is documented in Proxy phases.
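The callback order in the diagram can be illustrated with a deliberately tiny, synchronous model. None of the signatures below are Pingora's: the real ProxyHttp trait is async and its callbacks receive a session and a per-request context. `MiniProxy`, `handle`, and `Tagging` are hypothetical names; only the callback names are borrowed from the diagram.

```rust
/// Hypothetical miniature of the proxy callbacks.
trait MiniProxy {
    /// Return true to stop processing (the request was rejected or answered).
    fn request_filter(&mut self, path: &str) -> bool;
    /// Choose the upstream address for this request.
    fn upstream_peer(&mut self, path: &str) -> String;
    /// Inspect or rewrite the upstream response before it goes downstream.
    fn response_filter(&mut self, body: &mut String);
}

/// Drives one request through the phases in diagram order. `upstream`
/// stands in for the pooled connection and the upstream server.
fn handle<P, F>(app: &mut P, path: &str, upstream: F) -> Option<String>
where
    P: MiniProxy,
    F: Fn(&str, &str) -> String,
{
    if app.request_filter(path) {
        return None; // short-circuited before any upstream was chosen
    }
    let peer = app.upstream_peer(path);
    let mut body = upstream(&peer, path);
    app.response_filter(&mut body);
    Some(body)
}

/// Example app that records which callbacks ran.
struct Tagging {
    trace: Vec<&'static str>,
}

impl MiniProxy for Tagging {
    fn request_filter(&mut self, path: &str) -> bool {
        self.trace.push("request_filter");
        path.starts_with("/admin") // block an illustrative path prefix
    }
    fn upstream_peer(&mut self, _path: &str) -> String {
        self.trace.push("upstream_peer");
        "10.0.0.1:8080".to_string() // illustrative address
    }
    fn response_filter(&mut self, body: &mut String) {
        self.trace.push("response_filter");
        body.push_str(" [via mini-proxy]");
    }
}
```

A request for `/admin` stops in `request_filter` and never reaches `upstream_peer`, which mirrors how an early filter can end a request before an upstream is selected.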

L4 layer

pingora-core/src/listeners builds inbound TransportStacks: pairs of a tokio::net::TcpListener (or Unix socket listener) and an optional TLS acceptor. pingora-core/src/connectors does the symmetric thing outbound: it opens TCP/UDS sockets, performs the TLS handshake, and offers both HTTP/1 and HTTP/2 client sessions.

Listeners can optionally apply a ConnectionFilter (pingora-core/src/listeners/connection_filter.rs) for pre-TLS allow/deny decisions. This is gated behind the connection_filter cargo feature.
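A pre-TLS allow/deny decision only has the peer address to work with, which keeps such filters cheap. The following sketch shows a deny list keyed on IP address. `AcceptFilter` and `DenyList` are hypothetical stand-ins; the actual ConnectionFilter trait in pingora-core may be async and have a different signature.

```rust
use std::collections::HashSet;
use std::net::{IpAddr, SocketAddr};

/// Hypothetical, synchronous stand-in for a pre-TLS connection filter.
trait AcceptFilter {
    /// Return false to drop the connection before any TLS work is done.
    fn should_accept(&self, peer: &SocketAddr) -> bool;
}

/// Rejects connections from a fixed set of IP addresses.
struct DenyList {
    blocked: HashSet<IpAddr>,
}

impl AcceptFilter for DenyList {
    fn should_accept(&self, peer: &SocketAddr) -> bool {
        // The source port is irrelevant; only the address is checked.
        !self.blocked.contains(&peer.ip())
    }
}
```

Deciding before the handshake means a denied peer costs one set lookup rather than a full TLS negotiation.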

Modules

pingora-core/src/modules defines a small "module" abstraction — pluggable per-request middleware that runs alongside the ProxyHttp callbacks. The standard module today is response compression (modules/http/compression). Users register modules at session creation time. The adjust_upstream_modules feature (added in 0.7.0) lets a proxy reconfigure upstream modules based on the response header.

TLS abstraction

Pingora supports four TLS backends. They are mutually exclusive cargo features on the umbrella crate: openssl, boringssl, s2n, rustls. The pingora-core TLS subdirectory has a thin trait-based shim and one implementation directory per backend. See TLS options and the four backend crate pages under Packages.

Cache

The pingora-cache crate is a separate state machine (HttpCache) that hangs off the proxy session. It implements the cache lookup, miss-fill, lock, and revalidation phases of HTTP caching. The proxy-side glue lives in pingora-proxy/src/proxy_cache.rs.
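To make the four phases concrete, here is a toy decision function that maps a lookup result onto them. It is a heavily simplified assumption about how such a state machine branches: `CacheDecision`, `Entry`, and `decide` are hypothetical, and the real HttpCache tracks far more state (variance, storage, lock timeouts, and so on).

```rust
use std::time::Duration;

/// Hypothetical outcome of a cache lookup, one variant per phase named above.
#[derive(Debug, PartialEq)]
enum CacheDecision {
    /// Fresh entry found: serve it from cache.
    Hit,
    /// Entry found but stale: revalidate with the upstream.
    Revalidate,
    /// Nothing cached and nobody is fetching: fetch and fill the cache.
    MissFill,
    /// Nothing cached but another request is already filling: wait on the lock.
    WaitForLock,
}

/// Hypothetical cached entry with just enough metadata for freshness.
struct Entry {
    age: Duration,
    ttl: Duration,
}

fn decide(entry: Option<&Entry>, fill_in_progress: bool) -> CacheDecision {
    match entry {
        Some(e) if e.age <= e.ttl => CacheDecision::Hit,
        Some(_) => CacheDecision::Revalidate,
        None if fill_in_progress => CacheDecision::WaitForLock,
        None => CacheDecision::MissFill,
    }
}
```

The lock branch exists to avoid a stampede: concurrent misses for the same asset wait for a single upstream fetch instead of each issuing their own.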

Observability

Logging uses the log crate. The server can route logs to a file via the error_log config key.
Metrics are Prometheus by default. pingora-prometheus is a small wrapper crate (split out in 0.8.0) that wires the built-in metrics into a prometheus_http service so they can be scraped.
Sentry integration is optional (sentry cargo feature) for panic and error reporting.

See Observability for details.

Threading model

Each service has its own tokio runtime, in one of two flavors: work-stealing (the default, Runtime::Steal) or "no-steal", which is N isolated single-threaded runtimes. The no-steal flavor avoids cross-thread synchronization at the cost of less even load distribution. The flavor is configurable via the work_stealing YAML key.
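The trade-off between the two flavors can be approximated with plain threads. This is an analogy, not tokio's scheduler: a single shared queue stands in for work stealing, private round-robin queues stand in for isolated runtimes, and each task is just a number representing its cost. `run_shared` and `run_isolated` are hypothetical names.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

/// Shared-queue approximation of work stealing: idle workers pull the next
/// task, so load evens out, but every pop takes a lock.
/// Returns the total cost processed.
fn run_shared(tasks: Vec<u64>, workers: usize) -> u64 {
    let queue = Arc::new(Mutex::new(VecDeque::from(tasks)));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let queue = Arc::clone(&queue);
            thread::spawn(move || {
                let mut done = 0u64;
                loop {
                    // The lock guard is dropped at the end of this statement.
                    let job = queue.lock().unwrap().pop_front();
                    match job {
                        Some(cost) => done += cost,
                        None => break,
                    }
                }
                done
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

/// No-steal approximation: tasks are dealt round-robin into private queues
/// and never move. No locking, but load can be uneven.
/// Returns the cost handled by each worker. `workers` must be non-zero.
fn run_isolated(tasks: Vec<u64>, workers: usize) -> Vec<u64> {
    let mut queues: Vec<Vec<u64>> = vec![Vec::new(); workers];
    for (i, cost) in tasks.into_iter().enumerate() {
        queues[i % workers].push(cost);
    }
    let handles: Vec<_> = queues
        .into_iter()
        .map(|q| thread::spawn(move || q.iter().sum::<u64>()))
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Feeding `run_isolated` a workload where the expensive tasks happen to land on the same worker shows the imbalance directly: one worker ends up with most of the cost while the others sit nearly idle.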

CPU-blocking work (TLS handshakes, decompression) runs on tokio's blocking thread pool. The runtime crate (pingora-runtime) wraps tokio's Runtime so that callers don't have to care which flavor they got.

Source map

Concern	Path
Server lifecycle	`pingora-core/src/server/mod.rs`
Server config (YAML, CLI)	`pingora-core/src/server/configuration/`
Listener stacks (TCP, UDS, TLS)	`pingora-core/src/listeners/`
Outbound connectors	`pingora-core/src/connectors/`
HTTP/1 server / client	`pingora-core/src/protocols/http/v1/`
HTTP/2 server / client	`pingora-core/src/protocols/http/v2/`
Proxy entry point	`pingora-proxy/src/lib.rs`
Proxy callbacks (trait)	`pingora-proxy/src/proxy_trait.rs`
Proxy h1 main loop	`pingora-proxy/src/proxy_h1.rs`
Proxy h2 main loop	`pingora-proxy/src/proxy_h2.rs`
Proxy cache integration	`pingora-proxy/src/proxy_cache.rs`
Cache state machine	`pingora-cache/src/lib.rs`
Connection pool	`pingora-pool/src/connection.rs`
Load balancer	`pingora-load-balancing/src/lib.rs`
Tokio runtime wrapper	`pingora-runtime/src/lib.rs`

Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.