cloudflare/pingora

Architecture

This page covers how a Pingora binary is organized at runtime: the Server that owns the process, the Services it hosts, the L4 listeners and connectors that feed them, and the HTTP proxy framework on top.

The original architecture write-up by James Munns lives at docs/user_guide/internals.md and is the best longer-form source. This page summarizes that document and points at the source.

The big picture

```mermaid
graph TD
    Server[Server<br/>process owner]
    SvcA[Service<br/>HttpProxy app]
    SvcB[Service<br/>HttpProxy app]
    SvcC[BackgroundService<br/>health checks, metrics]
    RtA[Tokio runtime A<br/>N worker threads]
    RtB[Tokio runtime B<br/>N worker threads]
    RtC[Tokio runtime C]
    L4[L4 listeners<br/>TCP / UDS / TLS]
    Conn[Connectors<br/>HTTP/1, HTTP/2, raw L4]

    Server --> SvcA
    Server --> SvcB
    Server --> SvcC
    SvcA --> RtA
    SvcB --> RtB
    SvcC --> RtC
    SvcA --> L4
    SvcA --> Conn
    SvcB --> L4
    SvcB --> Conn
```

A single Server (pingora-core/src/server/mod.rs) owns the process. It parses CLI options, reads YAML config, daemonizes if asked, sets up signal handlers, and spawns one tokio runtime per service. Each Service (pingora-core/src/services/mod.rs) listens on one or more endpoints and runs an "app" that implements ServerApp or HttpServerApp.

The two "shapes" of services in practice are:

  • Listener services: accept downstream connections. The HttpProxy from pingora-proxy is a listener app.
  • Background services: periodic work, no socket. Health checking and discovery for a load balancer typically run as background services. See pingora-load-balancing/src/background.rs.
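The ownership structure above can be modelled with the standard library alone. The sketch below is not Pingora's API: `Server`, `Service`, and `ServiceKind` are hypothetical stand-ins, and one OS thread per service stands in for one tokio runtime per service.

```rust
use std::thread;

/// Hypothetical stand-in for the two service shapes described above.
enum ServiceKind {
    /// Accepts downstream connections on the given endpoints.
    Listener { endpoints: Vec<String> },
    /// Periodic work with no socket.
    Background,
}

struct Service {
    name: String,
    kind: ServiceKind,
}

/// Hypothetical process owner that holds every service.
struct Server {
    services: Vec<Service>,
}

impl Server {
    fn new() -> Self {
        Server { services: Vec::new() }
    }

    fn add_service(&mut self, service: Service) {
        self.services.push(service);
    }

    /// Start one thread per service (a stand-in for one runtime per service)
    /// and collect a one-line status from each, in registration order.
    fn run(self) -> Vec<String> {
        let handles: Vec<_> = self
            .services
            .into_iter()
            .map(|svc| {
                thread::spawn(move || match svc.kind {
                    ServiceKind::Listener { endpoints } => {
                        format!("{}: listening on {}", svc.name, endpoints.join(", "))
                    }
                    ServiceKind::Background => format!("{}: background", svc.name),
                })
            })
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("service thread panicked"))
            .collect()
    }
}
```

Registering a listener service and a background service and then calling `run` yields one status line per service. In the real server each service's work is long-lived rather than a single returned string.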

Server bootstrap

Server::new reads Opt from CLI (--conf, --daemon, --upgrade, etc.). Server::bootstrap reads YAML configuration via serde_yaml. Server::run_forever daemonizes if configured, starts a runtime per service, and blocks the main thread on signal handlers.

Lifecycle states are tracked in the ExecutionPhase enum (pingora-core/src/server/mod.rs):

Phase | Meaning
--- | ---
Setup | Server::new returned, services not yet added
Bootstrap | Acquiring listening FDs (including from old process during graceful upgrade)
BootstrapComplete | FDs ready, services starting
Running | Serving traffic
GracefulUpgradeTransferringFds | New process is taking over; sending FDs over Unix socket
GracefulTerminate | SIGTERM received, draining sessions
ShutdownStarted / ShutdownGracePeriod / ShutdownRuntimes / Terminated | Wind-down stages
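As a reading aid, the phases in the table can be written as a plain Rust enum. This is a simplified mirror, not a copy of the real ExecutionPhase: the real enum may have more variants or carry data, and the `meaning` and `is_winding_down` helpers are hypothetical.

```rust
/// Simplified mirror of the phases listed in the table above.
/// Variant order follows the table, so derived ordering follows the table too.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Phase {
    Setup,
    Bootstrap,
    BootstrapComplete,
    Running,
    GracefulUpgradeTransferringFds,
    GracefulTerminate,
    ShutdownStarted,
    ShutdownGracePeriod,
    ShutdownRuntimes,
    Terminated,
}

impl Phase {
    /// Hypothetical helper: short description taken from the table.
    fn meaning(self) -> &'static str {
        match self {
            Phase::Setup => "services not yet added",
            Phase::Bootstrap => "acquiring listening FDs",
            Phase::BootstrapComplete => "FDs ready, services starting",
            Phase::Running => "serving traffic",
            Phase::GracefulUpgradeTransferringFds => "sending FDs to the new process",
            Phase::GracefulTerminate => "draining sessions",
            Phase::ShutdownStarted
            | Phase::ShutdownGracePeriod
            | Phase::ShutdownRuntimes
            | Phase::Terminated => "wind-down stage",
        }
    }

    /// Hypothetical helper: true once the process has left normal serving.
    fn is_winding_down(self) -> bool {
        self >= Phase::GracefulUpgradeTransferringFds
    }
}
```

Deriving `Ord` is a convenience of this sketch. A graceful upgrade and a graceful terminate are alternative paths, so the linear order should not be read as a strict transition sequence.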

Two timeouts gate shutdown: EXIT_TIMEOUT (300s) for in-flight sessions to drain, and CLOSE_TIMEOUT (5s) for the new process to take the listening sockets after FD transfer.
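A bounded drain can be sketched as a polling loop with a deadline. The constants reuse the names and values quoted above. Everything else is an assumption for illustration: `drain_sessions`, `DrainOutcome`, and the polling approach are hypothetical and not how pingora-core is necessarily written.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Values quoted in the text above.
const EXIT_TIMEOUT: Duration = Duration::from_secs(300);
const CLOSE_TIMEOUT: Duration = Duration::from_secs(5);

#[derive(Debug, PartialEq)]
enum DrainOutcome {
    /// Every in-flight session finished before the deadline.
    Drained,
    /// The deadline passed with sessions still open.
    TimedOut { remaining: usize },
}

/// Poll `in_flight` until it reports zero sessions or `timeout` elapses.
/// `poll` is the sleep between checks.
fn drain_sessions(
    mut in_flight: impl FnMut() -> usize,
    timeout: Duration,
    poll: Duration,
) -> DrainOutcome {
    let deadline = Instant::now() + timeout;
    loop {
        let remaining = in_flight();
        if remaining == 0 {
            return DrainOutcome::Drained;
        }
        if Instant::now() >= deadline {
            return DrainOutcome::TimedOut { remaining };
        }
        thread::sleep(poll);
    }
}
```

In this model a shutdown path would call `drain_sessions(count_fn, EXIT_TIMEOUT, some_poll_interval)` and proceed to stop the runtimes on either outcome.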

Services may declare dependencies on other services (added in 0.8.0). The dependency graph is a daggy::Dag resolved in Server::run_forever so that a service waits for its dependencies to be ready before it starts accepting traffic.
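Resolving such a graph amounts to a topological sort: each service starts only after the services it depends on. The sketch below uses Kahn's algorithm over standard-library maps instead of daggy, and `start_order` is a hypothetical function, so treat it as an illustration of the idea and not as Pingora's implementation.

```rust
use std::collections::{BTreeMap, VecDeque};

/// `deps` maps each service name to the services it depends on.
/// Returns a start order in which every service comes after its
/// dependencies, or an error if the graph contains a cycle.
fn start_order<'a>(deps: &BTreeMap<&'a str, Vec<&'a str>>) -> Result<Vec<String>, String> {
    // Number of dependencies each service is still waiting for.
    let mut unmet: BTreeMap<&str, usize> = BTreeMap::new();
    // Reverse edges: dependency -> services waiting on it.
    let mut waiting_on: BTreeMap<&str, Vec<&str>> = BTreeMap::new();

    for (&svc, needs) in deps {
        unmet.entry(svc).or_insert(0);
        for &dep in needs {
            unmet.entry(dep).or_insert(0);
            *unmet.entry(svc).or_insert(0) += 1;
            waiting_on.entry(dep).or_default().push(svc);
        }
    }

    // Services with no dependencies can start immediately.
    let mut ready: VecDeque<&str> = unmet
        .iter()
        .filter(|(_, n)| **n == 0)
        .map(|(s, _)| *s)
        .collect();

    let mut order = Vec::new();
    while let Some(svc) = ready.pop_front() {
        order.push(svc.to_string());
        if let Some(waiters) = waiting_on.get(svc) {
            for &w in waiters {
                let n = unmet.get_mut(w).expect("every waiter was registered");
                *n -= 1;
                if *n == 0 {
                    ready.push_back(w);
                }
            }
        }
    }

    if order.len() == unmet.len() {
        Ok(order)
    } else {
        Err("dependency cycle detected".to_string())
    }
}
```

With a proxy service that depends on discovery and metrics services, the proxy comes last in the returned order. A cycle leaves some services permanently waiting, which the length check reports as an error.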

Anatomy of a request (proxy)

```mermaid
sequenceDiagram
    participant Client
    participant L4 as L4 listener<br/>(pingora-core)
    participant Proxy as HttpProxy<br/>(pingora-proxy)
    participant App as ProxyHttp impl<br/>(user code)
    participant Pool as Connection pool<br/>(pingora-pool)
    participant Upstream

    Client->>L4: TCP/TLS connect
    L4->>Proxy: handshake + ServerSession
    Proxy->>App: request_filter / upstream_peer
    App-->>Proxy: HttpPeer
    Proxy->>Pool: acquire conn for peer
    Pool-->>Proxy: existing or fresh stream
    Proxy->>Upstream: forward request headers + body
    Upstream-->>Proxy: response headers + body
    Proxy->>App: response_filter / response_body_filter
    Proxy->>Client: forward downstream
    Proxy->>Pool: release conn (if reusable)
```

Each phase corresponds to a callback on the ProxyHttp trait (pingora-proxy/src/proxy_trait.rs). The full life of a request is documented in Proxy phases.
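The callback-driven flow can be illustrated with a small synchronous model. The method names `request_filter`, `upstream_peer`, and `response_filter` come from the diagram. The signatures are simplified assumptions: the real ProxyHttp trait is async and works with session and context types. `ProxyCallbacks`, `handle_request`, and `Router` are hypothetical.

```rust
/// Hypothetical, synchronous stand-in for a proxy callback trait.
trait ProxyCallbacks {
    /// Return true to stop early because a response was already sent.
    fn request_filter(&mut self, path: &str) -> bool {
        let _ = path;
        false
    }
    /// Pick the upstream address for this request.
    fn upstream_peer(&mut self, path: &str) -> String;
    /// Adjust response headers before they are sent downstream.
    fn response_filter(&mut self, headers: &mut Vec<(String, String)>) {
        let _ = headers;
    }
}

/// Drive one request through the callbacks and record which phases ran.
fn handle_request<A: ProxyCallbacks>(app: &mut A, path: &str) -> Vec<String> {
    let mut trace = vec!["request_filter".to_string()];
    if app.request_filter(path) {
        return trace;
    }
    let peer = app.upstream_peer(path);
    trace.push(format!("upstream_peer -> {}", peer));
    // Pretend the upstream answered with a single header.
    let mut headers = vec![("content-type".to_string(), "text/plain".to_string())];
    app.response_filter(&mut headers);
    trace.push(format!("response_filter ({} headers)", headers.len()));
    trace
}

/// Example user code: route by path prefix and tag every response.
struct Router;

impl ProxyCallbacks for Router {
    fn request_filter(&mut self, path: &str) -> bool {
        path.starts_with("/blocked")
    }
    fn upstream_peer(&mut self, path: &str) -> String {
        if path.starts_with("/api") {
            "10.0.0.2:8080".to_string()
        } else {
            "10.0.0.1:8080".to_string()
        }
    }
    fn response_filter(&mut self, headers: &mut Vec<(String, String)>) {
        headers.push(("server".to_string(), "sketch".to_string()));
    }
}
```

The framework owns the loop and the user type only fills in decisions. That is why a filter returning early, as for the `/blocked` path here, skips every later phase.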

L4 layer

pingora-core/src/listeners builds inbound TransportStacks: pairs of a tokio::net::TcpListener (or Unix socket) and an optional TLS acceptor. pingora-core/src/connectors mirrors this for outbound traffic: it opens TCP/UDS sockets, performs the TLS handshake, and offers both HTTP/1 and HTTP/2 client sessions.

Listeners can optionally apply a ConnectionFilter (pingora-core/src/listeners/connection_filter.rs) for pre-TLS allow/deny decisions. This is gated behind the connection_filter cargo feature.
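A pre-TLS allow/deny decision only needs the peer address. The sketch below shows that shape with a deny list. The trait name matches the text, but the method `should_accept`, its synchronous signature, and the `DenyList` and `admit` helpers are assumptions, so check connection_filter.rs for the real interface.

```rust
use std::net::{IpAddr, SocketAddr};

/// Hypothetical trait shape for an accept-time filter.
trait ConnectionFilter {
    /// Decide from the peer address alone, before any TLS work.
    fn should_accept(&self, peer: &SocketAddr) -> bool;
}

/// Rejects connections from a fixed set of IP addresses.
struct DenyList {
    blocked: Vec<IpAddr>,
}

impl ConnectionFilter for DenyList {
    fn should_accept(&self, peer: &SocketAddr) -> bool {
        !self.blocked.contains(&peer.ip())
    }
}

/// Return only the peers the filter would let through.
fn admit<F: ConnectionFilter>(filter: &F, peers: &[SocketAddr]) -> Vec<SocketAddr> {
    peers
        .iter()
        .copied()
        .filter(|p| filter.should_accept(p))
        .collect()
}
```

Rejecting at this point is cheap because no handshake CPU has been spent on the connection yet.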

Modules

pingora-core/src/modules defines a small "module" abstraction — pluggable per-request middleware that runs alongside the ProxyHttp callbacks. The standard module today is response compression (modules/http/compression). Users register modules at session creation time. The adjust_upstream_modules feature (added in 0.7.0) lets a proxy reconfigure upstream modules based on the response header.
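The module idea is a chain of per-request filters that each see the data in turn. The sketch below uses two toy body filters in place of compression. `Module`, `ModuleChain`, `Uppercase`, and `TruncateChunk` are hypothetical names, and the real module traits in pingora-core have more hooks than this.

```rust
/// Hypothetical per-request module with a single body hook.
trait Module {
    fn response_body_filter(&mut self, chunk: Vec<u8>) -> Vec<u8>;
}

/// Toy module: upper-cases ASCII bytes in each chunk.
struct Uppercase;

impl Module for Uppercase {
    fn response_body_filter(&mut self, chunk: Vec<u8>) -> Vec<u8> {
        chunk.to_ascii_uppercase()
    }
}

/// Toy module: keeps at most `0` bytes of each chunk.
struct TruncateChunk(usize);

impl Module for TruncateChunk {
    fn response_body_filter(&mut self, mut chunk: Vec<u8>) -> Vec<u8> {
        chunk.truncate(self.0);
        chunk
    }
}

/// Modules registered for one session, applied in registration order.
struct ModuleChain {
    modules: Vec<Box<dyn Module>>,
}

impl ModuleChain {
    fn new() -> Self {
        ModuleChain { modules: Vec::new() }
    }

    fn register(&mut self, module: Box<dyn Module>) {
        self.modules.push(module);
    }

    /// Pass one body chunk through every module in turn.
    fn run(&mut self, chunk: Vec<u8>) -> Vec<u8> {
        let mut out = chunk;
        for module in self.modules.iter_mut() {
            out = module.response_body_filter(out);
        }
        out
    }
}
```

Building the chain per session, as the text describes, lets each module keep per-request state (for example an encoder) without sharing it across requests.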

TLS abstraction

Pingora supports four TLS backends. They are mutually exclusive cargo features on the umbrella crate: openssl, boringssl, s2n, rustls. The pingora-core TLS subdirectory has a thin trait-based shim and one implementation directory per backend. See TLS options and the four backend crate pages under Packages.

Cache

The pingora-cache crate is a separate state machine (HttpCache) that hangs off the proxy session. It implements the cache lookup, miss-fill, lock, and revalidation phases of HTTP caching. The proxy-side glue lives in pingora-proxy/src/proxy_cache.rs.
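The lookup, miss-fill, and revalidation steps can be shown with a tiny in-memory cache. This is a simplified model, not HttpCache. `TinyCache`, `Entry`, and `Lookup` are hypothetical, time is a plain integer of seconds, and the cache lock that collapses concurrent misses is left out.

```rust
use std::collections::HashMap;

/// Result of a cache lookup in this model.
#[derive(Debug, PartialEq)]
enum Lookup {
    /// Fresh entry: serve it directly.
    Hit(String),
    /// Expired entry: revalidate with the upstream before reuse.
    Stale(String),
    /// Nothing stored: fetch from upstream and fill.
    Miss,
}

struct Entry {
    body: String,
    stored_at: u64,
    ttl: u64,
}

struct TinyCache {
    entries: HashMap<String, Entry>,
}

impl TinyCache {
    fn new() -> Self {
        TinyCache { entries: HashMap::new() }
    }

    /// Lookup phase: classify the entry by its age at time `now`.
    fn lookup(&self, key: &str, now: u64) -> Lookup {
        match self.entries.get(key) {
            None => Lookup::Miss,
            Some(e) if now.saturating_sub(e.stored_at) < e.ttl => Lookup::Hit(e.body.clone()),
            Some(e) => Lookup::Stale(e.body.clone()),
        }
    }

    /// Miss-fill phase: store the upstream response.
    fn fill(&mut self, key: &str, body: &str, now: u64, ttl: u64) {
        let entry = Entry { body: body.to_string(), stored_at: now, ttl };
        self.entries.insert(key.to_string(), entry);
    }

    /// Revalidation phase: the upstream confirmed the entry is unchanged,
    /// so keep the body and restart its freshness clock.
    fn revalidate(&mut self, key: &str, now: u64) -> bool {
        match self.entries.get_mut(key) {
            Some(e) => {
                e.stored_at = now;
                true
            }
            None => false,
        }
    }
}
```

An entry filled at time 0 with a TTL of 10 is a hit at time 5 and stale at time 10. Once revalidated at time 10 it is a hit again at time 15.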

Observability

  • Logging uses the log crate. The server can route logs to a file via the error_log config key.
  • Metrics are Prometheus by default. pingora-prometheus is a small wrapper crate (split out in 0.8.0) that wires the built-in metrics into a prometheus_http service so they can be scraped.
  • Sentry integration is optional (sentry cargo feature) for panic and error reporting.

See Observability for details.

Threading model

Each service has its own tokio runtime, in one of two flavors: work-stealing (the default, Runtime::Steal) or "no-steal", which runs N isolated single-threaded runtimes. No-steal avoids cross-thread synchronization at the cost of less even load distribution. The flavor is selected with the work_stealing YAML key.
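The load-distribution trade-off can be seen in a small deterministic simulation. It assumes that no-steal pins each task to a worker round-robin and that work stealing behaves like "the least loaded worker takes the next task". Both are simplifications for illustration, not a description of how tokio or pingora-runtime schedule. `Flavor` and `distribute` are hypothetical.

```rust
#[derive(Clone, Copy)]
enum Flavor {
    /// Tasks may end up on any worker.
    Steal,
    /// Tasks stay on the worker they were assigned to.
    NoSteal,
}

/// Return the total cost handled by each of `threads` workers.
fn distribute(flavor: Flavor, threads: usize, task_costs: &[u64]) -> Vec<u64> {
    assert!(threads > 0, "need at least one worker");
    let mut load = vec![0u64; threads];
    for (i, &cost) in task_costs.iter().enumerate() {
        let target = match flavor {
            // Pinned at assignment time (round-robin here) and never moved.
            Flavor::NoSteal => i % threads,
            // Approximation of stealing: the least loaded worker takes it.
            // Ties go to the lowest worker index.
            Flavor::Steal => (0..threads)
                .min_by_key(|&t| load[t])
                .expect("threads > 0"),
        };
        load[target] += cost;
    }
    load
}
```

With costs `[10, 1, 10, 1]` on two workers, the pinned model puts both expensive tasks on the same worker while the stealing model evens the totals out. The simulation does not capture the synchronization cost that stealing pays in exchange.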

CPU-blocking work (TLS handshakes, decompression) runs on tokio's blocking thread pool. The runtime crate (pingora-runtime) wraps tokio's Runtime so that callers don't have to care which flavor they got.

Source map

Concern | Path
--- | ---
Server lifecycle | pingora-core/src/server/mod.rs
Server config (YAML, CLI) | pingora-core/src/server/configuration/
Listener stacks (TCP, UDS, TLS) | pingora-core/src/listeners/
Outbound connectors | pingora-core/src/connectors/
HTTP/1 server / client | pingora-core/src/protocols/http/v1/
HTTP/2 server / client | pingora-core/src/protocols/http/v2/
Proxy entry point | pingora-proxy/src/lib.rs
Proxy callbacks (trait) | pingora-proxy/src/proxy_trait.rs
Proxy h1 main loop | pingora-proxy/src/proxy_h1.rs
Proxy h2 main loop | pingora-proxy/src/proxy_h2.rs
Proxy cache integration | pingora-proxy/src/proxy_cache.rs
Cache state machine | pingora-cache/src/lib.rs
Connection pool | pingora-pool/src/connection.rs
Load balancer | pingora-load-balancing/src/lib.rs
Tokio runtime wrapper | pingora-runtime/src/lib.rs

Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.
