hashicorp/consul
Debugging
How to figure out what a running Consul cluster is doing.
Logging
- The agent uses github.com/hashicorp/go-hclog. Configuration lives in logging/ and is wired in agent/setup.go.
- Log level is configurable via the log_level agent config (or -log-level=). Common values: trace, debug, info, warn, error.
- Per-subsystem loggers are obtained with logger.Named("rpc"), logger.Named("xds"), etc. Search the codebase for Named( to find all the subsystem names.
- For high-volume xDS traffic, the agent supports log dropping via agent/log-drop/ to avoid saturating disks.
consul debug
consul debug gathers a bundle of Raft state, metrics, profiles, agent config, and logs over a configurable duration. Code: command/debug/.

    consul debug -duration=2m -interval=10s -archive=true

Open the archive with any tar tool. The bundle is what HashiCorp Support typically asks for.
consul troubleshoot
The troubleshoot/ Go module powers the consul troubleshoot CLI for mesh wiring problems:

    consul troubleshoot upstreams   # List Envoy upstreams visible from a sidecar
    consul troubleshoot proxy       # Check that an Envoy proxy is healthy
    consul troubleshoot ports       # Check that Consul's TCP ports are reachable on a host

Code in troubleshoot/proxy/ and troubleshoot/connect/. CLI commands in command/troubleshoot/.
Profiling and pprof
The HTTP API exposes pprof when enable_debug=true is set on the agent. Endpoints are mounted in agent/http_register.go under /debug/pprof/.

    go tool pprof 'http://127.0.0.1:8500/debug/pprof/profile?seconds=30'

Common error patterns
| Symptom | Likely cause / where to look |
|---|---|
| No cluster leader | Raft hasn't elected. Check agent/consul/leader.go start-up, gossip connectivity, time skew |
| connection refused from clients to servers | RPC port (8300) blocked, or agent not started; see agent/consul/server.go:listen |
| RPC failed to server X repeatedly | Server is down or partitioned. Autopilot logs in agent/consul/autopilot.go |
| permission denied on KV/catalog | ACL token missing or insufficient. Trace acl/acl.go and agent/consul/acl_endpoint.go |
| Envoy reports xds: stream config update failed | agent/xds/delta.go rejects an update; usually a stale or invalid resource |
| Mesh mTLS handshake errors | Leaf cert expired or root rotated. See agent/leafcert/ and agent/consul/leader_connect_ca.go |
| intention denied | A service-intentions config entry blocks the call. Inspect via consul intention list |
| Anti-entropy spam in logs | Local services drifting from server view. Check agent/local/ and agent/ae/ |
| peering: invalid token | Peering bootstrap token mismatch; see agent/consul/peering_backend.go |
Tracing slow queries
agent/blockingquery/ and agent/consul/rpc.go contain the blocking query plumbing. To see what's slow:
- Set telemetry.prometheus_retention_time to a non-zero duration and scrape the agent's /v1/agent/metrics?format=prometheus.
- Look at consul.rpc.query.* and consul.fsm.* metrics.
- For xDS, watch consul.xds.server.streamDrained and the per-resource metrics in agent/xds/server.go.
Reproducing in a unit test
Most subsystems have a testing.go with builders. Pattern:

    a := agent.NewTestAgent(t, `
    server = true
    bootstrap = true
    `)
    defer a.Shutdown()
    // Hit the HTTP API
    req := httptest.NewRequest("GET", "/v1/catalog/services", nil)
    resp := httptest.NewRecorder()
    a.HTTPAPI().ServeHTTP(resp, req)

Reach into the FSM directly when isolating Raft or state-store regressions:
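
    // Pseudocode: "..." and <index>.<getter> are placeholders, not real identifiers.
    f := fsm.New(...)                          // avoid shadowing the fsm package
    f.Apply(&raft.Log{Data: encodedRequest})   // feed an encoded request through the FSM
    f.State().<index>.<getter>(...)            // read the result back from the state store

See Patterns and conventions for the standard ways tests construct dependencies.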
Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.