gitlab-org/gitlab
Alerting
How alerts are wired to the metrics and SLIs.
Where alerts live
- GitLab.com: alert rules live in the runbooks repo, separate from this monorepo. They reference SLIs and Prometheus metrics emitted by the monolith.
- Self-managed: bundled Prometheus rules in omnibus-gitlab and charts/gitlab.
Common alert types
- Apdex breach on a request urgency class (e.g., high-urgency-web apdex below 99.5% over 5 minutes).
- Error budget burn (multi-window, multi-burn-rate alerts).
- Sidekiq queue length above per-urgency thresholds.
- Database replication lag above 10 seconds.
- Gitaly RPC error rate above threshold.
- Workhorse 5xx above threshold.
- Sidekiq dead set growth rate.
- Memory watchdog killings above a baseline.
- Feature-flag toggle anomalies (sudden spikes after a flag flip).
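The multi-window, multi-burn-rate idea in the list above can be sketched in a few lines. This is a minimal illustration, not the rule logic used in the runbooks repo: the window pairs and factors are the commonly cited ones from the Google SRE Workbook for a 30-day SLO period, and the names `BurnRateWindow`, `burn_rate`, and `should_alert` are hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class BurnRateWindow:
    """One long/short window pair and the burn-rate factor that triggers it."""
    long_window: str
    short_window: str
    factor: float


# Assumed values (Google SRE Workbook, 30-day SLO period). The real rules
# may use different windows and factors.
WINDOWS = (
    BurnRateWindow("1h", "5m", 14.4),
    BurnRateWindow("6h", "30m", 6.0),
)


def burn_rate(error_ratio: float, slo_target: float) -> float:
    """Return how many times faster than 'exactly on budget' errors accrue."""
    budget = 1.0 - slo_target  # e.g. a 99.5% target leaves a 0.5% budget
    return error_ratio / budget


def should_alert(error_ratios: Mapping[str, float], slo_target: float) -> bool:
    """Fire when any pair has BOTH its long and short window above the factor."""
    for w in WINDOWS:
        long_burn = burn_rate(error_ratios[w.long_window], slo_target)
        short_burn = burn_rate(error_ratios[w.short_window], slo_target)
        if long_burn > w.factor and short_burn > w.factor:
            return True
    return False
```

Requiring the short window to agree with the long one is what lets the alert clear quickly once the problem stops, while the long window keeps brief blips from paging anyone.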
Adding an alert
For self-managed: open an MR against omnibus-gitlab (Prometheus rules) or charts/gitlab (alerting templates).
For GitLab.com: open an MR against the runbooks repo, including:
- The Prometheus query.
- Burn-rate windows.
- Severity (S1, S2, etc.) and group ownership.
- A runbook link.
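As a rough aid, the checklist above can be expressed as a small pre-submission check. This is only a sketch: `AlertProposal` and `missing_items` are hypothetical names, and the runbooks repo defines its own rule format, which this does not attempt to reproduce.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AlertProposal:
    """Hypothetical container for the items an alert MR should include."""
    query: str = ""
    burn_rate_windows: List[str] = field(default_factory=list)
    severity: str = ""
    group: str = ""
    runbook_url: str = ""


def missing_items(proposal: AlertProposal) -> List[str]:
    """Return the names of checklist items that are still empty."""
    checks = {
        "query": proposal.query.strip(),
        "burn_rate_windows": proposal.burn_rate_windows,
        "severity": proposal.severity.strip(),
        "group": proposal.group.strip(),
        "runbook_url": proposal.runbook_url.strip(),
    }
    # Empty strings and empty lists are both falsy, so one test covers all.
    return [name for name, value in checks.items() if not value]
```

For example, a proposal with only a query and a severity filled in reports `burn_rate_windows`, `group`, and `runbook_url` as missing.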
The alert is ultimately delivered to PagerDuty and Slack.
Service-owner ownership
Each alert is owned by a feature category, and the routing service sends pages to the corresponding on-call group. The mapping comes from config/feature_categories.yml and the runbooks repo.
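The lookup described above amounts to a mapping from feature category to on-call group. The sketch below assumes the alert carries a `feature_category` label and uses made-up category and group names. The real mapping is derived from config/feature_categories.yml and the runbooks repo, whose formats are not reproduced here.

```python
from typing import Mapping

# Hypothetical data for illustration only.
CATEGORY_TO_GROUP = {
    "example_category_a": "example-group-1",
    "example_category_b": "example-group-2",
}


def route_alert(
    labels: Mapping[str, str],
    category_to_group: Mapping[str, str] = CATEGORY_TO_GROUP,
    fallback: str = "unrouted",
) -> str:
    """Return the on-call group for an alert, or a fallback if none matches."""
    # The label name is an assumption about how the alert is annotated.
    category = labels.get("feature_category")
    if category is None:
        return fallback
    return category_to_group.get(category, fallback)
```

An explicit fallback keeps alerts with a missing or unknown category visible rather than silently dropped.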
Related
- Metrics, Logging, Tracing.
- Background: noisy alerts that the team has triaged historically.
Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.