gitlab-org/gitlab

Sidekiq cluster

GitLab's asynchronous work runs on a fleet of Sidekiq worker processes. They share the Rails codebase but are launched as a separate process group via sidekiq_cluster/.

Purpose

  • Decouple slow or unreliable work from request processing.
  • Provide retry / dead-letter behavior for transient failures.
  • Run scheduled jobs (cron-like, via sidekiq-cron).
  • Drive the EventStore pub/sub.

Entry points

File                                        Role
sidekiq_cluster/cli.rb (companion gem)      CLI launcher for the cluster
bin/sidekiq-cluster                         Wrapper script
config/sidekiq_queues.yml                   Declared queue list (~33K lines)
app/workers/all_queues.yml                  Generated authoritative queue list
lib/gitlab/sidekiq_config/                  Routing, fairness, urgency, weight config
app/workers/concerns/application_worker.rb  Base concern for all workers

How it runs

The cluster starts one Sidekiq process per "queue selector" defined in config/sidekiq_cluster/:

sidekiq-cluster '*'                   # one process that listens on every queue
sidekiq-cluster 'urgency=high'        # only high-urgency workers
sidekiq-cluster 'feature_category=continuous_integration'

Selectors are evaluated against worker metadata (urgency, feature category, resource class, etc.) by Gitlab::SidekiqConfig::WorkerRouter (lib/gitlab/sidekiq_config/worker_router.rb).
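
As a rough illustration of how a selector such as urgency=high could be matched against worker metadata, the sketch below uses a hypothetical matches_selector? helper and hand-written metadata hashes. It is not the code in Gitlab::SidekiqConfig::WorkerRouter. It assumes only `key=value` terms joined by `&`, with comma-separated alternatives, and the worker names are made up for the example.

```ruby
# Hypothetical worker metadata; in GitLab the values come from each worker's annotations.
WORKERS = [
  { name: 'pipeline_processing', feature_category: 'continuous_integration', urgency: 'high' },
  { name: 'projects_cleanup',    feature_category: 'groups_and_projects',    urgency: 'low' },
  { name: 'geo_sync',            feature_category: 'geo_replication',        urgency: 'low' }
].freeze

# Returns true when every `key=value` term in the selector matches the metadata.
# A bare '*' matches every worker. (Assumed syntax, not the real parser.)
def matches_selector?(selector, metadata)
  return true if selector == '*'

  selector.split('&').all? do |term|
    key, values = term.split('=', 2)
    values.to_s.split(',').include?(metadata[key.to_sym].to_s)
  end
end

# Names of the workers that a process started with this selector would pick up.
def select_workers(selector)
  WORKERS.select { |worker| matches_selector?(selector, worker) }
         .map { |worker| worker[:name] }
end
```

Calling select_workers('urgency=low') on this data returns the two low-urgency workers, while select_workers('*') returns all three.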

GitLab.com runs ~70 different cluster shards, each with a different selector, sized for its job mix. Self-managed installations typically run a single * cluster.

Worker DSL

module Projects
  class CleanupWorker
    include ApplicationWorker

    feature_category :groups_and_projects
    urgency :low
    idempotent!
    deduplicate :until_executed
    data_consistency :delayed
    concurrency_limit -> { 5 }

    def perform(project_id)
      project = Project.find(project_id)
      Projects::CleanupService.new(project).execute
    end
  end
end

Annotation                     Effect
feature_category               Required; routes to the right cluster, tags logs, drives SLIs
urgency                        :high (10s queueing and execution targets) / :low (default; 1min queueing, 5min execution) / :throttled (no queueing target, 5min execution)
idempotent!                    Required for new workers; declares that retries are safe
deduplicate                    :until_executed, :until_executing, :none; drops duplicate scheduled jobs
data_consistency               :always (primary), :sticky, :delayed (replica)
concurrency_limit              Adaptive throttle on the worker
worker_resource_boundary :cpu  Hints scheduling onto bigger nodes
loggable_arguments             Allowlist of args safe to log (PII filter)
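
To show how annotations such as feature_category and urgency can work as class-level macros, here is a minimal sketch. WorkerAnnotations, worker_attributes, and ExampleCleanupWorker are hypothetical names invented for this example. The real ApplicationWorker concern does considerably more, and its internals are not reproduced here.

```ruby
# Hypothetical stand-in for the annotation part of a worker base concern.
module WorkerAnnotations
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Per-class storage for annotation values, with assumed defaults.
    def worker_attributes
      @worker_attributes ||= { urgency: :low, idempotent: false }
    end

    def feature_category(value)
      worker_attributes[:feature_category] = value
    end

    def urgency(value)
      worker_attributes[:urgency] = value
    end

    def idempotent!
      worker_attributes[:idempotent] = true
    end
  end
end

class ExampleCleanupWorker
  include WorkerAnnotations

  feature_category :groups_and_projects
  urgency :low
  idempotent!
end
```

Because each macro only records a value on the class, routing and middleware code can read ExampleCleanupWorker.worker_attributes without instantiating the worker.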

Middleware stack

GitLab assembles a custom server-side Sidekiq middleware stack in lib/gitlab/sidekiq_middleware/:

  • RequestStoreMiddleware — per-job request-store cleanup.
  • ApplicationContextMiddleware — propagates Gitlab::ApplicationContext.
  • Metrics — Prometheus instrumentation.
  • MonitoringMiddleware, MemoryKiller, IsolatedFromRequest — guardrails.
  • WorkerContextMiddleware — domain-specific tags.
  • DuplicateJobs — Redis-based deduplication.
  • Throttling::Server — adaptive concurrency limits.
  • Labkit::Sidekiq::ServerMiddleware — observability.
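
The way the entries in such a stack wrap one another can be sketched without loading Sidekiq itself. The example follows the call(worker, job, queue) plus yield shape that Sidekiq server middleware uses. TracingMiddleware and run_chain are hypothetical and only imitate the chain; they are not GitLab's or Sidekiq's implementation.

```ruby
# Hypothetical middleware that records when it runs around a job.
class TracingMiddleware
  def initialize(name, log)
    @name = name
    @log = log
  end

  # Same shape as a Sidekiq server middleware: do work, yield, do more work.
  def call(_worker, job, queue)
    @log << "#{@name}: before #{job['class']} (#{queue})"
    yield
  ensure
    @log << "#{@name}: after #{job['class']}"
  end
end

# Minimal chain runner: the first middleware is outermost, and the block
# passed to run_chain stands in for the job's perform call.
def run_chain(middlewares, worker, job, queue, &perform)
  chain = middlewares.reverse.reduce(perform) do |inner, middleware|
    -> { middleware.call(worker, job, queue, &inner) }
  end
  chain.call
end
```

Running two TracingMiddleware instances through run_chain shows the nesting: the first entry logs before and after everything else, which is why ordering in the stack matters for context propagation and cleanup.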

Cron jobs

Recurring scheduled jobs are declared via sidekiq-cron in config/initializers/1_settings.rb and config/sidekiq_cron_jobs.yml (loaded by Gitlab::SidekiqConfig::CronJobs).

Examples:

  • Cron::Cleanup::*Worker — cleanup of dormant projects, expired tokens, etc.
  • Cron::PartitionManagementWorker — Postgres time-based partitioning.
  • Cron::PipelinesArchiveWorker — CI archive sweep.

Concurrency limit service

app/workers/concurrency_limit/ implements a Redis-coordinated throttle:

  • A worker's concurrency_limit lambda computes the cap.
  • Jobs above the cap are deferred to a Redis sorted set.
  • A reschedule worker periodically re-enqueues them once capacity is available.
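
The three steps above can be sketched with an in-memory stand-in for Redis. ConcurrencyThrottle and its methods are hypothetical names, and plain arrays replace the Redis structure, so this shows only the control flow, not the code in app/workers/concurrency_limit/.

```ruby
# In-memory sketch of a concurrency throttle; all names are hypothetical.
class ConcurrencyThrottle
  attr_reader :running, :deferred

  def initialize(limit)
    @limit = limit   # callable returning the current cap, like the worker's lambda
    @running = []    # jobs currently executing
    @deferred = []   # stands in for the Redis structure holding deferred jobs
  end

  # Starts the job if there is capacity; otherwise defers it.
  def submit(job_id)
    if @running.size < @limit.call
      @running << job_id
      :started
    else
      @deferred << job_id
      :deferred
    end
  end

  def finish(job_id)
    @running.delete(job_id)
  end

  # What a periodic reschedule step might do: start deferred jobs,
  # oldest first, while capacity remains.
  def resume_deferred
    resumed = []
    while @deferred.any? && @running.size < @limit.call
      job_id = @deferred.shift
      @running << job_id
      resumed << job_id
    end
    resumed
  end
end
```

Keeping the cap behind a callable means it is re-evaluated on every check, so a limit that changes at runtime takes effect without restarting anything.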

Memory and resource control

lib/gitlab/sidekiq_middleware/memory_killer.rb and Gitlab::Memory::Watchdog (lib/gitlab/memory/watchdog/) terminate processes that exceed RSS limits, and the supervisor restarts them.
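
A single check cycle of an RSS guard might look like the sketch below. RssWatchdog is a hypothetical class; the RSS reader and the termination handler are injected, so the example makes no claim about how Gitlab::Memory::Watchdog measures memory or stops a process.

```ruby
# Hypothetical watchdog: both the memory reading and the reaction are injected.
class RssWatchdog
  def initialize(limit_bytes:, read_rss:, on_exceeded:)
    @limit_bytes = limit_bytes
    @read_rss = read_rss       # callable returning the current RSS in bytes
    @on_exceeded = on_exceeded # callable invoked with the offending RSS value
  end

  # One check cycle: returns true and fires the handler when RSS is above the limit.
  def check
    rss = @read_rss.call
    return false if rss <= @limit_bytes

    @on_exceeded.call(rss)
    true
  end
end
```

In a real deployment the handler would request a graceful shutdown and rely on the supervisor to start a fresh process, as described above.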

Routing examples

Scenario                     Selector
All low-urgency workers      urgency=low
All CI workers               feature_category=continuous_integration
Workers safe for replicas    data_consistency=delayed
Geo replication workers      feature_category=geo_replication

The full router logic is in Gitlab::SidekiqConfig::WorkerRouter.

Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.