gitlab-org/gitlab
Sidekiq jobs
Patterns and conventions for the ~170 worker namespaces (app/workers/, ee/app/workers/) and the rules that keep the Sidekiq cluster healthy.
ApplicationWorker concern
Every worker includes ApplicationWorker (app/workers/concerns/application_worker.rb). It bundles:
- Gitlab::Loggable for structured logging.
- Gitlab::SidekiqMiddleware::*Concerns for routing, observability, deduplication.
- WorkerAttributes: the metadata DSL.
- WorkerContext: context propagation.
- Default settings (logging, retry behavior).
Required metadata
    class FooWorker
      include ApplicationWorker

      feature_category :continuous_integration
      urgency :low
      idempotent!
      deduplicate :until_executed
      data_consistency :delayed
      worker_resource_boundary :cpu
      loggable_arguments 0, 1

      def perform(project_id, ref)
        # ...
      end
    end

| Metadata | Required | What it does |
|---|---|---|
| feature_category | Yes | Routes the worker to the right cluster shard, tags logs and metrics |
| urgency | Yes | :high (10s queueing target), :low (default, 1min), :throttled (no queueing target) |
| idempotent! | New workers | Declares the job is safe to retry |
| deduplicate | Recommended | :until_executed, :until_executing, :none |
| data_consistency | If reading from DB | :always (primary), :sticky, :delayed (replica) |
| worker_resource_boundary | Optional | Hint: :cpu, :memory |
| loggable_arguments | Optional | Allowlist of argument positions that are safe to log |
| concurrency_limit | Hot workers | Cap on concurrently running jobs; excess jobs are deferred |
The cops in rubocop/cop/sidekiq/ and rubocop/cop/scalability/ enforce these.
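To make the idea of a metadata DSL concrete, here is a minimal plain-Ruby sketch of how class-level declarations like the ones above can be stored and checked. It is an illustration only: SketchWorkerAttributes, worker_metadata, and missing_metadata are hypothetical names, and the real WorkerAttributes concern is assumed to be considerably richer.

```ruby
# Hypothetical stand-in for a metadata DSL; not GitLab's WorkerAttributes.
module SketchWorkerAttributes
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Each DSL call writes into a per-class hash with a few defaults.
    def worker_metadata
      @worker_metadata ||= { urgency: :low, idempotent: false }
    end

    def feature_category(value)
      worker_metadata[:feature_category] = value
    end

    def urgency(value)
      worker_metadata[:urgency] = value
    end

    def idempotent!
      worker_metadata[:idempotent] = true
    end

    # Required attributes that were never declared (what a lint check might report).
    def missing_metadata
      [:feature_category].reject { |key| worker_metadata.key?(key) }
    end
  end
end

class SketchDeclaredWorker
  include SketchWorkerAttributes

  feature_category :continuous_integration
  urgency :high
  idempotent!
end

class SketchBareWorker
  include SketchWorkerAttributes
end

p SketchDeclaredWorker.worker_metadata
p SketchBareWorker.missing_metadata
```

Storing the values on the class rather than on instances is what lets middleware and static checks read them without running the job.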
Idempotency
Idempotent workers can be retried without side effects. Convention: an idempotent worker checks state at the top of perform:
    def perform(project_id)
      project = Project.find_by(id: project_id)
      return if project.nil?
      return if project.cleanup_in_progress? # already enqueued or running

      Projects::CleanupService.new(project).execute
    end

The cop Scalability/IdempotentWorker flags workers that are missing idempotent!.
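The effect of the state check can be seen in a small self-contained sketch that replaces the database with an in-memory hash. SketchProject, SKETCH_PROJECTS, and SketchCleanupWorker are hypothetical names used only for illustration; the point is that a simulated retry returns early instead of repeating the side effect.

```ruby
# Hypothetical in-memory stand-ins; no ActiveRecord involved.
SketchProject = Struct.new(:id, :cleaned_up, :cleanup_runs)

SKETCH_PROJECTS = { 1 => SketchProject.new(1, false, 0) }

class SketchCleanupWorker
  def perform(project_id)
    project = SKETCH_PROJECTS[project_id]
    return :missing if project.nil?             # record deleted since enqueue
    return :already_done if project.cleaned_up  # an earlier run finished

    project.cleanup_runs += 1                   # the side effect we must not repeat
    project.cleaned_up = true
    :cleaned
  end
end

worker = SketchCleanupWorker.new
first_run = worker.perform(1)
retry_run = worker.perform(1) # simulated Sidekiq retry

p [first_run, retry_run, SKETCH_PROJECTS[1].cleanup_runs]
```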
Deduplication
Sidekiq middleware (lib/gitlab/sidekiq_middleware/duplicate_jobs/) uses a Redis cookie keyed by (class, args) to drop duplicates:
- :until_executing drops a job while a duplicate is queued.
- :until_executed drops a job while a duplicate is queued or running.
This prevents "thundering herd" scenarios when the same project triggers many MR-update jobs in a row.
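The difference between the two strategies comes down to when the lock is released. The following sketch models that with an in-memory Set instead of Redis; SketchDeduplicator and its methods are hypothetical, and the key derivation (a SHA-256 over the class name and arguments) is an assumption rather than the middleware's actual format.

```ruby
require 'digest'
require 'json'
require 'set'

# Hypothetical in-memory stand-in for the Redis-backed duplicate check.
class SketchDeduplicator
  attr_reader :queue

  def initialize
    @locks = Set.new
    @queue = []
  end

  # Key derived from (class, args); the exact format here is an assumption.
  def key_for(worker_class, args)
    Digest::SHA256.hexdigest(JSON.generate([worker_class, args]))
  end

  # Returns false and drops the job when a lock for the same key exists.
  def enqueue(worker_class, args, strategy:)
    key = key_for(worker_class, args)
    return false if @locks.include?(key)

    @locks << key
    @queue << { worker_class: worker_class, args: args, key: key, strategy: strategy }
    true
  end

  # Runs the oldest job. The lock is released before the block for
  # :until_executing and after it for :until_executed.
  def run_next
    job = @queue.shift
    return if job.nil?

    @locks.delete(job[:key]) if job[:strategy] == :until_executing
    yield job
  ensure
    @locks.delete(job[:key]) if job && job[:strategy] == :until_executed
  end
end

dedup = SketchDeduplicator.new
p dedup.enqueue('SketchUpdateWorker', [42], strategy: :until_executed) # accepted
p dedup.enqueue('SketchUpdateWorker', [42], strategy: :until_executed) # dropped
```

With :until_executing a job enqueued while its twin is already running is accepted, which matters when the running job may have read state before the change that triggered the second enqueue.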
Concurrency limits
app/workers/concurrency_limit/ provides Redis-coordinated throttles. A worker declares a lambda that returns the cap:
    concurrency_limit -> { Gitlab::CurrentSettings.foo_concurrency_limit }

When over the cap, jobs are deferred to a sorted set; a recovery worker reschedules them.
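The defer-and-resume flow can be sketched without Redis as follows. SketchConcurrencyLimiter is a hypothetical class, and a plain array stands in for the deferred-job store; the only detail carried over from the text is that the cap is a lambda, so it is re-evaluated on every check.

```ruby
# Hypothetical in-memory stand-in for the Redis-coordinated throttle.
class SketchConcurrencyLimiter
  attr_reader :running, :deferred

  # limit is a lambda so the cap can change at runtime.
  def initialize(limit)
    @limit = limit
    @running = 0
    @deferred = []
  end

  # Starts the job if under the cap, otherwise parks its arguments.
  def submit(args)
    if @running < @limit.call
      @running += 1
      :started
    else
      @deferred << args
      :deferred
    end
  end

  # Called when a running job completes.
  def finish
    @running -= 1 if @running.positive?
  end

  # What a recovery worker might do: resubmit parked jobs while capacity remains.
  def resume
    resumed = 0
    while @deferred.any? && @running < @limit.call
      submit(@deferred.shift)
      resumed += 1
    end
    resumed
  end
end

limiter = SketchConcurrencyLimiter.new(-> { 2 })
p [[1], [2], [3]].map { |args| limiter.submit(args) } # => [:started, :started, :deferred]
limiter.finish
p limiter.resume # => 1
```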
Cron jobs
Long-running scheduled jobs are declared in config/initializers/1_settings.rb and config/sidekiq_cron_jobs.yml. Pattern:
    class Cron::SomeCleanupWorker
      include ApplicationWorker
      include CronjobQueue

      feature_category :continuous_integration

      def perform
        # cleanup
      end
    end

CronjobQueue routes the job to the dedicated cron queue.
Domain events via EventStore
The preferred way to fan out work to multiple subscribers is the EventStore:
    # Publisher
    Gitlab::EventStore.publish(
      Projects::ProjectCreatedEvent.new(data: { project_id: project.id })
    )

    # Subscriber
    class FooWorker
      include ApplicationWorker
      include Gitlab::EventStore::Subscriber

      feature_category :foo

      def handle_event(event)
        # event.data is validated against the event's JSON schema
      end
    end

EventStore subscribers are wired in config/initializers/event_store.rb.
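For readers new to the pattern, the publish/subscribe flow can be reduced to a few lines of plain Ruby. Everything below is hypothetical (SketchEventStore, SketchEvent, the REQUIRED_KEYS check standing in for JSON schema validation), and subscribers are called inline here rather than being enqueued as jobs.

```ruby
# Hypothetical in-memory pub/sub; not Gitlab::EventStore itself.
class SketchEvent
  REQUIRED_KEYS = [].freeze

  attr_reader :data

  # A required-keys check stands in for schema validation.
  def initialize(data:)
    missing = self.class::REQUIRED_KEYS - data.keys
    raise ArgumentError, "missing keys: #{missing.join(', ')}" unless missing.empty?

    @data = data
  end
end

class SketchProjectCreatedEvent < SketchEvent
  REQUIRED_KEYS = [:project_id].freeze
end

class SketchEventStore
  def initialize
    @subscriptions = Hash.new { |hash, key| hash[key] = [] }
  end

  def subscribe(subscriber, to:)
    @subscriptions[to] << subscriber
  end

  # Fans the event out to every subscriber registered for its class.
  def publish(event)
    @subscriptions[event.class].each { |subscriber| subscriber.handle_event(event) }
  end
end

class SketchAuditSubscriber
  attr_reader :seen

  def initialize
    @seen = []
  end

  def handle_event(event)
    @seen << event.data[:project_id]
  end
end

store = SketchEventStore.new
audit = SketchAuditSubscriber.new
store.subscribe(audit, to: SketchProjectCreatedEvent)
store.publish(SketchProjectCreatedEvent.new(data: { project_id: 7 }))
p audit.seen # => [7]
```

The publisher never names its subscribers, which is the property that makes this preferable to one worker enqueuing several others directly.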
Testing workers
    RSpec.describe FooWorker, feature_category: :foo do
      let(:project) { create(:project) }

      it 'is idempotent' do
        expect { described_class.new.perform(project.id) }.not_to raise_error
        expect { described_class.new.perform(project.id) }.not_to change { project.reload.foo_count }
      end
    end

For idempotency assertions, use the 'an idempotent worker' shared example:

    include_examples 'an idempotent worker' do
      let(:job_args) { project.id }
    end

Routing in production
GitLab.com runs ~70 cluster shards. The router (lib/gitlab/sidekiq_config/worker_router.rb) reads the worker's metadata and routes it to a matching shard. Self-managed installations usually run a single * cluster.
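Metadata-based routing can be pictured as an ordered list of rules where the first match wins. The sketch below uses a hypothetical rule format (a predicate lambda paired with a shard name) and hypothetical shard names; it does not reproduce the router's real rule syntax.

```ruby
# Hypothetical rule format: [predicate over metadata, shard name].
# The first matching rule wins; the last rule is a catch-all.
SKETCH_ROUTING_RULES = [
  [->(meta) { meta[:urgency] == :high }, 'urgent'],
  [->(meta) { meta[:resource_boundary] == :memory }, 'memory-bound'],
  [->(meta) { meta[:feature_category] == :continuous_integration }, 'ci'],
  [->(_meta) { true }, 'catchall']
].freeze

# Returns the shard name for a worker's metadata, or nil if no rule matches.
def sketch_route(metadata, rules = SKETCH_ROUTING_RULES)
  _predicate, shard = rules.find { |predicate, _shard| predicate.call(metadata) }
  shard
end

p sketch_route({ urgency: :high, feature_category: :continuous_integration }) # => "urgent"
p sketch_route({ urgency: :low, feature_category: :continuous_integration })  # => "ci"
p sketch_route({ urgency: :low })                                             # => "catchall"
```

A single-rule list containing only the catch-all corresponds to the single * cluster mentioned above.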
Where to make changes
- New worker: app/workers/<area>/<verb>_worker.rb with all metadata declared.
- Add to config/sidekiq_queues.yml if you're using a new queue namespace.
- Run bin/rake gitlab:sidekiq:all_queues_yml:generate to refresh app/workers/all_queues.yml.
- For very long jobs, split work via BatchedBackgroundMigration (see Database) or via paginated re-enqueuing.
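Paginated re-enqueuing means each job handles one bounded batch and then schedules a follow-up job with a cursor, instead of looping over the whole data set in a single run. A minimal sketch, assuming an in-memory array in place of Sidekiq and hypothetical names throughout (SketchBatchWorker, SKETCH_QUEUE, sketch_drain):

```ruby
# Hypothetical in-memory queue and result log standing in for Sidekiq and the database.
SKETCH_QUEUE = []
SKETCH_PROCESSED = []

class SketchBatchWorker
  BATCH_SIZE = 3

  # Processes one batch of ids greater than `cursor`, then enqueues a
  # follow-up job that starts after the last id handled.
  def perform(ids, cursor = 0)
    batch = ids.select { |id| id > cursor }.sort.first(BATCH_SIZE)
    return if batch.empty?

    SKETCH_PROCESSED.concat(batch)
    SKETCH_QUEUE << [ids, batch.last]
  end
end

# Drain loop standing in for Sidekiq picking up each follow-up job.
def sketch_drain
  runs = 0
  until SKETCH_QUEUE.empty?
    ids, cursor = SKETCH_QUEUE.shift
    SketchBatchWorker.new.perform(ids, cursor)
    runs += 1
  end
  runs
end

SKETCH_QUEUE << [(1..7).to_a, 0]
p sketch_drain      # => 4 (three batches plus one empty run that ends the chain)
p SKETCH_PROCESSED  # => [1, 2, 3, 4, 5, 6, 7]
```

Because each run is short, a retry or a deploy interrupts at most one batch, and the cursor makes the resumed job pick up where the previous one stopped.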
Related
- Sidekiq cluster: runtime topology.
- EventStore: pub/sub.
- Database: data_consistency semantics.
Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.