duckdb/duckdb
Execution
Active contributors: Pedro Holanda, Mark, Laurens Kuiper
Purpose
src/execution/ lowers an optimized logical plan to a PhysicalOperator tree and provides the operator implementations (scans, filters, joins, aggregates, set ops, sorts, window functions). It also implements the vectorized expression executor that evaluates bound expressions over DataChunks. Pipelining and threading live next door in parallel.
Directory layout
src/execution/
├── physical_plan_generator.cpp LogicalOperator -> PhysicalOperator
├── physical_operator.cpp Base class for physical operators
├── physical_plan/ Logical -> physical translation per operator type
├── operator/ Physical operator implementations
│ ├── scan/ Table, parquet, csv, arrow, columnar, range scans
│ ├── join/ Hash, piecewise merge, IE, nested-loop joins
│ ├── aggregate/ Hash, partitioned, perfect, distinct
│ ├── filter/ Pushed-down filters
│ ├── projection/ Projections, table-in-out, unnest
│ ├── order/ Order, top-n, sort by group
│ ├── persistent/ COPY/Insert/Update/Delete + Append-state
│ ├── csv_scanner/ Parallel CSV scanner machinery
│ ├── set/ UNION / INTERSECT / EXCEPT
│ ├── helper/ Limit, sample, expression scan
│ └── schema/ DDL: create/drop/alter
├── expression_executor.cpp Vectorized evaluation of bound expressions
├── expression_executor/ Per-expression-class evaluation paths
├── expression_executor_state.cpp Per-thread, per-expression scratch state
├── column_binding_resolver.cpp ColumnBinding -> physical column index
├── join_hashtable.cpp Hash-join hash table
├── aggregate_hashtable.cpp Hash-aggregate hash table
├── perfect_aggregate_hashtable.cpp No-collision aggregate hash table
├── radix_partitioned_hashtable.cpp Partitioned aggregate hash table
├── adaptive_filter.cpp Adaptive filter ordering at runtime
├── nested_loop_join/ Nested-loop join helpers
├── index/ ART index runtime
└── sample/ Reservoir / Bernoulli samplingKey abstractions
| Type | File | Role |
|---|---|---|
PhysicalPlanGenerator |
src/execution/physical_plan_generator.cpp |
Drives the logical-to-physical translation; visits each LogicalOperator and dispatches to a Plan* method. |
PhysicalOperator |
src/include/duckdb/execution/physical_operator.hpp |
Base of the physical plan. Subclasses include PhysicalTableScan, PhysicalHashJoin, PhysicalHashAggregate, PhysicalProjection, PhysicalOrder, PhysicalTopN, PhysicalLimit, PhysicalCopyToFile, etc. |
OperatorState |
src/include/duckdb/execution/operator/... |
Per-execution thread state for an operator. Sources have LocalSourceState/GlobalSourceState; sinks have LocalSinkState/GlobalSinkState. |
ExpressionExecutor |
src/execution/expression_executor.cpp |
Evaluates a vector of bound expressions over a DataChunk. |
JoinHashTable |
src/execution/join_hashtable.cpp |
Build-side hash table for PhysicalHashJoin. |
RadixPartitionedHashTable |
src/execution/radix_partitioned_hashtable.cpp |
Aggregate hash table that radix-partitions data for parallelism. |
ColumnBindingResolver |
src/execution/column_binding_resolver.cpp |
Converts ColumnBinding(table_index, column_index) references into physical column indices in BoundReferenceExpression. |
How it works
graph LR
Logical[Optimized LogicalOperator] -->|PhysicalPlanGenerator::Plan*| Physical[PhysicalOperator tree]
Physical -->|MetaPipeline::Build| Pipelines[Pipeline graph]
Pipelines --> Scheduler[TaskScheduler runs pipeline tasks]
Scheduler --> Chunks[DataChunks flow operator-to-operator]Operator interface
Each physical operator can play three roles:
- Source. Produces chunks via
GetData(execution_context, chunk, source_state). Examples:PhysicalTableScan,PhysicalChunkScan. - Operator (intermediate). Transforms chunks via
Execute(context, input, chunk, gstate, lstate). Examples:PhysicalProjection,PhysicalFilter. - Sink. Consumes chunks via
Sink(context, chunk, gstate, lstate), then exposes results once finished. Examples: hash-join build, hash-aggregate, sort.
Operators advertise their roles via IsSource, IsSink, and ParallelSource/ParallelSink flags. The pipelining layer in parallel wires sources, intermediates, and sinks into pipelines based on these flags.
Expression evaluation
ExpressionExecutor takes a vector of bound expressions (one per output column) and evaluates them over an input DataChunk. It maintains per-expression scratch state and uses Vector helpers to fast-path constant and dictionary inputs.
Concrete dispatch lives in src/execution/expression_executor/:
execute_function.cpp— scalar function calls.execute_case.cpp—CASE WHENexpressions.execute_conjunction.cpp—AND/OR.execute_comparison.cpp—=,<, etc.execute_constant.cpp,execute_reference.cpp,execute_parameter.cpp,execute_cast.cpp,execute_between.cpp.
Hash join
PhysicalHashJoin is the workhorse for equi-joins. The build side fills a JoinHashTable (one per executing thread, then merged); the probe side performs a vectorized scan of the table for each probe chunk. Variants in src/execution/operator/join/:
physical_hash_join.cpp— main implementation.physical_perfect_hash_join.cpp— used when the build keys are dense small integers.physical_iejoin.cpp— inequality joins.physical_piecewise_merge_join.cpp— sort-merge for non-equi joins on sorted inputs.physical_nested_loop_join.cpp— fallback for unhandled cases.
Hash aggregate
PhysicalHashAggregate partitions inputs by hash and builds per-partition aggregate hash tables. The RadixPartitionedHashTable (radix_partitioned_hashtable.cpp) is the production path; PerfectAggregateHashTable is used when the GROUP BY keys are small bounded integers.
Adaptive filter
AdaptiveFilter (adaptive_filter.cpp) runs a small number of filter conjuncts in different orders during execution and settles on the cheapest order observed — a runtime alternative to compile-time selectivity heuristics.
Integration points
- Input: Optimized
LogicalOperatorfrom optimizer. - Output: A
PhysicalOperatortree that parallel wraps in pipelines. - Catalog: Scans and DDL operators look up
TableCatalogEntryand friends from catalog. - Storage: Scans, inserts, updates, deletes go through
DataTable(src/storage/data_table.cpp) andLocalStorage(src/storage/local_storage.cpp). - Functions: Operators call into function for scalar/aggregate/table function bodies.
Entry points for modification
- Adding a new physical operator: subclass
PhysicalOperator, provide source/operator/sink methods as needed, and add aPlan*method toPhysicalPlanGeneratorto construct it from a logical operator. - Adding a new join algorithm: add to
src/execution/operator/join/, register in the join planner code path, and add tests intest/sql/join/. - Adjusting hashtable sizing/partitioning: see
radix_partitioned_hashtable.cppandaggregate_hashtable.cpp. Changes there interact withpartitioned_execution.cppin the optimizer. - Tweaking expression evaluation: per-expression dispatch lives in
expression_executor/.
Key source files
| File | Purpose |
|---|---|
src/execution/physical_plan_generator.cpp |
Logical-to-physical lowering. |
src/execution/physical_operator.cpp |
Base class semantics. |
src/execution/expression_executor.cpp |
Evaluates expressions over chunks. |
src/execution/join_hashtable.cpp |
Hash join's hash table. |
src/execution/radix_partitioned_hashtable.cpp |
Partitioned hash aggregate. |
src/execution/operator/scan/physical_table_scan.cpp |
Table scan. |
src/execution/operator/aggregate/physical_hash_aggregate.cpp |
Hash aggregate. |
src/execution/operator/join/physical_hash_join.cpp |
Hash join. |
src/execution/operator/persistent/physical_insert.cpp |
INSERT. |
src/execution/index/art/art.cpp |
Adaptive Radix Tree index runtime. |
Continue to parallel for how the operator tree becomes a pipeline graph and gets executed.
Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.