duckdb/duckdb

Tooling

The build, format, and codegen scripts that make up the DuckDB development environment.

Build system

DuckDB ships both a Makefile (the recommended entry point) and a CMakeLists.txt (the underlying CMake build definition). The Makefile is a thin wrapper that picks sensible defaults and invokes CMake.

| Target | Effect |
| --- | --- |
| make | Optimized release build. |
| make debug | Unoptimized + asserts + sanitizers. |
| make reldebug | -O2 + asserts + debug symbols. |
| make relassert (with FORCE_DEBUG=1) | -O2 + sanitizers + asserts. |
| make unit | Run fast tests. |
| make allunit | Run full test suite, including .test_slow. |
| make format-fix | Auto-format C++ (clang-format) and Python (black). |
| make tidy-check | Run clang-tidy over the diff. |
| make generate-files | Regenerate generated files. |
| make clean | Clean build artefacts. |

Useful environment knobs:

| Variable | Purpose |
| --- | --- |
| GEN=ninja | Use Ninja for parallel builds. |
| CMAKE_BUILD_PARALLEL_LEVEL=N | Limit parallel build jobs. |
| BUILD_ALL_EXT=1 | Build every in-tree extension. |
| DUCKDB_EXTENSIONS=... | Build only the listed extensions. |
| BUILD_BENCHMARK=1 | Build the benchmark runner. |
| BUILD_TPCH=1, BUILD_TPCDS=1 | Include the corresponding data generators. |
| DISABLE_SANITIZER=1 | Build debug without sanitizers. |
| BUILD_SHELL=1 (default in most builds) | Build the CLI. |

The full set of variables is documented in Makefile.
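
The targets and variables above combine on one command line, for example `GEN=ninja BUILD_BENCHMARK=1 make reldebug`. As a minimal sketch, a small Python helper could assemble such an invocation. The helper names `make_invocation` and `describe` are illustrative assumptions and are not part of the DuckDB repository; the sketch only builds the command and does not run it.

```python
import os
import shlex


def make_invocation(target="", **knobs):
    """Return (argv, env) for a Makefile build (hypothetical helper).

    `target` is a Makefile target such as "reldebug" ("" means plain `make`).
    `knobs` are environment variables such as GEN="ninja" or BUILD_BENCHMARK=1.
    """
    argv = ["make"] + ([target] if target else [])
    env = dict(os.environ)
    # Environment values must be strings, so convert numbers like 1 to "1".
    env.update({name: str(value) for name, value in knobs.items()})
    return argv, env


def describe(target="", **knobs):
    """Render the equivalent shell command line for logging or copy-paste."""
    prefix = " ".join(f"{k}={shlex.quote(str(v))}" for k, v in knobs.items())
    argv, _ = make_invocation(target, **knobs)
    return f"{prefix} {' '.join(argv)}".strip()


print(describe("reldebug", GEN="ninja", BUILD_BENCHMARK=1))
# prints: GEN=ninja BUILD_BENCHMARK=1 make reldebug
```

To actually build, the returned pair could be passed to `subprocess.run(argv, env=env, check=True)` from the repository root.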

Formatting

make format-fix runs clang-format (pinned to 11.0.1) and black (Python) over the diff. The driver is scripts/format.py.

Install the right clang-format:

pipx install clang-format==11.0.1
# or
python3 -m pip install --user clang-format==11.0.1

CI runs scripts/format.py --check and rejects anything that does not match.
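
Because CI rejects output from a mismatched formatter, it can help to confirm the local clang-format version before running make format-fix. The sketch below parses the text printed by `clang-format --version`; the function names are hypothetical, and the assumed output shape ("clang-format version X.Y.Z", possibly with a vendor prefix) may differ between distributions.

```python
import re

PINNED = "11.0.1"  # the version named in the text above


def parse_clang_format_version(version_output):
    """Extract 'X.Y.Z' from `clang-format --version` output, or None."""
    match = re.search(r"clang-format version (\d+\.\d+\.\d+)", version_output)
    return match.group(1) if match else None


def matches_pin(version_output, pinned=PINNED):
    """True when the reported version equals the pinned one."""
    return parse_clang_format_version(version_output) == pinned


print(matches_pin("clang-format version 11.0.1"))
print(matches_pin("Ubuntu clang-format version 14.0.0-1ubuntu1"))
```

In practice the input would come from something like `subprocess.run(["clang-format", "--version"], capture_output=True, text=True).stdout`.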

Linting

  • clang-tidy: configured by .clang-tidy. Run the diff-only linter via scripts/clang-tidy-diff.py or scripts/run-clang-tidy.py.
  • clangd: the project ships .clangd for IDE integration; the LSP picks it up automatically.
  • Banned-symbols check: scripts/banned_symbols_check.py ensures forbidden symbols (malloc, new, const_cast, …) do not creep into the codebase.
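
To illustrate the idea behind a banned-symbols check, the sketch below scans source text for forbidden identifiers. It is not the logic of scripts/banned_symbols_check.py, whose actual rules are not shown here; the symbol list contains only the three names mentioned above, and `find_banned` is a hypothetical name. A naive word-boundary scan like this also flags matches inside comments and string literals.

```python
import re

# Only the symbols named in the text; the real list is assumed to be longer.
BANNED = ("malloc", "new", "const_cast")
PATTERN = re.compile(r"\b(" + "|".join(BANNED) + r")\b")


def find_banned(source):
    """Return (line_number, symbol) pairs for each banned symbol found."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


sample = (
    "auto p = (int *)malloc(4);\n"
    "auto q = std::make_unique<int>(1);\n"
    "auto r = new int(2);\n"
)
print(find_banned(sample))
# prints: [(1, 'malloc'), (3, 'new')]
```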

Code generation

DuckDB checks generated artefacts into the repo. To regenerate them after editing inputs:

make generate-files

The generators in scripts/:

| Script | Output |
| --- | --- |
| generate_c_api.py | Public C API headers (src/include/duckdb.h, src/main/capi/). Sources: JSON specs in src/include/duckdb/main/capi/. |
| generate_serialization.py | Serialize/Deserialize dispatch in src/storage/serialization/. Source: JSON in src/include/duckdb/storage/serialization/. |
| generate_enum_util.py | src/common/enum_util.cpp (huge enum-to-string table). |
| generate_enums.py | Enum definitions in headers. |
| generate_functions.py | Function registration scaffolding. |
| generate_csv_header.py | Generated detector header used by the CSV scanner. |
| generate_extensions_function.py | Per-extension function registration glue. |
| generate_metric_enums.py | Profiling metric enums. |
| generate_settings.py | Settings registration from src/common/settings.json. |
| generate_storage_info.py | Storage version metadata. |
| generate_storage_version.py | Storage version stamp. |
| generate_plan_storage_version.py | Plan-format version stamp. |

The PEG grammar generator is scripts/build_grammar.sh (the parser README at src/parser/peg/README.md documents how to add new grammar rules).
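
Because generated artefacts are checked in, one way to confirm they are current is to run make generate-files and see whether Git reports any changes afterwards. The sketch below parses the output of `git status --porcelain` into a list of changed paths. The function name `changed_paths` is hypothetical, and the second sample path is made up for illustration.

```python
def changed_paths(porcelain_output):
    """Parse `git status --porcelain` (v1) output into a list of paths."""
    paths = []
    for line in porcelain_output.splitlines():
        if len(line) < 4:
            continue
        # Each line is: two status characters, one space, then the path.
        path = line[3:]
        # Renames are reported as "old -> new"; keep the new path.
        if " -> " in path:
            path = path.split(" -> ", 1)[1]
        paths.append(path)
    return paths


# Sample output: one modified generated file, one untracked (made-up) file.
sample = " M src/common/enum_util.cpp\n?? src/example_generated.hpp\n"
print(changed_paths(sample))
# prints: ['src/common/enum_util.cpp', 'src/example_generated.hpp']
```

An empty list after regenerating suggests the checked-in files already match their inputs; a non-empty list names the files that should be committed alongside the input change.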

Useful scripts

A few scripts the maintainers use frequently:

| Script | Purpose |
| --- | --- |
| scripts/amalgamation.py | Build the single-file duckdb.cpp/duckdb.hpp amalgamation. |
| scripts/check_coverage.py, scripts/coverage_check.sh | Local coverage report. |
| scripts/run_tests_one_by_one.py | Run tests sequentially to isolate ordering bugs. |
| scripts/sync_out_of_tree_extensions.py | Refresh out-of-tree extension submodules and patches. |
| scripts/test_storage_compatibility.py | Verify a build can read pinned older storage versions. |
| scripts/test_serialization_bwc.py | Verify serialized plans deserialize across versions. |
| scripts/test_compile.py | Compile sample programs against the C API. |
| scripts/regression_check.py | Compare benchmark results between builds. |
| scripts/plan_cost_runner.py | Compute plan costs over a corpus, used by optimizer changes. |

CI

.github/workflows/ contains 30+ workflows. The most important:

| Workflow | Purpose |
| --- | --- |
| Main.yml | Linux build matrix, runs the full test suite on every PR. |
| OSX.yml, Windows.yml | Per-platform builds. |
| ExtendedTests.yml | Extended test surfaces (Python regressions, R, Wasm). |
| Extensions.yml | Build/test in-tree and selected out-of-tree extensions. |
| NightlyTests.yml | Slow tests; runs every night and on demand. |
| Regression.yml | Performance regression detection against main. |
| CrossVersion.yml | Storage compatibility matrix across versions. |
| Android.yml, Swift.yml, cifuzz.yml | Mobile and fuzzing matrices. |
| OnTag.yml, StagedUpload.yml, _manual_extension_deploy.yml | Release/publishing automation. |

To trigger a workflow on your fork, push the branch and watch the Actions tab. Protected secrets (asset upload, signing) are available only on duckdb/duckdb itself.

IDE integration

  • VS Code: install the C/C++ and CMake extensions; pick the reldebug build directory.
  • CLion / Visual Studio: open the root folder and point the IDE at CMakeLists.txt.
  • The project ships .editorconfig, .clang-format, .clang-tidy, and .clangd so most IDEs configure themselves.

For more on the development cycle, see development-workflow. For testing details, see testing.

Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.
