duckdb/duckdb

Common

Active contributors: Mytherin, Mark, Laurens Kuiper

Purpose

src/common/ is the shared library that every other subsystem depends on. It owns the type system (LogicalType, Vector, DataChunk, Value), the file system abstraction, allocators, error infrastructure, serialization helpers, and rendering utilities. It is the largest single subsystem (~85k lines) and usually the right place to look when you ask "where is X defined?"

Directory layout

src/common/
├── types.cpp                     LogicalType definitions
├── exception.cpp                 Exception hierarchy
├── exception_format_value.cpp    Argument formatting for exceptions
├── error_data.cpp                Recoverable-error payload
├── string_util.cpp               String helpers (StringUtil::*)
├── file_system.cpp               FileSystem base class + utilities
├── local_file_system.cpp         POSIX/Windows file system
├── virtual_file_system.cpp       Composite file system used by extensions
├── opener_file_system.cpp        Wraps a FileOpener (per-connection routing)
├── compressed_file_system.cpp    Adapter for gzip/zstd readers
├── gzip_file_system.cpp          Gzip-specific readers
├── pipe_file_system.cpp          Streams over POSIX pipes
├── file_buffer.cpp               Aligned read/write buffer
├── filename_pattern.cpp          Glob expansion for file names
├── path.cpp                      Path manipulation helpers
├── hive_partitioning.cpp         Hive-style partition path parsing
├── allocator.cpp                 Pluggable allocator
├── checksum.cpp                  CRC32C
├── re2_regex.cpp                 Regex helpers built on RE2
├── radix_partitioning.cpp        Hash partitioning helper
├── random_engine.cpp             PCG-based RNG wrapper
├── thread_util.cpp               Thread helpers (yield, set name, …)
├── stacktrace.cpp                Best-effort stack traces
├── windows_util.cpp              Windows shims
├── cgroups.cpp                   Reads cgroup memory limits
├── serialization_compatibility.cpp  Storage compatibility checks
├── render_tree.cpp               Generic tree renderer
├── tree_renderer/                Tree renderers (HTML, JSON, text)
├── box_renderer.cpp              Tabular result renderer (used by CLI / EXPLAIN)
├── box_renderer_context.cpp, client_box_renderer_context.cpp
├── csv_writer.cpp                CSV output
├── encryption_*.cpp              Encryption primitives
├── enum_util.cpp                 Generated enum<->string utilities
├── extra_type_info.cpp           Type metadata helpers
├── settings.json                 Source of truth for all settings (codegen input)
├── crypto/                       AES helpers
├── enums/                        Per-enum helpers
├── exception/                    Exception subclasses
├── multi_file/                   Multi-file readers (used by csv/parquet)
├── operator/                     Generic templated operators (numeric_cast, comparison, etc.)
├── progress_bar/                 CLI progress bars
├── row_operations/               Row-format helpers
├── serializer/                   Binary + JSON serializers
├── sort/                         External sort algorithms
├── types/                        Concrete LogicalType helpers
├── value_operations/             Operations on Value objects
├── vector/                       Vector helpers
├── vector_operations/            Templated executor framework (Unary/Binary/...)
└── arrow/                        Arrow C Data Interface adapters

Key abstractions

Type | File | Role
Vector | src/include/duckdb/common/types/vector.hpp | Columnar buffer with type, validity, and encoding.
DataChunk | src/include/duckdb/common/types/data_chunk.hpp | A set of column Vectors sharing a cardinality (row count).
Value | src/include/duckdb/common/types/value.hpp | Heap-allocated scalar. Used at API boundaries.
LogicalType | src/include/duckdb/common/types.hpp | Type ID + width + child types.
FileSystem | src/include/duckdb/common/file_system.hpp | I/O abstraction. Implementations: LocalFileSystem, VirtualFileSystem, OpenerFileSystem, plus extensions like httpfs.
Allocator | src/common/allocator.cpp | Pluggable allocator. Used by per-database memory accounting.
Exception | src/include/duckdb/common/exception.hpp | Base for all DuckDB exceptions. Subclasses are categorized in src/include/duckdb/common/exception/.
ErrorData | src/include/duckdb/common/error_data.hpp | Recoverable error payload returned by APIs that don't throw.
Serializer, Deserializer | src/common/serializer/ | Binary and JSON serializer interfaces used by storage and plan serialization.
Sort | src/common/sort/ | External merge sort used by PhysicalOrder.
BoxRenderer | src/common/box_renderer.cpp | Pretty-prints a result chunk as an ASCII/Unicode table; used by the CLI and EXPLAIN.

Vector encodings

Vector supports several encodings:

Encoding | When | Effect
Flat | Default | One value per slot.
Constant | All values are the same | Single value, replicated logically.
Dictionary | Few distinct values | Index buffer + child vector with the unique values.
Sequence | Values are start + i * step | Two scalars represent the whole vector.
FSST | String compression | Compressed strings with shared symbol table.

Conversion utilities in vector.hpp and vector_operations/ (Vector::Flatten, UnifiedVectorFormat) let consumers handle all encodings uniformly instead of writing a separate code path for each encoding.
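
To make the idea of uniform access concrete, here is a minimal standalone sketch in C++. It is a toy model, not DuckDB's Vector class: ToyVector, ToyEncoding, GetValue, and this Flatten are hypothetical names invented for illustration, and FSST is left out. It shows how one accessor can hide the encoding, and how flattening turns any encoding into one value per slot.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only: these names are hypothetical and are not DuckDB's classes.
enum class ToyEncoding { Flat, Constant, Dictionary, Sequence };

struct ToyVector {
    ToyEncoding encoding = ToyEncoding::Flat;
    std::vector<int64_t> data;     // Flat: one value per row; Constant: one value;
                                   // Dictionary: the distinct values
    std::vector<uint32_t> indices; // Dictionary only: row -> position in data
    int64_t start = 0;             // Sequence only
    int64_t step = 0;              // Sequence only
    size_t count = 0;              // logical number of rows
};

// Read one row without caring how the vector is encoded.
inline int64_t GetValue(const ToyVector &v, size_t row) {
    assert(row < v.count);
    switch (v.encoding) {
    case ToyEncoding::Flat:
        return v.data[row];
    case ToyEncoding::Constant:
        return v.data[0];
    case ToyEncoding::Dictionary:
        return v.data[v.indices[row]];
    case ToyEncoding::Sequence:
        return v.start + static_cast<int64_t>(row) * v.step;
    }
    return 0; // not reached for valid encodings
}

// Materialize any encoding as Flat so later code needs a single path.
inline ToyVector Flatten(const ToyVector &v) {
    ToyVector out;
    out.encoding = ToyEncoding::Flat;
    out.count = v.count;
    out.data.reserve(v.count);
    for (size_t i = 0; i < v.count; i++) {
        out.data.push_back(GetValue(v, i));
    }
    return out;
}
```

In this sketch, a Sequence vector with start 10, step 5, and count 4 flattens to the four values 10, 15, 20, 25. The trade-off is the usual one: flattening is simple but gives up the compact representation, which is why an accessor-style view is often preferable when the consumer only reads.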

Templated executors

For scalar function authors, src/common/vector_operations/ provides:

  • UnaryExecutor::Execute<TA, TR> — one input vector, one output vector, with a kernel functor.
  • BinaryExecutor::Execute<TA, TB, TR> — two input vectors.
  • TernaryExecutor::Execute<TA, TB, TC, TR> — three input vectors.
  • GenericExecutor — variadic.

These templates handle constant/dictionary fast paths, validity propagation, and chunk sizing. Most scalar functions in src/function/scalar/ and extension/core_functions/scalar/ use them.
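
The following standalone C++ sketch illustrates what such an executor does on behalf of a kernel: a constant fast path and NULL (validity) propagation. It is a simplified model under stated assumptions, not DuckDB's UnaryExecutor. ToyColumn and ToyUnaryExecute are hypothetical names, and dictionary handling and chunk sizing are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: ToyColumn and ToyUnaryExecute are hypothetical names,
// not DuckDB's Vector or UnaryExecutor.
template <class T>
struct ToyColumn {
    bool is_constant = false; // true: values[0] stands for every row
    std::vector<T> values;
    std::vector<bool> valid;  // same length as values; false means NULL
};

// Apply `fun` to every non-NULL input row and propagate NULLs to the result.
template <class TA, class TR, class FUN>
ToyColumn<TR> ToyUnaryExecute(const ToyColumn<TA> &input, size_t count, FUN fun) {
    ToyColumn<TR> result;
    if (input.is_constant) {
        // Constant fast path: run the kernel once and keep the result constant.
        assert(input.values.size() == 1 && input.valid.size() == 1);
        result.is_constant = true;
        result.valid.push_back(input.valid[0]);
        result.values.push_back(input.valid[0] ? static_cast<TR>(fun(input.values[0])) : TR());
        return result;
    }
    assert(input.values.size() == count && input.valid.size() == count);
    result.values.resize(count);
    result.valid.resize(count);
    for (size_t i = 0; i < count; i++) {
        result.valid[i] = input.valid[i];
        if (input.valid[i]) {
            // The kernel only ever sees valid rows.
            result.values[i] = static_cast<TR>(fun(input.values[i]));
        }
    }
    return result;
}
```

A call such as ToyUnaryExecute<int, long>(column, 3, [](int x) { return x * 10L; }) keeps the kernel free of NULL checks and encoding checks, which is the design point of the executor framework described above.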

File system

FileSystem is the single I/O surface. Local I/O is implemented in local_file_system.cpp. Extensions (httpfs, aws, encryption) register their own FileSystem subclass into a VirtualFileSystem. Per-connection routing happens through OpenerFileSystem, which consults a FileOpener registered in the ClientContext.

graph LR
    User[CSV / Parquet / Arrow scan] -->|Open| VFS[VirtualFileSystem]
    VFS -->|local path| Local[LocalFileSystem]
    VFS -->|http(s)://| HTTPFS[httpfs extension]
    VFS -->|s3://| S3[AWS extension]
    VFS -->|encrypted| Enc[Encryption extension]
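
The routing in the diagram can be sketched as a composite that asks each registered sub-system whether it handles a path and falls back to local I/O otherwise. This is a minimal standalone model in C++; every class and method name in it is hypothetical and chosen for illustration, and it does not reproduce DuckDB's actual FileSystem interface.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy model only: every name here is hypothetical, not DuckDB's FileSystem API.
class ToyFileSystem {
public:
    virtual ~ToyFileSystem() = default;
    virtual bool CanHandle(const std::string &path) const = 0;
    virtual std::string Name() const = 0;
};

// Claims paths that start with a fixed prefix such as "s3://".
class ToyPrefixFileSystem : public ToyFileSystem {
public:
    ToyPrefixFileSystem(std::string name, std::string prefix)
        : name_(std::move(name)), prefix_(std::move(prefix)) {}
    bool CanHandle(const std::string &path) const override {
        return path.compare(0, prefix_.size(), prefix_) == 0;
    }
    std::string Name() const override { return name_; }

private:
    std::string name_;
    std::string prefix_;
};

// Fallback for anything no registered sub-system claims.
class ToyLocalFileSystem : public ToyFileSystem {
public:
    bool CanHandle(const std::string &) const override { return true; }
    std::string Name() const override { return "local"; }
};

// Composite: asks each registered sub-system in turn, then falls back to local.
class ToyVirtualFileSystem {
public:
    void Register(std::unique_ptr<ToyFileSystem> fs) {
        assert(fs != nullptr);
        sub_systems_.push_back(std::move(fs));
    }
    const ToyFileSystem &Find(const std::string &path) const {
        for (const auto &fs : sub_systems_) {
            if (fs->CanHandle(path)) {
                return *fs;
            }
        }
        return local_;
    }

private:
    std::vector<std::unique_ptr<ToyFileSystem>> sub_systems_;
    ToyLocalFileSystem local_;
};
```

With a prefix file system registered for "s3://", Find("s3://bucket/data.parquet") selects it, while Find("/tmp/a.csv") falls through to the local implementation. Callers only ever talk to the composite, which is what lets extensions add protocols without changes to the scan code.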

Settings

src/common/settings.json is the source of truth for engine settings. scripts/generate_settings.py reads it and generates per-setting structs in src/main/settings/ and registration code. To add or change a setting, edit this file and then run make generate-files.

Enums

src/common/enum_util.cpp is generated from src/include/duckdb/common/enums/. It provides EnumUtil::ToString / EnumUtil::FromString for every public enum, used in serialization and error messages.
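
As an illustration of the shape such generated code can take, the sketch below defines a toy enum and a helper with one explicit template specialization per enum. ToyCompression and ToyEnumUtil are hypothetical names, the string spellings are assumptions, and the real generated file may be organized differently.

```cpp
#include <cassert>
#include <cstring>
#include <stdexcept>
#include <string>

// Toy model only: ToyCompression and ToyEnumUtil are hypothetical names that
// imitate the shape of a generated enum<->string helper.
enum class ToyCompression { None, Gzip, Zstd };

struct ToyEnumUtil {
    template <class T>
    static const char *ToString(T value);
    template <class T>
    static T FromString(const char *name);
};

// One explicit specialization per enum; a generator would emit these.
template <>
const char *ToyEnumUtil::ToString<ToyCompression>(ToyCompression value) {
    switch (value) {
    case ToyCompression::None:
        return "NONE";
    case ToyCompression::Gzip:
        return "GZIP";
    case ToyCompression::Zstd:
        return "ZSTD";
    }
    throw std::invalid_argument("unknown ToyCompression value");
}

template <>
ToyCompression ToyEnumUtil::FromString<ToyCompression>(const char *name) {
    assert(name != nullptr);
    if (std::strcmp(name, "NONE") == 0) {
        return ToyCompression::None;
    }
    if (std::strcmp(name, "GZIP") == 0) {
        return ToyCompression::Gzip;
    }
    if (std::strcmp(name, "ZSTD") == 0) {
        return ToyCompression::Zstd;
    }
    throw std::invalid_argument(std::string("unknown ToyCompression: ") + name);
}
```

Keeping both directions in one place means a serializer can write ToString(value) and read it back with FromString<ToyCompression>(text), and an unrecognized string fails loudly instead of mapping to a default.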

Integration points

  • Every other subsystem #includes headers from src/include/duckdb/common/.
  • Storage uses Allocator, FileSystem, Serializer, and the box renderer (for catalog dumps).
  • Execution uses Vector, DataChunk, vector_operations/, sort/, row_operations/.
  • The CLI uses BoxRenderer, tree_renderer/, progress_bar/.
  • Extensions subclass FileSystem, register casts, and use the same Vector API as the engine.

Entry points for modification

  • Adding a new LogicalType ID: edit src/include/duckdb/common/types.hpp, the enum in src/common/enums/, and add helper code in src/common/types/.
  • Adding a FileSystem implementation: subclass FileSystem and register through VirtualFileSystem (typically from an extension).
  • Adding a templated executor variant: see src/common/vector_operations/.
  • Adding a tree-renderer output format: subclass TreeRenderer in src/common/tree_renderer/.
  • Allocator integration: Allocator::DefaultAllocator is set per-database via DBConfig in the main subsystem (src/main/).

Key source files

File Purpose
src/common/types.cpp LogicalType core.
src/common/types/vector.cpp Vector core.
src/common/file_system.cpp I/O abstraction.
src/common/local_file_system.cpp POSIX/Windows file I/O.
src/common/allocator.cpp Allocator.
src/common/exception.cpp Exception hierarchy.
src/common/string_util.cpp String helpers.
src/common/serializer/ Binary + JSON serializers.
src/common/sort/ External sort.
src/common/box_renderer.cpp Result table renderer.
src/common/enum_util.cpp Generated enum<->string.
src/common/settings.json Settings spec.

Cross-references: every other systems page links back here for primitives.

Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.
