duckdb/duckdb
Common
Active contributors: Mytherin, Mark, Laurens Kuiper
Purpose
src/common/ is the shared library that every other subsystem depends on. It owns the type system (LogicalType, Vector, DataChunk, Value), the file system abstraction, allocators, error infrastructure, serialization helpers, and rendering utilities. It is the largest single subsystem (~85k lines) and usually the right place to look when you ask "where is X defined?"
Directory layout
src/common/
├── types.cpp LogicalType definitions
├── exception.cpp Exception hierarchy
├── exception_format_value.cpp Argument formatting for exceptions
├── error_data.cpp Recoverable-error payload
├── string_util.cpp String helpers (StringUtil::*)
├── file_system.cpp FileSystem base class + utilities
├── local_file_system.cpp POSIX/Windows file system
├── virtual_file_system.cpp Composite file system used by extensions
├── opener_file_system.cpp Wraps a FileOpener (per-connection routing)
├── compressed_file_system.cpp Adapter for gzip/zstd readers
├── gzip_file_system.cpp Gzip-specific readers
├── pipe_file_system.cpp Streams over POSIX pipes
├── file_buffer.cpp Aligned read/write buffer
├── filename_pattern.cpp Glob expansion for file names
├── path.cpp Path manipulation helpers
├── hive_partitioning.cpp Hive-style partition path parsing
├── allocator.cpp Pluggable allocator
├── checksum.cpp CRC32C
├── re2_regex.cpp Regex helpers built on RE2
├── radix_partitioning.cpp Hash partitioning helper
├── random_engine.cpp PCG-based RNG wrapper
├── thread_util.cpp Thread helpers (yield, set name, …)
├── stacktrace.cpp Best-effort stack traces
├── windows_util.cpp Windows shims
├── cgroups.cpp Reads cgroup memory limits
├── serialization_compatibility.cpp Storage compatibility checks
├── render_tree.cpp Generic tree renderer
├── tree_renderer/ Tree renderers (HTML, JSON, text)
├── box_renderer.cpp Tabular result renderer (used by CLI / EXPLAIN)
├── box_renderer_context.cpp, client_box_renderer_context.cpp
├── csv_writer.cpp CSV output
├── encryption_*.cpp Encryption primitives
├── enum_util.cpp Generated enum<->string utilities
├── extra_type_info.cpp Type metadata helpers
├── settings.json Source of truth for all settings (codegen input)
├── crypto/ AES helpers
├── enums/ Per-enum helpers
├── exception/ Exception subclasses
├── multi_file/ Multi-file readers (used by csv/parquet)
├── operator/ Generic templated operators (numeric_cast, comparison, etc.)
├── progress_bar/ CLI progress bars
├── row_operations/ Row-format helpers
├── serializer/ Binary + JSON serializers
├── sort/ External sort algorithms
├── types/ Concrete LogicalType helpers
├── value_operations/ Operations on Value objects
├── vector/ Vector helpers
├── vector_operations/ Templated executor framework (Unary/Binary/...)
└── arrow/ Arrow C Data Interface adapters
Key abstractions
| Type | File | Role |
|---|---|---|
| Vector | src/include/duckdb/common/types/vector.hpp | Columnar buffer with type, validity, and encoding. |
| DataChunk | src/include/duckdb/common/types/data_chunk.hpp | A set of column Vectors sharing a cardinality. |
| Value | src/include/duckdb/common/types/value.hpp | Heap-allocated scalar. Used at API boundaries. |
| LogicalType | src/include/duckdb/common/types.hpp | Type ID + width + child types. |
| FileSystem | src/include/duckdb/common/file_system.hpp | I/O abstraction. Implementations: LocalFileSystem, VirtualFileSystem, OpenerFileSystem, plus extensions like httpfs. |
| Allocator | src/common/allocator.cpp | Pluggable allocator. Used by per-database memory accounting. |
| Exception | src/include/duckdb/common/exception.hpp | Base for all DuckDB exceptions. Subclasses are categorized in src/include/duckdb/common/exception/. |
| ErrorData | src/include/duckdb/common/error_data.hpp | Recoverable error payload returned by APIs that don't throw. |
| Serializer, Deserializer | src/common/serializer/ | Binary and JSON serializer interfaces used by storage and plan serialization. |
| Sort | src/common/sort/ | External merge sort used by PhysicalOrder. |
| BoxRenderer | src/common/box_renderer.cpp | Pretty-prints a result chunk as an ASCII/Unicode table; used by the CLI and EXPLAIN. |
Vector encodings
Vector supports several encodings:
| Encoding | When | Effect |
|---|---|---|
| Flat | Default | One value per slot. |
| Constant | All values are the same | Single value, replicated logically. |
| Dictionary | Few distinct values | Index buffer + child vector with the unique values. |
| Sequence | Values are start + i * step | Two scalars represent the whole vector. |
| FSST | String compression | Compressed strings with shared symbol table. |
Conversion utilities in vector.hpp and vector_operations/ (Vector::Flatten, UnifiedVectorFormat) let consumers handle all encodings uniformly without writing a separate code path per encoding.
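To make the idea of encoding-agnostic access concrete, here is a minimal sketch. It is a toy model, not DuckDB's Vector class: ToyVector, ToyEncoding, GetValue, and the free function Flatten are hypothetical names invented for this illustration, and FSST is left out for brevity. It only shows how a single accessor can hide the encoding from the consumer, and how flattening materializes one value per slot.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy model, NOT DuckDB's Vector class.
enum class ToyEncoding { Flat, Constant, Dictionary, Sequence };

struct ToyVector {
    ToyEncoding encoding = ToyEncoding::Flat;
    // Flat: one value per row. Constant: a single value. Dictionary: the unique values.
    std::vector<int64_t> data;
    // Dictionary only: one index into `data` per row.
    std::vector<uint32_t> indices;
    // Sequence only: row i holds start + i * step.
    int64_t start = 0;
    int64_t step = 0;
};

// Read logical row `row` without the caller branching on the encoding.
inline int64_t GetValue(const ToyVector &v, std::size_t row) {
    switch (v.encoding) {
    case ToyEncoding::Flat:
        return v.data[row];
    case ToyEncoding::Constant:
        return v.data[0];
    case ToyEncoding::Dictionary:
        return v.data[v.indices[row]];
    case ToyEncoding::Sequence:
        return v.start + static_cast<int64_t>(row) * v.step;
    }
    assert(false && "unknown encoding");
    return 0;
}

// Materialize the first `count` rows into the Flat encoding.
inline ToyVector Flatten(const ToyVector &v, std::size_t count) {
    ToyVector result;
    result.encoding = ToyEncoding::Flat;
    result.data.reserve(count);
    for (std::size_t row = 0; row < count; row++) {
        result.data.push_back(GetValue(v, row));
    }
    return result;
}
```

In this sketch, a consumer that only calls GetValue (or flattens first) never needs to know which encoding the producer chose.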
Templated executors
For scalar function authors, src/common/vector_operations/ provides:
- UnaryExecutor::Execute<TA, TR>: one input vector, one output vector, with a kernel functor.
- BinaryExecutor::Execute<TA, TB, TR>: two input vectors.
- TernaryExecutor::Execute<TA, TB, TC, TR>: three input vectors.
- GenericExecutor: variadic.
These templates handle constant/dictionary fast paths, validity propagation, and chunk sizing. Most scalar functions in src/function/scalar/ and extension/core_functions/scalar/ use them.
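The division of labor described above (the executor owns the loop, the constant fast path, and validity propagation, while the author supplies only a per-value kernel) can be sketched in standalone C++. ToyColumn and ExecuteUnary are hypothetical names for this illustration and do not reproduce the signatures in src/common/vector_operations/.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy column: values plus a per-row validity flag (true = not NULL).
template <class T>
struct ToyColumn {
    std::vector<T> values;
    std::vector<bool> valid;
    bool is_constant = false; // if true, values[0] / valid[0] stand for every row
};

// Toy stand-in for a unary executor: applies `kernel` to each valid row.
template <class TA, class TR, class FUNC>
ToyColumn<TR> ExecuteUnary(const ToyColumn<TA> &input, std::size_t count, FUNC kernel) {
    ToyColumn<TR> result;
    if (input.is_constant) {
        assert(!input.values.empty() && !input.valid.empty());
        // Fast path: evaluate the kernel once; the result is constant as well.
        result.is_constant = true;
        result.valid.push_back(input.valid[0]);
        result.values.push_back(input.valid[0] ? kernel(input.values[0]) : TR());
        return result;
    }
    assert(input.values.size() >= count && input.valid.size() >= count);
    result.values.resize(count);
    result.valid.resize(count);
    for (std::size_t i = 0; i < count; i++) {
        // Validity propagation: a NULL input row yields a NULL output row,
        // and the kernel is never called on it.
        result.valid[i] = input.valid[i];
        if (input.valid[i]) {
            result.values[i] = kernel(input.values[i]);
        }
    }
    return result;
}
```

A call such as ExecuteUnary<int, int>(col, count, [](int x) { return x * 2; }) then contains no NULL handling or encoding checks in the kernel itself, which is the property the real executors are described as providing.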
File system
FileSystem is the single I/O surface. Local I/O is implemented in local_file_system.cpp. Extensions (httpfs, aws, encryption) register their own FileSystem subclass into a VirtualFileSystem. Per-connection routing happens through OpenerFileSystem, which consults a FileOpener registered in the ClientContext.
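The registration-and-routing pattern can be sketched as follows. This is a simplified model under stated assumptions: the ToyFileSystem interface, the CanHandle and Name methods, and prefix-based matching are all hypothetical choices for this example and are not taken from DuckDB's FileSystem API.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy interface, NOT DuckDB's FileSystem API.
class ToyFileSystem {
public:
    virtual ~ToyFileSystem() = default;
    virtual bool CanHandle(const std::string &path) const = 0;
    virtual std::string Name() const = 0;
};

// Claims every path that starts with a given prefix, e.g. "s3://".
class ToyPrefixFileSystem : public ToyFileSystem {
public:
    ToyPrefixFileSystem(std::string name, std::string prefix)
        : name_(std::move(name)), prefix_(std::move(prefix)) {}
    bool CanHandle(const std::string &path) const override {
        return path.compare(0, prefix_.size(), prefix_) == 0;
    }
    std::string Name() const override { return name_; }

private:
    std::string name_;
    std::string prefix_;
};

// Fallback that accepts any path.
class ToyLocalFileSystem : public ToyFileSystem {
public:
    bool CanHandle(const std::string &) const override { return true; }
    std::string Name() const override { return "local"; }
};

// Composite: asks registered file systems in order, then falls back to local.
class ToyVirtualFileSystem {
public:
    void Register(std::unique_ptr<ToyFileSystem> fs) {
        assert(fs != nullptr);
        registered_.push_back(std::move(fs));
    }
    const ToyFileSystem &Find(const std::string &path) const {
        for (const auto &fs : registered_) {
            if (fs->CanHandle(path)) {
                return *fs;
            }
        }
        return fallback_;
    }

private:
    std::vector<std::unique_ptr<ToyFileSystem>> registered_;
    ToyLocalFileSystem fallback_;
};
```

In this sketch, callers only ever talk to the composite, so adding a new backend means registering one more subclass rather than touching any scan code.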
graph LR
User[CSV / Parquet / Arrow scan] -->|Open| VFS[VirtualFileSystem]
VFS -->|local path| Local[LocalFileSystem]
VFS -->|http(s)://| HTTPFS[httpfs extension]
VFS -->|s3://| S3[AWS extension]
VFS -->|encrypted| Enc[Encryption extension]
Settings
src/common/settings.json is the source of truth for engine settings. scripts/generate_settings.py reads it and generates per-setting structs in src/main/settings/ and registration code. Editing settings goes through this file plus make generate-files.
Enums
src/common/enum_util.cpp is generated from src/include/duckdb/common/enums/. It provides EnumUtil::ToString / EnumUtil::FromString for every public enum, used in serialization and error messages.
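As a rough illustration of what table-driven enum<->string conversion looks like, the sketch below uses a hand-written lookup table. ToyCompression, ToyEnumToString, and ToyEnumFromString are hypothetical names; the actual generated code and the EnumUtil signatures may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical enum used only for illustration.
enum class ToyCompression : uint8_t { Uncompressed, Gzip, Zstd };

struct ToyEnumEntry {
    ToyCompression value;
    const char *name;
};

// One table drives both directions, so the two mappings cannot drift apart.
static const ToyEnumEntry TOY_COMPRESSION_ENTRIES[] = {
    {ToyCompression::Uncompressed, "UNCOMPRESSED"},
    {ToyCompression::Gzip, "GZIP"},
    {ToyCompression::Zstd, "ZSTD"},
};

inline const char *ToyEnumToString(ToyCompression value) {
    for (const auto &entry : TOY_COMPRESSION_ENTRIES) {
        if (entry.value == value) {
            return entry.name;
        }
    }
    throw std::invalid_argument("unknown ToyCompression value");
}

inline ToyCompression ToyEnumFromString(const std::string &name) {
    for (const auto &entry : TOY_COMPRESSION_ENTRIES) {
        if (name == entry.name) {
            return entry.value;
        }
    }
    throw std::invalid_argument("unknown ToyCompression name: " + name);
}

// Usage: every value should survive a ToString / FromString round trip.
inline void CheckRoundTrip() {
    for (const auto &entry : TOY_COMPRESSION_ENTRIES) {
        assert(ToyEnumFromString(ToyEnumToString(entry.value)) == entry.value);
    }
}
```

Generating such tables from the enum headers, rather than maintaining them by hand, is what keeps serialization and error messages in sync when a new enum value is added.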
Integration points
- Every other subsystem #includes headers from src/include/duckdb/common/.
- Storage uses Allocator, FileSystem, Serializer, the box renderer for catalog dumps.
- Execution uses Vector, DataChunk, vector_operations/, sort/, row_operations/.
- The CLI uses BoxRenderer, tree_renderer/, progress_bar/.
- Extensions subclass FileSystem, register casts, and use the same Vector API as the engine.
Entry points for modification
- Adding a new LogicalTypeID: edit src/include/duckdb/common/types.hpp, the enum in src/common/enums/, and add helper code in src/common/types/.
- Adding a FileSystem implementation: subclass FileSystem and register through VirtualFileSystem (typically from an extension).
- Adding a templated executor variant: see src/common/vector_operations/.
- Adding a tree-renderer output format: subclass TreeRenderer in src/common/tree_renderer/.
- Allocator integration: Allocator::DefaultAllocator is set per-database via DBConfig in main.
Key source files
| File | Purpose |
|---|---|
| src/common/types.cpp | LogicalType core. |
| src/common/types/vector.cpp | Vector core. |
| src/common/file_system.cpp | I/O abstraction. |
| src/common/local_file_system.cpp | POSIX/Windows file I/O. |
| src/common/allocator.cpp | Allocator. |
| src/common/exception.cpp | Exception hierarchy. |
| src/common/string_util.cpp | String helpers. |
| src/common/serializer/ | Binary + JSON serializers. |
| src/common/sort/ | External sort. |
| src/common/box_renderer.cpp | Result table renderer. |
| src/common/enum_util.cpp | Generated enum<->string. |
| src/common/settings.json | Settings spec. |
Cross-references: every other systems page links back here for primitives.