duckdb/duckdb
Patterns and conventions
The full coding rules are in CONTRIBUTING.md and CLAUDE.md. This page surfaces the patterns that come up daily.
C++ style
- C++17, tabs for indentation, spaces for alignment, 120-column line limit.
- Use `[u]int(8|16|32|64)_t` for integer widths. Never use `int`, `long`, or `unsigned`.
- Use `idx_t` (a project-wide unsigned 64-bit alias) for offsets, indices, and counts. Not `size_t`.
- Use `D_ASSERT(...)` to express programmer invariants, never user-input checks.
- Use C++11 range-based `for` loops.
- Always brace `if` and loop bodies, even single-line.
- Never use `const_cast`. The `FileBuffer::GetData`/`GetDataMutable` split (commit `aec1efc176`) is the standard pattern when you need both read and mutable access.
- Do not introduce `using namespace` directives. Everything in `src/` lives in `namespace duckdb`.
- Keep `// commented-out code` out of PRs.
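
The rules above can be seen together in a small standalone sketch. Nothing here is DuckDB code: `ExampleBuffer`, `SumBytes`, and `FillBytes` are hypothetical names, `idx_t` is redeclared locally so the block compiles on its own, and plain `assert` stands in for `D_ASSERT`. The point is the shape: fixed-width integers, `idx_t` for counts, braced bodies, and a const/mutable accessor pair instead of `const_cast`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Local stand-in for DuckDB's idx_t so the sketch is self-contained (assumption).
using idx_t = uint64_t;

// Hypothetical buffer type. It exposes a read-only accessor and a separately
// named mutable accessor, so no caller ever needs const_cast.
class ExampleBuffer {
public:
	explicit ExampleBuffer(idx_t size) : data(size, 0) {
	}

public:
	// Read-only access for const callers.
	const uint8_t *GetData() const {
		return data.data();
	}
	// Mutable access is an explicit, separately named entry point.
	uint8_t *GetDataMutable() {
		return data.data();
	}
	idx_t Size() const {
		return data.size();
	}

private:
	std::vector<uint8_t> data;
};

// Sums all bytes through the const accessor.
inline uint64_t SumBytes(const ExampleBuffer &buffer) {
	uint64_t total = 0;
	const uint8_t *ptr = buffer.GetData();
	for (idx_t byte_idx = 0; byte_idx < buffer.Size(); byte_idx++) {
		total += ptr[byte_idx];
	}
	return total;
}

// Overwrites every byte through the mutable accessor.
inline void FillBytes(ExampleBuffer &buffer, uint8_t value) {
	// Programmer invariant (D_ASSERT in DuckDB): this sketch never fills an empty buffer.
	assert(buffer.Size() > 0);
	uint8_t *ptr = buffer.GetDataMutable();
	for (idx_t byte_idx = 0; byte_idx < buffer.Size(); byte_idx++) {
		ptr[byte_idx] = value;
	}
}
```

A caller holding a `const ExampleBuffer &` can only reach `GetData`, so the const-correctness of the call site is checked by the compiler rather than by convention.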
Memory management
- Smart pointers only: `unique_ptr<T>` for ownership, `shared_ptr<T>` only when ownership is genuinely shared.
- `optional_ptr<T>` (`src/include/duckdb/common/optional_ptr.hpp`) for nullable references.
- `reference<T>` (`src/include/duckdb/common/reference_map.hpp` and `common/types.hpp`) for non-nullable references. Prefer `reference<T>` over raw pointers.
- Never call `new` or `delete` directly. The `Allocator` abstraction (`src/common/allocator.cpp`) is used wherever placement-new with an explicit pool is required.
- For arena-style allocations, use `ArenaAllocator` (`src/storage/arena_allocator.cpp`).
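
As a rough illustration of the ownership rules above, the sketch below uses only the standard library. `ExampleEntry`, `ExampleCatalog`, and `FindAll` are hypothetical; `std::reference_wrapper` stands in for `reference<T>`, and a raw pointer return stands in for `optional_ptr<T>`, which a real DuckDB signature would use instead.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical entry type used only for this sketch.
struct ExampleEntry {
	explicit ExampleEntry(std::string name_p) : name(std::move(name_p)) {
	}
	std::string name;
};

// Owns its entries through unique_ptr; call sites never use new or delete.
class ExampleCatalog {
public:
	// Takes ownership and hands back a non-nullable reference.
	ExampleEntry &AddEntry(std::unique_ptr<ExampleEntry> entry) {
		// Invariant: callers always pass a live object.
		assert(entry != nullptr);
		entries.push_back(std::move(entry));
		return *entries.back();
	}
	// Lookup may fail, so the result is nullable. In DuckDB this would be
	// optional_ptr<ExampleEntry>; a raw pointer stands in for it here.
	ExampleEntry *FindEntry(const std::string &name) {
		for (auto &entry : entries) {
			if (entry->name == name) {
				return entry.get();
			}
		}
		return nullptr;
	}

private:
	std::vector<std::unique_ptr<ExampleEntry>> entries;
};

// Collects non-owning, non-nullable references to every entry that exists.
inline std::vector<std::reference_wrapper<ExampleEntry>> FindAll(ExampleCatalog &catalog,
                                                                 const std::vector<std::string> &names) {
	std::vector<std::reference_wrapper<ExampleEntry>> result;
	for (const auto &name : names) {
		ExampleEntry *entry = catalog.FindEntry(name);
		if (entry) {
			result.push_back(std::ref(*entry));
		}
	}
	return result;
}
```

The signatures carry the contract: a `unique_ptr` parameter means ownership moves, a reference means "always present", and only the lookup result can be null.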
Naming
| Kind | Convention | Example |
|---|---|---|
| Files | `snake_case.cpp` / `.hpp` | `abstract_operator.cpp` |
| Types (class, struct, enum, alias) | PascalCase | `LogicalOperator`, `ColumnBinding` |
| Functions / methods | PascalCase | `GetChunk`, `ResolveType` |
| Variables, members, parameters | snake_case | `chunk_size`, `current_thread` |
| Loop indices | descriptive (`column_idx`, `row_idx`); `i` is allowed only in non-nested loops | `for (idx_t row_idx = 0; row_idx < count; row_idx++) {` |

clang-tidy enforces a subset of these.
Class layout

    class MyClass {
    public:
        MyClass();
        int my_public_variable;
    public:
        void MyFunction();
    private:
        void MyPrivateFunction();
    private:
        int my_private_variable;
    };

The constructor and public state come first in their own public block, followed by public methods, then private methods, and finally private state. Existing files follow this strictly; reviewers will ask for it.
Error handling
- Exceptions are reserved for query-fatal errors: parser errors, catalog lookups, type mismatches, out-of-memory, abort. Throw the most specific exception type (`BinderException`, `CatalogException`, `ConversionException`, `IOException`, etc., declared in `src/include/duckdb/common/exception/`).
- For "expected, recoverable" errors inside a query, return a value or set an `ErrorData` (`src/include/duckdb/common/error_data.hpp`).
- Use `D_ASSERT` (`src/include/duckdb/common/assert.hpp`) for invariants. Add a comment explaining what is being asserted; an assert without context is a smell.
- For stable user-facing error messages, route them through `ErrorManager` (`src/main/error_manager.cpp`) so they can be customized per deployment.

    if (!type.IsNumeric()) {
        throw BinderException("Function 'sum' requires a numeric argument, got %s", type.ToString());
    }
    D_ASSERT(state.IsInitialized()); // pipelines initialize before any chunk is produced

Returning early
Prefer early returns to nested branches:

    if (!table) {
        return ErrorData("Table not found");
    }
    if (!table->HasColumn(name)) {
        return ErrorData(StringUtil::Format("Column '%s' not found", name));
    }
    return BindColumn(*table, name);

Visitor pattern
The codebase uses several visitors. Use them rather than `dynamic_cast` chains:
| Visitor | File | Purpose |
|---|---|---|
| `LogicalOperatorVisitor` | `src/include/duckdb/planner/logical_operator_visitor.hpp` | Walk a logical plan, optionally rewriting expressions. |
| `ExpressionIterator` | `src/include/duckdb/planner/expression_iterator.hpp` | Walk a bound expression tree. |
| `ParsedExpressionIterator` | `src/include/duckdb/parser/parsed_expression_iterator.hpp` | Walk an unbound expression tree. |
| `BoundNodeVisitor`, `OperatorVisitor` | `src/planner/binder/` | Visit bound query nodes. |

When you write a new pass, prefer subclassing one of these.
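
To make the shape of such a pass concrete, here is a minimal standalone sketch of the pattern. It does not use the DuckDB visitor classes listed above: `ExampleNode`, `ExampleVisitor`, and `ColumnRefCounter` are hypothetical types invented for illustration, and the real base classes have different hooks and signatures.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

using idx_t = uint64_t; // local stand-in for DuckDB's idx_t (assumption)

enum class ExampleNodeType : uint8_t { COLUMN_REF, CONSTANT, ADD };

// Hypothetical expression node that owns its children.
struct ExampleNode {
	explicit ExampleNode(ExampleNodeType type_p) : type(type_p) {
	}
	ExampleNodeType type;
	std::vector<std::unique_ptr<ExampleNode>> children;
};

// Base visitor: owns the traversal, subclasses override one hook.
class ExampleVisitor {
public:
	virtual ~ExampleVisitor() = default;

public:
	// Pre-order walk over the whole tree.
	void Visit(const ExampleNode &node) {
		VisitNode(node);
		for (const auto &child : node.children) {
			// Invariant: a tree never contains null children.
			assert(child != nullptr);
			Visit(*child);
		}
	}

protected:
	virtual void VisitNode(const ExampleNode &node) = 0;
};

// A pass written as a subclass: counts column references, no dynamic_cast.
class ColumnRefCounter : public ExampleVisitor {
public:
	idx_t count = 0;

protected:
	void VisitNode(const ExampleNode &node) override {
		if (node.type == ExampleNodeType::COLUMN_REF) {
			count++;
		}
	}
};

inline std::unique_ptr<ExampleNode> MakeNode(ExampleNodeType type) {
	return std::make_unique<ExampleNode>(type);
}

// Builds the tree for (column + constant) + column.
inline std::unique_ptr<ExampleNode> MakeExampleTree() {
	auto inner = MakeNode(ExampleNodeType::ADD);
	inner->children.push_back(MakeNode(ExampleNodeType::COLUMN_REF));
	inner->children.push_back(MakeNode(ExampleNodeType::CONSTANT));
	auto root = MakeNode(ExampleNodeType::ADD);
	root->children.push_back(std::move(inner));
	root->children.push_back(MakeNode(ExampleNodeType::COLUMN_REF));
	return root;
}
```

The traversal lives in one place, so a new pass only supplies the per-node logic and cannot forget to recurse into children.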
Factory pattern
Polymorphic deserialization uses a static `Deserialize` method per class:

    class LogicalProjection : public LogicalOperator {
    public:
        static unique_ptr<LogicalOperator> Deserialize(Deserializer &deserializer);
        void Serialize(Serializer &serializer) const override;
        ...
    };

The `scripts/generate_serialization.py` generator emits the dispatch tables from JSON specs in `src/include/duckdb/storage/serialization/`.
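
The dispatch idea can be sketched without the generator. The block below is a hand-written, simplified stand-in, not the generated DuckDB code: `ExampleReader`, `ExampleOperator`, `ExampleProjection`, `ExampleLimit`, and the one-byte wire format are all assumptions made for illustration. It shows a base-class `Deserialize` reading a type tag and forwarding to the matching per-class static method.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

enum class ExampleOperatorType : uint8_t { PROJECTION = 1, LIMIT = 2 };

// Hypothetical byte reader standing in for a real deserializer.
class ExampleReader {
public:
	explicit ExampleReader(std::vector<uint8_t> bytes_p) : bytes(std::move(bytes_p)) {
	}

public:
	uint8_t ReadByte() {
		if (position >= bytes.size()) {
			throw std::runtime_error("read past end of input");
		}
		return bytes[position++];
	}

private:
	std::vector<uint8_t> bytes;
	uint64_t position = 0;
};

class ExampleOperator {
public:
	virtual ~ExampleOperator() = default;

public:
	virtual ExampleOperatorType Type() const = 0;
	// Factory entry point: reads the type tag, then dispatches.
	static std::unique_ptr<ExampleOperator> Deserialize(ExampleReader &reader);
};

class ExampleProjection : public ExampleOperator {
public:
	explicit ExampleProjection(uint8_t column_count_p) : column_count(column_count_p) {
	}
	uint8_t column_count;

public:
	ExampleOperatorType Type() const override {
		return ExampleOperatorType::PROJECTION;
	}
	static std::unique_ptr<ExampleOperator> Deserialize(ExampleReader &reader) {
		return std::make_unique<ExampleProjection>(reader.ReadByte());
	}
};

class ExampleLimit : public ExampleOperator {
public:
	explicit ExampleLimit(uint8_t limit_p) : limit(limit_p) {
	}
	uint8_t limit;

public:
	ExampleOperatorType Type() const override {
		return ExampleOperatorType::LIMIT;
	}
	static std::unique_ptr<ExampleOperator> Deserialize(ExampleReader &reader) {
		return std::make_unique<ExampleLimit>(reader.ReadByte());
	}
};

inline std::unique_ptr<ExampleOperator> ExampleOperator::Deserialize(ExampleReader &reader) {
	auto type = static_cast<ExampleOperatorType>(reader.ReadByte());
	std::unique_ptr<ExampleOperator> result;
	switch (type) {
	case ExampleOperatorType::PROJECTION:
		result = ExampleProjection::Deserialize(reader);
		break;
	case ExampleOperatorType::LIMIT:
		result = ExampleLimit::Deserialize(reader);
		break;
	default:
		// Malformed input is not a programmer error, so it throws rather than asserts.
		throw std::runtime_error("unknown operator type tag");
	}
	// Invariant: each concrete Deserialize produces the type named by the tag.
	assert(result->Type() == type);
	return result;
}
```

In this sketch the `switch` is written by hand; in the repository that table is the part the generator produces, which keeps it in sync with the class list.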
Vectorized execution patterns
- Operate on `Vector` objects, not `Value` objects. `Value` is a slow path used at the API boundary.
- `Vector::Flatten` upgrades constant/dictionary vectors to flat. Prefer dispatch on `VectorType` for hot loops: `if (vector.GetVectorType() == VectorType::CONSTANT_VECTOR) { ... }`
- Use `UnifiedVectorFormat` (`src/include/duckdb/common/types/vector.hpp`) when you must read across vector encodings.
- Validity is encoded as a `ValidityMask` (`src/include/duckdb/common/types/validity_mask.hpp`). Always honor it.
- Use the helpers in `src/common/vector_operations/` (`UnaryExecutor`, `BinaryExecutor`, `TernaryExecutor`, `GenericExecutor`) when implementing scalar functions.
- For aggregate functions, plug into the framework in `src/include/duckdb/function/aggregate_function.hpp`.
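
The sketch below illustrates why dispatching on the vector type pays off and what honoring validity looks like. It is a toy model, not the DuckDB `Vector` API: `ExampleVector`, `ExampleVectorType`, and `AddConstant` are hypothetical, and validity is modeled as a `std::vector<bool>` rather than a `ValidityMask`. Real scalar functions would normally go through the executor helpers listed above instead.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using idx_t = uint64_t; // local stand-in for DuckDB's idx_t (assumption)

enum class ExampleVectorType : uint8_t { FLAT_VECTOR, CONSTANT_VECTOR };

// Toy vector: a CONSTANT vector stores a single value that logically repeats
// for every row; a FLAT vector stores one value per row.
struct ExampleVector {
	ExampleVectorType type = ExampleVectorType::FLAT_VECTOR;
	std::vector<int32_t> data;
	std::vector<bool> validity; // true = valid, false = NULL
};

// Adds `addend` to every valid row of `input`, which logically has `count` rows.
inline ExampleVector AddConstant(const ExampleVector &input, idx_t count, int32_t addend) {
	ExampleVector result;
	result.type = input.type;
	if (input.type == ExampleVectorType::CONSTANT_VECTOR) {
		// Constant fast path: one computation covers all `count` rows.
		assert(input.data.size() == 1 && input.validity.size() == 1);
		result.validity.push_back(input.validity[0]);
		result.data.push_back(input.validity[0] ? input.data[0] + addend : 0);
		return result;
	}
	// Flat path: one value and one validity bit per row.
	assert(input.data.size() == count && input.validity.size() == count);
	result.data.resize(count, 0);
	result.validity = input.validity;
	for (idx_t row_idx = 0; row_idx < count; row_idx++) {
		if (!input.validity[row_idx]) {
			// NULL in, NULL out: never read the payload of an invalid row.
			continue;
		}
		result.data[row_idx] = input.data[row_idx] + addend;
	}
	return result;
}
```

For a constant input the loop disappears entirely, which is the reason to check the vector type before flattening.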
Cross-cutting
- The single source of truth for SQL types is `LogicalType` (`src/include/duckdb/common/types.hpp`).
- Time/date arithmetic uses `Date`, `Time`, `Timestamp`, `Interval` types in `src/common/types/`.
- Strings are `string_t` (`src/include/duckdb/common/types/string_type.hpp`), a length-prefixed slice that fits inline up to 12 bytes.
- For file system access, always go through `FileSystem` (`src/include/duckdb/common/file_system.hpp`) rather than POSIX directly; this lets extensions (httpfs, S3, encryption) substitute their own.
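
The "fits inline up to 12 bytes" idea behind `string_t` can be illustrated with a simplified sketch. `ExampleString` is a hypothetical type written for this page; its field names, memory layout, and the 4-byte prefix are assumptions and should not be read as the actual definition in `string_type.hpp`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical length-prefixed string slice. Short strings are copied into the
// object itself; longer ones keep a pointer to memory owned by someone else.
class ExampleString {
public:
	static constexpr uint32_t INLINE_LENGTH = 12;

public:
	ExampleString(const char *data, uint32_t length_p) : length(length_p) {
		// Invariant: callers always pass a valid pointer in this sketch.
		assert(data != nullptr);
		if (IsInlined()) {
			std::memset(value.inlined, 0, INLINE_LENGTH);
			std::memcpy(value.inlined, data, length);
		} else {
			// Keep a short prefix for cheap comparisons, plus the pointer.
			std::memcpy(value.pointer.prefix, data, sizeof(value.pointer.prefix));
			value.pointer.ptr = data;
		}
	}

public:
	bool IsInlined() const {
		return length <= INLINE_LENGTH;
	}
	const char *GetData() const {
		return IsInlined() ? value.inlined : value.pointer.ptr;
	}
	uint32_t GetSize() const {
		return length;
	}
	std::string ToString() const {
		return std::string(GetData(), length);
	}

private:
	uint32_t length;
	union {
		char inlined[INLINE_LENGTH];
		struct {
			char prefix[4];
			const char *ptr;
		} pointer;
	} value;
};
```

Because the long form does not own its bytes, the backing memory has to outlive the slice; in this sketch that is the caller's responsibility.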
Cross-references
- Errors and assertions: debugging.
- Tests for new patterns: testing.
- The pipeline these patterns operate inside: overview/architecture.
Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.