duckdb/duckdb
SQL frontend
DuckDB's SQL dialect, types, and the path a query takes from text to AST. Implementation is in systems/parser, systems/planner, and systems/optimizer.
Dialect highlights
DuckDB markets itself as having "Friendly SQL". A few notable features beyond standard SQL:
| Feature | Example |
|---|---|
SELECT * EXCLUDE (col), REPLACE (a AS b) |
Drop or rename columns from * |
| Trailing commas | SELECT a, b, FROM t |
FROM-first |
FROM t SELECT a, b |
GROUP BY ALL |
SELECT a, b, sum(c) FROM t GROUP BY ALL |
ORDER BY ALL |
Order by every output column. |
SELECT DISTINCT ON (col) |
Postgres-style. |
| List/struct/map literals | [1, 2, 3], {a: 1, b: 'x'}, MAP {1: 'a', 2: 'b'} |
| List comprehensions | [x*2 for x in [1,2,3] if x > 1] |
| Lambda functions | list_transform([1,2,3], x -> x*2) |
Asterisk expansion in COUNT |
COUNT(* EXCLUDE (col)) |
| Range and generate_series | range(1, 10), generate_series('2024-01-01'::date, '2024-12-01'::date, INTERVAL 1 MONTH) |
| Nested aggregates with FILTER | SUM(x) FILTER (WHERE x > 0) |
| Pivot / unpivot | PIVOT t ON year USING sum(amount), UNPIVOT t ON ... |
| Native nested types | LIST<T>, STRUCT<...>, MAP<K, V>, UNION |
| Nested subscripts | mylist[1], mystruct.a, mymap['key'] |
WITH RECURSIVE and WITH ... AS MATERIALIZED |
Standard plus DuckDB hints. |
QUALIFY |
Filter on window-function outputs without a subquery. |
The grammar lives in src/parser/peg/. The DuckDB user manual at https://duckdb.org/docs/current/sql/ is the authoritative reference.
Type system
DuckDB has a richer type system than most embedded engines:
- Numeric:
BOOLEAN,TINYINT/UTINYINT,SMALLINT/USMALLINT,INTEGER/UINTEGER,BIGINT/UBIGINT,HUGEINT/UHUGEINT(128-bit),FLOAT,DOUBLE,DECIMAL(p, s). - Temporal:
DATE,TIME,TIME WITH TIME ZONE,TIMESTAMP,TIMESTAMP WITH TIME ZONE,INTERVAL. - String / blob:
VARCHAR(no length limit),BLOB,BIT,JSON(with extension),UUID. - Nested:
LIST<T>,STRUCT<f1 T1, f2 T2, ...>,MAP<K, V>,UNION(name T, ...),ARRAY<T, N>(fixed-size). - Spatial / geometry:
GEOMETRY(with the spatial extension),BIGNUM,VARINT. - Variant:
VARIANT— a self-describing dynamic type used byfrom_jsonand Parquet shredding.
LogicalType (src/include/duckdb/common/types.hpp) is the runtime representation. enum_util.cpp contains the generated string conversions for type IDs.
Casting and promotion
src/function/cast_rules.cpp defines the implicit-cast cost matrix:
- Exact match: 0 cost.
- Lossless implicit cast (
INT→BIGINT,INT→DOUBLE): low cost. - Lossy implicit cast (
DOUBLE→INT): high cost. - No cast: not allowed implicitly.
FunctionBinder (src/function/function_binder.cpp) uses these costs to pick the cheapest function overload during binding.
Explicit casts use CAST(x AS T) or the x::T shorthand. Cast bodies are in src/function/cast/ and the bound representation is BoundCastExpression.
Identifier resolution
| Bare name | Resolution |
|---|---|
t |
Try each schema in search_path. |
s.t |
Look up table t in schema s. |
db.s.t |
Cross-database: look up table t in schema s of database db. |
t.col |
Column col of table t in current scope. |
SET search_path = 'main, my_schema' configures the search path. Implementation: src/catalog/catalog_search_path.cpp. Cross-database resolution goes through DatabaseManager (src/main/database_manager.cpp).
Prepared statements and parameters
PREPARE q AS SELECT * FROM t WHERE id = $1;
EXECUTE q(42);Or via the C/C++ API:
auto stmt = conn.Prepare("SELECT * FROM t WHERE id = $1");
auto result = stmt->Execute(42);Implementation: src/main/prepared_statement.cpp, src/main/prepared_statement_data.cpp. The bound plan is cached; EXECUTE only re-runs the parameter binding and the executor.
Subqueries and decorrelation
Correlated subqueries (SELECT ... WHERE EXISTS (SELECT 1 FROM ... WHERE outer.x = inner.y)) are decorrelated by src/planner/subquery/. The planner emits delim-join nodes that the optimizer's Deliminator pass eliminates.
Window functions
OVER (PARTITION BY ... ORDER BY ... ROWS/RANGE BETWEEN ...) is supported with all standard frames (RANGE, ROWS, GROUPS), exclusion clauses (EXCLUDE CURRENT ROW, EXCLUDE GROUP, etc.), and the QUALIFY clause for filtering on window outputs.
Window-function infrastructure lives in src/function/window/ and src/execution/operator/projection/physical_window.cpp.
EXPLAIN and profiling
EXPLAIN SELECT ... ; -- print physical plan
EXPLAIN ANALYZE SELECT ...; -- run + print plan with timings
PRAGMA enable_profiling = 'json';Plan rendering uses src/common/tree_renderer/. Profile collection is src/main/query_profiler.cpp.
Where to look
- Grammar:
src/parser/peg/and the README there. - AST classes:
src/parser/expression/,src/parser/statement/. - Binder:
src/planner/binder.cpp. - Optimizer pipeline:
src/optimizer/optimizer.cpp. - Function library: extensions/core-functions.
- Built-in datetime functions with timezone awareness: extensions/icu.
Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.