Open-Source Wikis

/

DuckDB

/

Features

/

SQL frontend

duckdb/duckdb

SQL frontend

DuckDB's SQL dialect, types, and the path a query takes from text to AST. Implementation is in systems/parser, systems/planner, and systems/optimizer.

Dialect highlights

DuckDB markets itself as having "Friendly SQL". A few notable features beyond standard SQL:

Feature Example
SELECT * EXCLUDE (col), REPLACE (a AS b) Drop or rename columns from *
Trailing commas SELECT a, b, FROM t
FROM-first FROM t SELECT a, b
GROUP BY ALL SELECT a, b, sum(c) FROM t GROUP BY ALL
ORDER BY ALL Order by every output column.
SELECT DISTINCT ON (col) Postgres-style.
List/struct/map literals [1, 2, 3], {a: 1, b: 'x'}, MAP {1: 'a', 2: 'b'}
List comprehensions [x*2 for x in [1,2,3] if x > 1]
Lambda functions list_transform([1,2,3], x -> x*2)
Asterisk expansion in COUNT COUNT(* EXCLUDE (col))
Range and generate_series range(1, 10), generate_series('2024-01-01'::date, '2024-12-01'::date, INTERVAL 1 MONTH)
Nested aggregates with FILTER SUM(x) FILTER (WHERE x > 0)
Pivot / unpivot PIVOT t ON year USING sum(amount), UNPIVOT t ON ...
Native nested types LIST<T>, STRUCT<...>, MAP<K, V>, UNION
Nested subscripts mylist[1], mystruct.a, mymap['key']
WITH RECURSIVE and WITH ... AS MATERIALIZED Standard plus DuckDB hints.
QUALIFY Filter on window-function outputs without a subquery.

The grammar lives in src/parser/peg/. The DuckDB user manual at https://duckdb.org/docs/current/sql/ is the authoritative reference.

Type system

DuckDB has a richer type system than most embedded engines:

  • Numeric: BOOLEAN, TINYINT/UTINYINT, SMALLINT/USMALLINT, INTEGER/UINTEGER, BIGINT/UBIGINT, HUGEINT/UHUGEINT (128-bit), FLOAT, DOUBLE, DECIMAL(p, s).
  • Temporal: DATE, TIME, TIME WITH TIME ZONE, TIMESTAMP, TIMESTAMP WITH TIME ZONE, INTERVAL.
  • String / blob: VARCHAR (no length limit), BLOB, BIT, JSON (with extension), UUID.
  • Nested: LIST<T>, STRUCT<f1 T1, f2 T2, ...>, MAP<K, V>, UNION(name T, ...), ARRAY<T, N> (fixed-size).
  • Spatial / geometry: GEOMETRY (with the spatial extension), BIGNUM, VARINT.
  • Variant: VARIANT — a self-describing dynamic type used by from_json and Parquet shredding.

LogicalType (src/include/duckdb/common/types.hpp) is the runtime representation. enum_util.cpp contains the generated string conversions for type IDs.

Casting and promotion

src/function/cast_rules.cpp defines the implicit-cast cost matrix:

  • Exact match: 0 cost.
  • Lossless implicit cast (INTBIGINT, INTDOUBLE): low cost.
  • Lossy implicit cast (DOUBLEINT): high cost.
  • No cast: not allowed implicitly.

FunctionBinder (src/function/function_binder.cpp) uses these costs to pick the cheapest function overload during binding.

Explicit casts use CAST(x AS T) or the x::T shorthand. Cast bodies are in src/function/cast/ and the bound representation is BoundCastExpression.

Identifier resolution

Bare name Resolution
t Try each schema in search_path.
s.t Look up table t in schema s.
db.s.t Cross-database: look up table t in schema s of database db.
t.col Column col of table t in current scope.

SET search_path = 'main, my_schema' configures the search path. Implementation: src/catalog/catalog_search_path.cpp. Cross-database resolution goes through DatabaseManager (src/main/database_manager.cpp).

Prepared statements and parameters

PREPARE q AS SELECT * FROM t WHERE id = $1;
EXECUTE q(42);

Or via the C/C++ API:

auto stmt = conn.Prepare("SELECT * FROM t WHERE id = $1");
auto result = stmt->Execute(42);

Implementation: src/main/prepared_statement.cpp, src/main/prepared_statement_data.cpp. The bound plan is cached; EXECUTE only re-runs the parameter binding and the executor.

Subqueries and decorrelation

Correlated subqueries (SELECT ... WHERE EXISTS (SELECT 1 FROM ... WHERE outer.x = inner.y)) are decorrelated by src/planner/subquery/. The planner emits delim-join nodes that the optimizer's Deliminator pass eliminates.

Window functions

OVER (PARTITION BY ... ORDER BY ... ROWS/RANGE BETWEEN ...) is supported with all standard frames (RANGE, ROWS, GROUPS), exclusion clauses (EXCLUDE CURRENT ROW, EXCLUDE GROUP, etc.), and the QUALIFY clause for filtering on window outputs.

Window-function infrastructure lives in src/function/window/ and src/execution/operator/projection/physical_window.cpp.

EXPLAIN and profiling

EXPLAIN SELECT ... ;          -- print physical plan
EXPLAIN ANALYZE SELECT ...;   -- run + print plan with timings
PRAGMA enable_profiling = 'json';

Plan rendering uses src/common/tree_renderer/. Profile collection is src/main/query_profiler.cpp.

Where to look

  • Grammar: src/parser/peg/ and the README there.
  • AST classes: src/parser/expression/, src/parser/statement/.
  • Binder: src/planner/binder.cpp.
  • Optimizer pipeline: src/optimizer/optimizer.cpp.
  • Function library: extensions/core-functions.
  • Built-in datetime functions with timezone awareness: extensions/icu.

Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.

SQL frontend – DuckDB wiki | Factory