duckdb/duckdb

core_functions extension

Purpose

extension/core_functions/ is the bundled SQL function library. It lives in-tree and is included in every default DuckDB build, but it is structured as an extension so that it can be excluded to slim down the engine binary. It contains most of the everyday SQL functions: aggregates, list/string/date/math/blob/struct/map helpers, and the lambda-function support code.

The DuckDB engine in src/function/ contains only the function infrastructure plus a small set of "wired-into-the-language" functions (casts, comparison operators, arithmetic). Almost everything else lives here.

Directory layout

extension/core_functions/
├── core_functions_extension.cpp   Registration entry point
├── function_list.cpp               The big registration table
├── lambda_functions.cpp            Shared support for list_transform / list_filter / list_reduce
├── scalar/                         Scalar functions per domain
│   ├── string/    upper, lower, like, regexp_*, levenshtein, similarity, ...
│   ├── list/      list_filter, list_transform, list_reduce, list_sort, list_aggregate, ...
│   ├── map/       map, map_keys, map_values, map_extract, ...
│   ├── struct/    struct_pack, struct_extract, struct_insert, ...
│   ├── date/      strftime, age, century, isoyear, ...
│   ├── operator/  bitwise + arithmetic + comparison helpers
│   ├── generic/   typeof, error, alias, current_setting, ...
│   ├── system/    version, current_user, hash, can_cast_implicitly, ...
│   ├── geometry/  ST_AsText, ST_GeomFromText (WKT/WKB only - real spatial in spatial extension)
│   ├── sequence/  nextval, currval
│   ├── variant/   variant_extract, variant_typeof, ...
│   └── compressed_materialization/  helpers for the optimizer
├── aggregate/    Aggregate functions per domain (sum, avg, mode, percentile, regr_*, holistic, ...)
└── include/      Public headers

function_list.cpp is one big table that maps each function name to its implementation. To find where a function lives, search this file first.
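To make the "one big table" idea concrete, here is a minimal, self-contained sketch of a name-to-implementation table with a lookup helper. It is a toy model, not the contents of function_list.cpp: the FunctionEntry struct, the kFunctionTable array, the FindFunction helper, and the choice to keep rows sorted for binary search are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstring>
#include <iterator>

// Hypothetical, simplified stand-in for one row of a registration table.
struct FunctionEntry {
    const char *name;        // SQL-visible function name
    double (*impl)(double);  // pointer to the implementation
};

// Thin wrappers so each row can point at a plain function.
static double Cos(double x) { return std::cos(x); }
static double Exp(double x) { return std::exp(x); }
static double Sin(double x) { return std::sin(x); }
static double Sqrt(double x) { return std::sqrt(x); }

// Rows are kept sorted by name so lookup can use binary search
// (an assumption of this sketch).
static const FunctionEntry kFunctionTable[] = {
    {"cos", Cos},
    {"exp", Exp},
    {"sin", Sin},
    {"sqrt", Sqrt},
};

// Returns the table row for `name`, or nullptr if no such function exists.
const FunctionEntry *FindFunction(const char *name) {
    auto first = std::begin(kFunctionTable);
    auto last = std::end(kFunctionTable);
    auto it = std::lower_bound(first, last, name,
                               [](const FunctionEntry &entry, const char *key) {
                                   return std::strcmp(entry.name, key) < 0;
                               });
    if (it == last || std::strcmp(it->name, name) != 0) {
        return nullptr;
    }
    return it;
}
```

With this layout, FindFunction("sqrt") returns the row whose impl computes a square root, while a name that is not in the table yields nullptr. A single central table like this is what makes "search this file first" a workable rule for locating a function.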

What it provides

The extension registers hundreds of functions. Highlights:

| Domain | Examples |
| --- | --- |
| Scalar: string | upper, lower, like, regexp_matches, regexp_extract, levenshtein, jaccard, string_split, printf, format, repeat, split_part |
| Scalar: list | list_value, list_extract, list_concat, list_transform, list_filter, list_reduce, list_sort, list_aggregate, list_distinct, unnest |
| Scalar: map | map, map_keys, map_values, map_extract, map_concat |
| Scalar: struct | struct_pack, struct_extract, struct_insert, row |
| Scalar: date | strftime, strptime, age, date_part, date_trunc (basic; ICU overrides for tz-aware), to_timestamp |
| Scalar: math | pi, sin, cos, tan, pow, sqrt, log, exp, random, setseed, xor |
| Scalar: system | version, current_setting, current_database, current_schema, hash |
| Aggregate: basic | sum, avg, min, max, count, count_star, count_if, string_agg, array_agg, histogram |
| Aggregate: statistical | stddev_pop, stddev_samp, var_pop, var_samp, corr, covar_pop, covar_samp, regr_* |
| Aggregate: holistic | mode, quantile_cont, quantile_disc, median, mad |
| Aggregate: distinct | count(DISTINCT ...), approx_distinct, approx_count_distinct |
| Aggregate: bit | bit_and, bit_or, bit_xor, bitstring_agg |

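For the full list, run SELECT * FROM duckdb_functions(). It is a table function, so it belongs in the FROM clause; its implementation lives in the engine under src/function/table/system/ rather than in this extension.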

How registration works

graph LR
    Load[ExtensionLoader::Load core_functions] --> Reg[CoreFunctionsExtension::Load]
    Reg -->|RegisterScalarFunctions| FL[function_list.cpp]
    Reg -->|RegisterAggregateFunctions| FL
    FL --> Cat[Catalog inserts ScalarFunctionCatalogEntry / AggregateFunctionCatalogEntry]
    Cat --> Bind[FunctionBinder finds them at bind time]

Each scalar/aggregate function provides a *FunctionSet factory in its .cpp file. function_list.cpp is a top-level table that lists every name and points at its factory. At extension load time, the registrar iterates this list and inserts the result into the catalog.
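The factory-plus-table flow described above can be sketched with a toy model. Everything here is hypothetical and deliberately simplified: Overload, FunctionSet, TableRow, Catalog, RegisterAll, and Bind are illustrative names, not DuckDB's real classes, and the "binder" matches overloads by argument count only.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Toy model, not DuckDB's real types: one overload of a scalar function.
struct Overload {
    std::size_t arg_count;
    std::function<long(const std::vector<long> &)> impl;
};

// A "function set": all overloads that share one SQL name.
struct FunctionSet {
    std::string name;
    std::vector<Overload> overloads;
};

// Per-function factories, one per (hypothetical) source file.
FunctionSet AbsFun() {
    return {"abs", {{1, [](const std::vector<long> &a) { return a[0] < 0 ? -a[0] : a[0]; }}}};
}
FunctionSet AddFun() {
    return {"add",
            {{1, [](const std::vector<long> &a) { return a[0]; }},          // unary plus
             {2, [](const std::vector<long> &a) { return a[0] + a[1]; }}}}; // binary add
}

// The central table: every name points at its factory.
struct TableRow {
    const char *name;
    FunctionSet (*factory)();
};
static const TableRow kRows[] = {{"abs", AbsFun}, {"add", AddFun}};

using Catalog = std::unordered_map<std::string, FunctionSet>;

// Load time: walk the table, call each factory, insert the result.
void RegisterAll(Catalog &catalog) {
    for (const auto &row : kRows) {
        catalog.emplace(row.name, row.factory());
    }
}

// Bind time: look the name up, then pick an overload by argument count.
const Overload *Bind(const Catalog &catalog, const std::string &name, std::size_t arg_count) {
    auto it = catalog.find(name);
    if (it == catalog.end()) {
        return nullptr;
    }
    for (const auto &overload : it->second.overloads) {
        if (overload.arg_count == arg_count) {
            return &overload;
        }
    }
    return nullptr;
}
```

After RegisterAll fills a Catalog, Bind(catalog, "add", 2) yields the two-argument overload, and an unknown name or arity yields nullptr. Keeping factories next to their implementations and only names in the central table means adding a function touches one new file plus one table row.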

Lambda functions

lambda_functions.cpp provides the shared support code for the list_transform / list_filter / list_reduce / array_apply family. It defines how lambda parameters bind, how captured columns flow into the inner expression, and how to vectorize the inner evaluation.
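As a rough illustration of what "captured columns flow into the inner expression" and "vectorize the inner evaluation" can mean, the sketch below evaluates a list_transform-style lambda over a flattened child vector. It is a toy under stated assumptions: ListColumn and ListTransform are hypothetical names, elements are plain long values, and there is exactly one captured column. It does not reproduce the code in lambda_functions.cpp.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Toy columnar list column: row i owns child[offsets[i] .. offsets[i + 1]).
struct ListColumn {
    std::vector<std::size_t> offsets;  // size = row count + 1
    std::vector<long> child;           // all list elements, flattened
};

// Applies lambda(x, captured) to every list element.
// `captured` holds one value per row (a column referenced inside the lambda).
ListColumn ListTransform(const ListColumn &input, const std::vector<long> &captured,
                         const std::function<long(long, long)> &lambda) {
    // Step 1: replicate each row's captured value once per list element,
    // so it lines up with the flattened child vector.
    std::vector<long> captured_flat(input.child.size());
    for (std::size_t row = 0; row + 1 < input.offsets.size(); row++) {
        for (std::size_t i = input.offsets[row]; i < input.offsets[row + 1]; i++) {
            captured_flat[i] = captured[row];
        }
    }

    // Step 2: evaluate the lambda in a single flat pass over all elements,
    // independent of how they are grouped into rows.
    ListColumn result;
    result.offsets = input.offsets;  // list boundaries are unchanged
    result.child.resize(input.child.size());
    for (std::size_t i = 0; i < input.child.size(); i++) {
        result.child[i] = lambda(input.child[i], captured_flat[i]);
    }
    return result;
}
```

For three rows holding [1, 2], [], and [3] with captured values 10, 20, and 30, a lambda that adds its two arguments produces [11, 12], [], and [33]. The point of the flattening is that the inner expression runs once over a dense vector instead of once per row.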

Integration points

  • Function registry: Plugs into BuiltinFunctions via the standard extension load path.
  • Optimizer: compressed_materialization/ hosts helper functions that the optimizer introduces during plan rewrites to encode strings/decimals as ints in temp results.
  • Catalog: All registered functions become catalog entries at startup.
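The Optimizer bullet mentions encoding strings as ints in temporary results. The sketch below shows one way such an encoding could work for short strings while preserving sort order. The 7-byte limit, the bit layout, and the names CompressString and DecompressString are assumptions of this sketch, not a description of the helpers in compressed_materialization/.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: packs a string of at most 7 bytes into a uint64_t.
// Layout (an assumption of this sketch): the string bytes fill the high
// 7 bytes in big-endian order, zero-padded; the low byte stores the length.
// Unsigned integer comparison then matches bytewise string comparison.
std::uint64_t CompressString(const std::string &s) {
    if (s.size() > 7) {
        throw std::invalid_argument("string too long for this toy encoding");
    }
    std::uint64_t out = 0;
    for (std::size_t i = 0; i < 7; i++) {
        std::uint64_t byte = i < s.size() ? static_cast<unsigned char>(s[i]) : 0;
        out = (out << 8) | byte;
    }
    return (out << 8) | static_cast<std::uint64_t>(s.size());
}

// Inverse of CompressString: reads the length, then unpacks the bytes.
std::string DecompressString(std::uint64_t v) {
    std::size_t len = static_cast<std::size_t>(v & 0xFF);
    std::string s(len, '\0');
    for (std::size_t i = 0; i < len; i++) {
        s[i] = static_cast<char>((v >> (56 - 8 * i)) & 0xFF);
    }
    return s;
}
```

A fixed-width integer is cheaper to copy, hash, and compare than a variable-length string, which is the motivation for rewriting a plan to compress values before a materializing operator and decompress them afterwards.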

Entry points for modification

  • Adding a function: drop a new .cpp in the appropriate scalar/<area>/ or aggregate/<area>/ directory, declare a *FunctionSet GetFunctions() factory, and add it to function_list.cpp.
  • Adding a new domain: create a scalar/<area>/ directory with a CMakeLists.txt, add it to extension/core_functions/CMakeLists.txt.
  • Tests: test/sql/function/<area>/ matches the source layout.

Key source files

| File | Purpose |
| --- | --- |
| extension/core_functions/core_functions_extension.cpp | Registration entry. |
| extension/core_functions/function_list.cpp | Master function table. |
| extension/core_functions/lambda_functions.cpp | Lambda support. |
| extension/core_functions/scalar/list/list_transform.cpp | Example of a list lambda function. |
| extension/core_functions/aggregate/general/sum.cpp | Example aggregate. |

For the engine-level function infrastructure these plug into, see systems/function.

Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.
