duckdb/duckdb

Catalog

Active contributors: Tishj, Mytherin, Mark Raasveldt

Purpose

src/catalog/ is the metadata layer. It holds schemas, tables, views, sequences, indexes, types, and functions, with full transactional semantics: catalog mutations participate in MVCC just like data mutations. Every name resolution in the planner and every DDL operator in the execution engine goes through this layer.

Directory layout

src/catalog/
├── catalog.cpp                  Top-level Catalog facade
├── duck_catalog.cpp             DuckDB-native Catalog implementation
├── catalog_entry.cpp            Base for all catalog entries
├── catalog_entry_retriever.cpp  Path resolution + dependency tracking on lookup
├── catalog_search_path.cpp      Search-path semantics (database.schema.entry)
├── catalog_set.cpp              Versioned hash map of CatalogEntry chains
├── catalog_transaction.cpp      Per-transaction view of the catalog
├── dependency_manager.cpp       Tracks catalog object dependencies
├── dependency_list.cpp          Per-entry dependency list
├── dependency_catalog_set.cpp   Hash map for dependencies
├── entry_lookup_info.cpp        Resolution-context payload
├── similar_catalog_entry.cpp    "Did you mean ..." helper
├── catalog_entry/               Concrete CatalogEntry subclasses
└── default/                     Default schemas, types, functions registered at startup

Key abstractions

| Type | File | Role |
| --- | --- | --- |
| Catalog | src/include/duckdb/catalog/catalog.hpp | Per-database facade. The DuckDB implementation is DuckCatalog; storage extensions can plug in their own (e.g., PostgresCatalog). |
| DuckCatalog | src/catalog/duck_catalog.cpp | DuckDB-native implementation: holds the schemas catalog set and the dependency manager. |
| CatalogEntry | src/include/duckdb/catalog/catalog_entry.hpp | Versioned entry. Subclasses include SchemaCatalogEntry, TableCatalogEntry, ViewCatalogEntry, SequenceCatalogEntry, IndexCatalogEntry, TypeCatalogEntry, ScalarFunctionCatalogEntry, AggregateFunctionCatalogEntry, TableFunctionCatalogEntry, PragmaFunctionCatalogEntry, MacroCatalogEntry. |
| CatalogSet | src/catalog/catalog_set.cpp | A versioned hash map of CatalogEntry chains. Each name maps to the latest version, with older versions visible to older snapshots. |
| CatalogSearchPath | src/catalog/catalog_search_path.cpp | Implements SET search_path = ...; resolves bare names by trying each schema in the path. |
| CatalogEntryRetriever | src/catalog/catalog_entry_retriever.cpp | The mid-layer that all binders use to look up entries: it performs path resolution, error formatting, and dependency tracking. |
| DependencyManager | src/catalog/dependency_manager.cpp | Records which catalog objects depend on which. Used to enforce DROP semantics (CASCADE / RESTRICT). |
| CatalogTransaction | src/catalog/catalog_transaction.cpp | Per-transaction projection of the catalog: filters out entries the transaction cannot see. |

How it works

graph TD
    DDL[CREATE TABLE / CREATE VIEW / ...] -->|via Binder + executor| Catalog[Catalog::CreateEntry]
    Catalog -->|insert version| Set[CatalogSet]
    Catalog -->|record undo| Undo[UndoBuffer]
    Lookup[Binder lookup] -->|CatalogEntryRetriever| Retriever
    Retriever -->|resolve search path| Search[CatalogSearchPath]
    Retriever --> Catalog
    Catalog --> Set
    Set --> Entry[CatalogEntry chain head]

Versioning

Every mutation produces a new CatalogEntry that is linked to the previous head through a version pointer, similar to row versioning in DataTable. For each name, a transaction sees the version with the highest commit id that is <= its snapshot id. On commit, the new entry's commit_id is set; on rollback, the new entry is removed and the previous head is restored.

This makes CREATE TABLE, ALTER TABLE, and DROP TABLE fully ACID and undoable, just like row writes.
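The version-chain idea can be illustrated with a small, self-contained toy model. This is only a sketch under simplifying assumptions: ToyEntry and ToyCatalogSet are hypothetical names invented for this example, not DuckDB classes, and the model ignores details such as a transaction seeing its own uncommitted changes.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <optional>
#include <string>
#include <utility>

// Toy model only: these types are hypothetical, not DuckDB's CatalogSet.
struct ToyEntry {
    std::string payload;              // e.g. a table definition
    bool deleted = false;             // tombstone written by a DROP
    uint64_t commit_id = 0;           // 0 means "not committed yet"
    std::unique_ptr<ToyEntry> older;  // previous version in the chain
};

class ToyCatalogSet {
public:
    // Push a new, uncommitted version on top of the chain for `name`.
    void Put(const std::string &name, std::string payload, bool deleted = false) {
        auto entry = std::make_unique<ToyEntry>();
        entry->payload = std::move(payload);
        entry->deleted = deleted;
        entry->older = std::move(heads_[name]);
        heads_[name] = std::move(entry);
    }

    // Commit: stamp the head version with the commit id.
    void Commit(const std::string &name, uint64_t commit_id) {
        heads_.at(name)->commit_id = commit_id;
    }

    // Rollback: remove the head version and restore the previous one.
    void Rollback(const std::string &name) {
        auto &head = heads_.at(name);
        head = std::move(head->older);
        if (!head) {
            heads_.erase(name);
        }
    }

    // Return the newest committed version with commit_id <= snapshot_id.
    std::optional<std::string> Get(const std::string &name, uint64_t snapshot_id) const {
        auto it = heads_.find(name);
        if (it == heads_.end()) {
            return std::nullopt;
        }
        for (const ToyEntry *e = it->second.get(); e; e = e->older.get()) {
            if (e->commit_id != 0 && e->commit_id <= snapshot_id) {
                if (e->deleted) {
                    return std::nullopt;  // the visible version is a DROP
                }
                return e->payload;
            }
        }
        return std::nullopt;
    }

private:
    std::map<std::string, std::unique_ptr<ToyEntry>> heads_;
};
```

In this model an older snapshot keeps reading the older version even after a newer one commits, and a rolled-back ALTER leaves the chain exactly as it was before.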

Dependency tracking

A view depends on the tables it references. A function depends on the types it uses. The DependencyManager records these edges as the catalog is mutated. DROP ... CASCADE walks the dependency graph and drops dependents; DROP ... RESTRICT errors if dependents exist.

When a catalog entry is loaded for binding, CatalogEntryRetriever records a dependency of the binder's parent object (e.g., the view being created) on that entry, so that the new object's dependency list is complete.
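The CASCADE / RESTRICT behaviour described above amounts to a graph walk. The following is a minimal sketch of that idea, assuming objects are identified by plain strings; ToyDependencyGraph and its methods are hypothetical names for illustration and do not mirror the DependencyManager interface.

```cpp
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Toy dependency graph; hypothetical, not DuckDB's DependencyManager.
class ToyDependencyGraph {
public:
    // Record that `dependent` (e.g. a view) depends on `object` (e.g. a table).
    void AddDependency(const std::string &dependent, const std::string &object) {
        dependents_[object].insert(dependent);
    }

    // Drop `object` and return everything that was dropped, dependents first.
    // With cascade == false (RESTRICT), throw if anything depends on `object`.
    std::vector<std::string> Drop(const std::string &object, bool cascade) {
        auto it = dependents_.find(object);
        if (!cascade && it != dependents_.end() && !it->second.empty()) {
            throw std::runtime_error("cannot drop " + object + ": other objects depend on it");
        }
        std::vector<std::string> dropped;
        std::set<std::string> visited;
        Collect(object, visited, dropped);
        // Remove the dropped objects from the graph.
        for (const auto &name : dropped) {
            dependents_.erase(name);
            for (auto &kv : dependents_) {
                kv.second.erase(name);
            }
        }
        return dropped;
    }

private:
    // Post-order depth-first walk: dependents are emitted before the
    // object they depend on.
    void Collect(const std::string &object, std::set<std::string> &visited,
                 std::vector<std::string> &out) const {
        if (!visited.insert(object).second) {
            return;  // already handled
        }
        auto it = dependents_.find(object);
        if (it != dependents_.end()) {
            for (const auto &dep : it->second) {
                Collect(dep, visited, out);
            }
        }
        out.push_back(object);
    }

    std::map<std::string, std::set<std::string>> dependents_;
};
```

For a table t with a view v1 on it and a view v2 on v1, a restricted drop of t throws, while a cascading drop removes v2, then v1, then t.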

Search path

DuckDB supports SET search_path = 'a, main'. CatalogSearchPath implements this: bare names are tried against each schema in the path, in order. Cross-database lookups use database.schema.name syntax; the database part is resolved through DatabaseManager (src/main/database_manager.cpp).
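As a rough illustration of in-order resolution, the sketch below parses a comma-separated path and returns the first schema that contains a bare name. ToySchemas, ParseSearchPath, and ResolveSchema are hypothetical helpers written for this example; the real CatalogSearchPath also handles database qualifiers and other details not modelled here.

```cpp
#include <map>
#include <optional>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical toy model: schema name -> set of entry names in that schema.
using ToySchemas = std::map<std::string, std::set<std::string>>;

// Split a comma-separated path such as "a, main" into trimmed schema names.
std::vector<std::string> ParseSearchPath(const std::string &path) {
    std::vector<std::string> result;
    std::stringstream stream(path);
    std::string item;
    while (std::getline(stream, item, ',')) {
        auto begin = item.find_first_not_of(" \t");
        if (begin == std::string::npos) {
            continue;  // skip empty segments
        }
        auto end = item.find_last_not_of(" \t");
        result.push_back(item.substr(begin, end - begin + 1));
    }
    return result;
}

// Return the first schema, in path order, that contains `name`.
std::optional<std::string> ResolveSchema(const ToySchemas &schemas,
                                         const std::vector<std::string> &path,
                                         const std::string &name) {
    for (const auto &schema : path) {
        auto it = schemas.find(schema);
        if (it != schemas.end() && it->second.count(name) > 0) {
            return schema;
        }
    }
    return std::nullopt;
}
```

If both a and main contain an entry called orders, the path 'a, main' resolves the bare name orders to schema a; a name that exists only in main still resolves, just later in the walk.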

Default entries

src/catalog/default/ registers the standard schemas (main, system, temp, pg_catalog for compatibility) and types/functions at database startup. The default_*.cpp files have lazy loaders so that the catalog only materializes built-ins on demand.
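The lazy-load idea can be sketched as a registry that stores recipes for built-ins and only builds an entry the first time it is looked up. ToyDefaultGenerator is a hypothetical name for this illustration, and entries are reduced to strings; the actual default_*.cpp loaders are structured differently.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical lazy registry of built-in entries (entries are plain strings here).
class ToyDefaultGenerator {
public:
    using Factory = std::function<std::string()>;

    // Register a recipe for a built-in without building it.
    void Register(const std::string &name, Factory factory) {
        factories_[name] = std::move(factory);
    }

    // Build the entry on first lookup and cache it afterwards.
    // Returns nullptr when no built-in with this name is registered.
    const std::string *Lookup(const std::string &name) {
        auto cached = materialized_.find(name);
        if (cached != materialized_.end()) {
            return &cached->second;
        }
        auto it = factories_.find(name);
        if (it == factories_.end()) {
            return nullptr;
        }
        auto inserted = materialized_.emplace(name, it->second());
        return &inserted.first->second;
    }

    // Number of built-ins that have actually been built so far.
    std::size_t MaterializedCount() const {
        return materialized_.size();
    }

private:
    std::map<std::string, Factory> factories_;
    std::map<std::string, std::string> materialized_;
};
```

The design trade-off is that startup only pays for registering recipes; the cost of constructing an entry is deferred until something actually asks for it, and is paid once.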

Storage extensions

Catalog is abstract enough that a storage extension (e.g., the postgres-scanner) can supply its own catalog backed by a remote system. Catalog::Attach calls into the relevant StorageExtension to construct that catalog. MetaTransaction (see the transaction page) coordinates writes across attached catalogs.
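The pluggable-catalog pattern can be shown with a heavily reduced model: an abstract interface, a native implementation, a stand-in remote implementation, and a registry that picks one at attach time. Every name in this sketch (ToyCatalog, ToyNativeCatalog, ToyRemoteCatalog, ToyAttachRegistry) is hypothetical, and the interface is far smaller than the real Catalog and StorageExtension classes.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical, heavily reduced catalog interface (not DuckDB's Catalog class).
class ToyCatalog {
public:
    virtual ~ToyCatalog() = default;
    virtual std::optional<std::string> GetEntry(const std::string &name) const = 0;
    virtual void CreateEntry(const std::string &name, const std::string &definition) = 0;
};

// Native implementation: entries live in an in-memory map.
class ToyNativeCatalog : public ToyCatalog {
public:
    std::optional<std::string> GetEntry(const std::string &name) const override {
        auto it = entries_.find(name);
        if (it == entries_.end()) {
            return std::nullopt;
        }
        return it->second;
    }
    void CreateEntry(const std::string &name, const std::string &definition) override {
        entries_[name] = definition;
    }

private:
    std::map<std::string, std::string> entries_;
};

// Stand-in for a catalog backed by a remote system; read-only in this sketch.
class ToyRemoteCatalog : public ToyCatalog {
public:
    std::optional<std::string> GetEntry(const std::string &name) const override {
        return "remote table " + name;  // a real one would query the remote system
    }
    void CreateEntry(const std::string &, const std::string &) override {
        throw std::runtime_error("remote catalog is read-only in this sketch");
    }
};

// Maps an attach type to a catalog factory, playing the role of an extension registry.
class ToyAttachRegistry {
public:
    using Factory = std::function<std::unique_ptr<ToyCatalog>()>;

    void Register(const std::string &type, Factory factory) {
        factories_[type] = std::move(factory);
    }

    // An empty type selects the native catalog; anything else must be registered.
    std::unique_ptr<ToyCatalog> Attach(const std::string &type) const {
        if (type.empty()) {
            return std::make_unique<ToyNativeCatalog>();
        }
        auto it = factories_.find(type);
        if (it == factories_.end()) {
            throw std::runtime_error("unknown storage type: " + type);
        }
        return it->second();
    }

private:
    std::map<std::string, Factory> factories_;
};
```

Callers only hold a ToyCatalog pointer, so lookup code does not need to know whether the entry comes from local storage or from a remote system.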

Integration points

  • Binder: Every name resolution flows through CatalogEntryRetriever (used by Binder::BindXxx).
  • Execution: DDL operators (PhysicalCreateTable, PhysicalAlter, PhysicalDrop, etc., in src/execution/operator/schema/) call into Catalog.
  • Transactions: Catalog mutations write CATALOG_ENTRY undo entries (see the transaction page). On commit, they emit WAL records.
  • Storage: Persisted catalog metadata lives in metadata blocks (src/storage/metadata/).
  • Functions: Built-in functions are registered through this layer at startup (see the function page).

Entry points for modification

  • Adding a new catalog object kind: subclass CatalogEntry, place it in src/catalog/catalog_entry/, register it in Catalog::CreateEntry/DropEntry/CreateXxx, and add serialization in src/storage/serialization/.
  • Adding a default schema or built-in: see src/catalog/default/ for the lazy-load pattern.
  • Customizing name resolution (e.g., a new schema-like construct): edit CatalogSearchPath and CatalogEntryRetriever.
  • Implementing a storage extension's catalog: subclass Catalog, supply GetEntry, CreateEntry, etc., and register a StorageExtension (src/main/extension.cpp).

Key source files

| File | Purpose |
| --- | --- |
| src/catalog/catalog.cpp | Catalog facade and helpers. |
| src/catalog/duck_catalog.cpp | DuckDB-native catalog implementation. |
| src/catalog/catalog_set.cpp | The versioned hash map. |
| src/catalog/catalog_entry_retriever.cpp | Used by every name lookup. |
| src/catalog/catalog_search_path.cpp | SET search_path semantics. |
| src/catalog/dependency_manager.cpp | Object dependency graph. |
| src/catalog/catalog_entry/ | Concrete entry subclasses. |
| src/catalog/default/ | Built-in schemas/types/functions. |

Continue to the function page for how scalar, aggregate, and table functions are registered into the catalog.

Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.
