duckdb/duckdb

Transaction

Active contributors: Mytherin, Mark Raasveldt, taniabogatsch

Purpose

src/transaction/ implements ACID transactions with snapshot isolation under MVCC (multi-version concurrency control). Every transaction reads from a stable snapshot and produces undo entries for uncommitted writes; on commit, those writes become visible to new transactions and are emitted to the WAL.

Directory layout

src/transaction/
├── transaction.cpp                Base class for Transaction
├── transaction_context.cpp        Per-connection transaction state
├── transaction_manager.cpp        Abstract base
├── duck_transaction.cpp           Concrete DuckDB transaction
├── duck_transaction_manager.cpp   Concrete transaction manager
├── meta_transaction.cpp           Coordinates transactions across attached databases
├── undo_buffer.cpp                Per-transaction undo log
├── undo_buffer_allocator.cpp      Arena allocator for undo entries
├── commit_state.cpp               Apply commit: install new versions, write WAL
├── rollback_state.cpp             Apply rollback: restore old versions
├── cleanup_state.cpp              Garbage-collect undo entries no longer needed
└── wal_write_state.cpp            Build WAL records from undo entries

Key abstractions

| Type | File | Role |
|---|---|---|
| Transaction | src/include/duckdb/transaction/transaction.hpp | Base class. The DuckDB-native subclass is DuckTransaction; storage extensions can provide their own. |
| DuckTransaction | src/transaction/duck_transaction.cpp | A single transaction, with a snapshot ID and an UndoBuffer. |
| TransactionManager | src/include/duckdb/transaction/transaction_manager.hpp | Issues snapshot/commit IDs, tracks active transactions, drives cleanup. |
| DuckTransactionManager | src/transaction/duck_transaction_manager.cpp | Default implementation. |
| MetaTransaction | src/transaction/meta_transaction.cpp | Coordinates multiple per-database transactions inside a multi-DB query. |
| UndoBuffer | src/transaction/undo_buffer.cpp | Per-transaction list of undo entries (CATALOG, INSERT, UPDATE, DELETE). |
| CommitState / RollbackState / CleanupState | commit_state.cpp, rollback_state.cpp, cleanup_state.cpp | Visitors over the undo buffer that apply the corresponding action per entry. |
| WALWriteState | wal_write_state.cpp | Builds WAL records from undo entries during commit. |
| TransactionContext | transaction_context.cpp | Per-ClientContext state: current transaction, auto-commit mode, savepoints. |

How it works

sequenceDiagram
    participant C as ClientContext
    participant TM as DuckTransactionManager
    participant T as DuckTransaction
    participant LS as LocalStorage
    participant W as WAL
    C->>TM: BeginTransaction
    TM->>T: assign snapshot_id
    C->>T: read DataTable (sees only versions <= snapshot_id)
    C->>LS: write into local_storage (uncommitted)
    C->>T: append undo entries
    C->>TM: Commit
    TM->>T: assign commit_id
    T->>W: emit WAL records (CommitState + WALWriteState)
    T->>LS: install versions in DataTable + cleanup undo
    TM->>TM: GC older versions when no live snapshot needs them

Snapshots and IDs

Every DuckTransaction receives a monotonically increasing snapshot_id when it starts. A read over a versioned table accepts a row version if the transaction that created it either has already committed with a commit_id lower than the reader's snapshot_id or is the reading transaction itself.

On commit, the transaction is assigned a commit_id (also monotonic, drawn from the same sequence as snapshot_ids in newer versions). Updates and deletes leave version chains in the row data; the chain head holds the latest committed version, and older versions sit behind it until cleanup determines no live snapshot needs them.
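The visibility rule described above can be sketched as a single predicate. This is a minimal illustration, not DuckDB's implementation: the types RowVersion and Reader, the NOT_COMMITTED marker, and the function IsVisible are all hypothetical names introduced here, and the strict `<` comparison follows the prose ("a lower commit_id").

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical, simplified types for illustration; not DuckDB's actual classes.
using txn_id_t = std::uint64_t;

// Assumed marker meaning "the creating transaction has not committed yet".
constexpr txn_id_t NOT_COMMITTED = std::numeric_limits<txn_id_t>::max();

struct RowVersion {
    txn_id_t creator_txn;  // transaction that wrote this version
    txn_id_t commit_id;    // NOT_COMMITTED until the creator commits
};

struct Reader {
    txn_id_t txn;          // the reader's own transaction identifier
    txn_id_t snapshot_id;  // assigned when the reader started
};

// A version is visible if the reader wrote it itself, or if its creator
// committed with a commit_id lower than the reader's snapshot_id.
inline bool IsVisible(const RowVersion &version, const Reader &reader) {
    assert(reader.snapshot_id != NOT_COMMITTED);  // a snapshot is always a real ID
    if (version.creator_txn == reader.txn) {
        return true;
    }
    return version.commit_id != NOT_COMMITTED && version.commit_id < reader.snapshot_id;
}
```

With a reader at snapshot 10, a version committed at 5 is visible, one committed at 12 is not, and the reader's own uncommitted writes are always visible.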

Undo buffer

Every uncommitted change records an undo entry:

| Entry kind | Source | Effect on rollback |
|---|---|---|
| CATALOG_ENTRY | Catalog::CreateEntry etc. | Drop the new entry, restore the old. |
| INSERT_TUPLE | DataTable::Append | Mark the row tombstoned. |
| UPDATE_TUPLE | DataTable::Update | Restore the previous version chain head. |
| DELETE_TUPLE | DataTable::Delete | Restore the row in the version chain. |

Undo entries are allocated from an arena (undo_buffer_allocator.cpp), so recording an entry is cheap and aborts pay no per-entry allocator overhead.
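To make the rollback effects concrete, here is a toy undo log covering the three tuple-level entry kinds. It is a sketch under simplifying assumptions: ToyTable, Row, and UndoEntry are hypothetical, rows hold a single int, a plain std::vector stands in for the arena-backed UndoBuffer, and catalog entries are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified model; the names echo the entry kinds in the text
// but these are not DuckDB's real types.
enum class UndoKind { INSERT_TUPLE, UPDATE_TUPLE, DELETE_TUPLE };

struct Row {
    int value;
    bool deleted;
};

struct UndoEntry {
    UndoKind kind;
    std::size_t row;  // index of the affected row
    int old_value;    // only meaningful for UPDATE_TUPLE
};

struct ToyTable {
    std::vector<Row> rows;
    std::vector<UndoEntry> undo;  // stands in for the per-transaction UndoBuffer

    std::size_t Append(int value) {
        rows.push_back(Row{value, false});
        undo.push_back(UndoEntry{UndoKind::INSERT_TUPLE, rows.size() - 1, 0});
        return rows.size() - 1;
    }
    void Update(std::size_t i, int value) {
        assert(i < rows.size());
        undo.push_back(UndoEntry{UndoKind::UPDATE_TUPLE, i, rows[i].value});
        rows[i].value = value;
    }
    void Delete(std::size_t i) {
        assert(i < rows.size());
        undo.push_back(UndoEntry{UndoKind::DELETE_TUPLE, i, 0});
        rows[i].deleted = true;
    }
    // Undo newest-first so that stacked changes to one row unwind correctly.
    void Rollback() {
        for (auto it = undo.rbegin(); it != undo.rend(); ++it) {
            switch (it->kind) {
            case UndoKind::INSERT_TUPLE: rows[it->row].deleted = true; break;
            case UndoKind::UPDATE_TUPLE: rows[it->row].value = it->old_value; break;
            case UndoKind::DELETE_TUPLE: rows[it->row].deleted = false; break;
            }
        }
        undo.clear();
    }
    // On commit the changes stay in place; this toy version just drops the log.
    void Commit() { undo.clear(); }
};
```

Walking the log in reverse matters: if a row is updated twice in one transaction, the second undo entry must be applied first so the row ends up at its original value.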

Commit path

CommitState walks the undo buffer and:

  1. Validates constraints / indexes against the live state.
  2. Writes WAL records via WALWriteState.
  3. Installs the new versions in DataTable and the catalog.
  4. Records the commit_id on each version.

If any step fails, the transaction aborts and RollbackState undoes anything already partially applied.
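The ordering and failure handling above can be sketched as a list of steps, each paired with a way to revert it. This only illustrates the control flow: CommitStep and RunCommit are hypothetical names, and the real CommitState operates on undo entries rather than on callbacks.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical sketch of the step ordering; not DuckDB's CommitState API.
struct CommitStep {
    std::string name;
    std::function<bool()> apply;   // returns false on failure
    std::function<void()> revert;  // undoes a successful apply
};

// Runs the steps in order. If one fails, reverts the steps that already
// succeeded, newest first, and returns the failing step's name.
// Returns an empty string when every step succeeds.
inline std::string RunCommit(const std::vector<CommitStep> &steps) {
    for (std::size_t i = 0; i < steps.size(); ++i) {
        if (!steps[i].apply()) {
            for (std::size_t j = i; j-- > 0;) {
                steps[j].revert();
            }
            return steps[i].name;
        }
    }
    return "";
}
```

A caller would register one step per stage (validate, write WAL, install versions, stamp commit_id); a failure at "install" then reverts the WAL write and the validation bookkeeping, in that order.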

Cleanup

CleanupState is called periodically by DuckTransactionManager to garbage-collect committed versions that no live snapshot can see. It walks version chains and prunes obsolete versions, returning blocks to the buffer manager.
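The "no live snapshot can see it" test reduces to comparing against the lowest snapshot still in use. The sketch below assumes a row's version chain is represented as a flat, ascending list of commit IDs and reuses the strict `<` visibility rule from the prose; LowestActiveSnapshot and PruneChain are hypothetical helpers, not functions in cleanup_state.cpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch; DuckDB's CleanupState works on undo entries,
// not on a flat vector of commit IDs.
using txn_id_t = std::uint64_t;

// Oldest snapshot still in use. With no active transactions, the bound is
// `next_id`, the ID the next transaction would receive.
inline txn_id_t LowestActiveSnapshot(const std::vector<txn_id_t> &active, txn_id_t next_id) {
    if (active.empty()) {
        return next_id;
    }
    return *std::min_element(active.begin(), active.end());
}

// `commit_ids` holds the commit IDs of one row's versions, oldest first.
// A reader at snapshot S sees the newest version with commit_id < S, so every
// version older than the one visible to the lowest active snapshot is
// unreachable and can be dropped. Returns the number of pruned versions.
inline std::size_t PruneChain(std::vector<txn_id_t> &commit_ids, txn_id_t lowest_snapshot) {
    assert(std::is_sorted(commit_ids.begin(), commit_ids.end()));
    // First version the lowest snapshot cannot see (commit_id >= snapshot).
    auto first_invisible = std::lower_bound(commit_ids.begin(), commit_ids.end(), lowest_snapshot);
    if (first_invisible - commit_ids.begin() <= 1) {
        return 0;  // zero or one visible version: nothing is obsolete
    }
    auto keep_from = first_invisible - 1;  // newest version visible to that snapshot
    std::size_t pruned = static_cast<std::size_t>(keep_from - commit_ids.begin());
    commit_ids.erase(commit_ids.begin(), keep_from);
    return pruned;
}
```

For a chain with commit IDs 2, 5, 8, 12 and a lowest active snapshot of 10, versions 2 and 5 are pruned: the oldest reader sees version 8, and later readers see 8 or 12.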

Multi-database (MetaTransaction)

DuckDB's ATTACH statement brings multiple databases (native DuckDB files, or external systems exposed through storage extensions) into one connection. A MetaTransaction coordinates the per-database transactions so that they begin, commit, and roll back together. Each attached database can have its own TransactionManager; for example, a Postgres storage extension would run a real Postgres transaction.
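The coordination role can be pictured as a map from attached database to its transaction, with commit and rollback fanned out to every entry. This is a rough sketch: ToyTransaction and ToyMetaTransaction are hypothetical, and starting each per-database transaction on first use is an assumption of the sketch rather than something the text above states.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical sketch; the real MetaTransaction API differs.
struct ToyTransaction {
    std::string database;
    bool committed;
    bool rolled_back;
};

class ToyMetaTransaction {
public:
    // Returns the transaction for `database`, creating it on first use
    // (lazy start is an assumption of this sketch).
    ToyTransaction &GetTransaction(const std::string &database) {
        auto it = transactions_.find(database);
        if (it == transactions_.end()) {
            ToyTransaction txn{database, false, false};
            it = transactions_.emplace(database, txn).first;
        }
        return it->second;
    }
    // Commit and rollback are applied to every participating database.
    void Commit() {
        for (auto &entry : transactions_) {
            entry.second.committed = true;
        }
    }
    void Rollback() {
        for (auto &entry : transactions_) {
            entry.second.rolled_back = true;
        }
    }
    std::size_t Size() const { return transactions_.size(); }

private:
    std::map<std::string, ToyTransaction> transactions_;
};
```

In this model a query touching both a native database and an external one ends up with two entries, and a single Commit call marks both as committed.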

Isolation level

DuckDB offers a single isolation level, snapshot isolation: a transaction sees a consistent snapshot of all data as of its start, and write conflicts (two transactions writing the same row) are detected on commit. It has no separate read uncommitted, repeatable read, or serializable modes.

Integration points

  • Storage: Versioned reads come from DataTable via LocalStorage and the version chains in row groups (src/storage/).
  • Catalog: Catalog mutations go through the same undo buffer as data writes (entries of kind CATALOG_ENTRY). See catalog.
  • WAL: Commit produces WAL records via wal_write_state.cpp. Replay on startup reapplies them. See storage.
  • Client: ClientContext::BeginTransaction, Commit, Rollback route through TransactionContext.

Entry points for modification

  • Adding a new undo entry kind: extend UndoFlags, add a Process* method in each of CommitState, RollbackState, CleanupState, and an emitter in the operator that produces the change.
  • Hooking a custom transaction into an attached database: implement a storage extension that supplies its own Transaction and TransactionManager subclasses; MetaTransaction will coordinate them.
  • Tuning when cleanup runs: see DuckTransactionManager::Checkpoint/CleanupState. The trigger condition is in duck_transaction_manager.cpp.

Key source files

| File | Purpose |
|---|---|
| src/transaction/duck_transaction_manager.cpp | Snapshot/commit ID issuance, active-transaction tracking. |
| src/transaction/duck_transaction.cpp | Per-transaction state and read-version checks. |
| src/transaction/undo_buffer.cpp | The undo log. |
| src/transaction/commit_state.cpp | Commit application. |
| src/transaction/rollback_state.cpp | Rollback application. |
| src/transaction/cleanup_state.cpp | Garbage collection. |
| src/transaction/wal_write_state.cpp | WAL emission during commit. |
| src/transaction/meta_transaction.cpp | Multi-DB coordination. |

Continue to storage for the layer that the transaction system reads/writes against, or catalog for how DDL is versioned.

Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.
