duckdb/duckdb
Transaction
Active contributors: Mytherin, Mark Raasveldt, taniabogatsch
Purpose
src/transaction/ implements ACID transactions with snapshot isolation under MVCC (multi-version concurrency control). Every transaction reads from a stable snapshot and produces undo entries for uncommitted writes; on commit, those writes become visible to new transactions and are emitted to the WAL.
Directory layout
src/transaction/
├── transaction.cpp Base class for Transaction
├── transaction_context.cpp Per-connection transaction state
├── transaction_manager.cpp Abstract base
├── duck_transaction.cpp Concrete DuckDB transaction
├── duck_transaction_manager.cpp Concrete transaction manager
├── meta_transaction.cpp Coordinates transactions across attached databases
├── undo_buffer.cpp Per-transaction undo log
├── undo_buffer_allocator.cpp Arena allocator for undo entries
├── commit_state.cpp Apply commit: install new versions, write WAL
├── rollback_state.cpp Apply rollback: restore old versions
├── cleanup_state.cpp Garbage-collect undo entries no longer needed
└── wal_write_state.cpp Build WAL records from undo entries
Key abstractions
| Type | File | Role |
|---|---|---|
| Transaction | src/include/duckdb/transaction/transaction.hpp | Base class. The DuckDB-native subclass is DuckTransaction; storage extensions can provide their own. |
| DuckTransaction | src/transaction/duck_transaction.cpp | A single transaction, with a snapshot ID and an UndoBuffer. |
| TransactionManager | src/include/duckdb/transaction/transaction_manager.hpp | Issues snapshot/commit IDs, tracks active transactions, drives cleanup. |
| DuckTransactionManager | src/transaction/duck_transaction_manager.cpp | Default implementation. |
| MetaTransaction | src/transaction/meta_transaction.cpp | Coordinates multiple per-database transactions inside a multi-DB query. |
| UndoBuffer | src/transaction/undo_buffer.cpp | Per-transaction list of undo entries (CATALOG, INSERT, UPDATE, DELETE). |
| CommitState / RollbackState / CleanupState | commit_state.cpp, rollback_state.cpp, cleanup_state.cpp | Visitors over the undo buffer that apply the corresponding action per entry. |
| WALWriteState | wal_write_state.cpp | Builds WAL records from undo entries during commit. |
| TransactionContext | transaction_context.cpp | Per-ClientContext state: current transaction, auto-commit mode, savepoints. |
How it works
sequenceDiagram
participant C as ClientContext
participant TM as DuckTransactionManager
participant T as DuckTransaction
participant LS as LocalStorage
participant W as WAL
C->>TM: BeginTransaction
TM->>T: assign snapshot_id
C->>T: read DataTable (sees only versions <= snapshot_id)
C->>LS: write into local_storage (uncommitted)
C->>T: append undo entries
C->>TM: Commit
TM->>T: assign commit_id
T->>W: emit WAL records (CommitState + WALWriteState)
T->>LS: install versions in DataTable + cleanup undo
TM->>TM: GC older versions when no live snapshot needs them
Snapshots and IDs
Every DuckTransaction gets a monotonically increasing snapshot_id at start. Reads scan the versioned tables and accept a row version only if the transaction that created it either committed before the reader's snapshot was taken (its commit_id is lower than the reader's snapshot_id) or is the reading transaction itself.
On commit, the transaction is assigned a commit_id (also monotonic, drawn from the same sequence as snapshot_ids in newer versions). Updates and deletes leave version chains in the row data; the chain head holds the latest committed version, and older versions sit behind it until cleanup determines no live snapshot needs them.
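The visibility rule above can be sketched as a single comparison. This is a minimal illustration, not DuckDB's code: `stamp_t`, `kPrivateIdStart`, `ReaderView`, `MakeReader`, and `IsVisible` are hypothetical names, and the sketch assumes that uncommitted versions carry a private transaction ID taken from a range above every commit ID, and that "committed before the snapshot" means a strictly lower stamp.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model; these are not DuckDB's actual declarations.
using stamp_t = std::uint64_t;

// Assumption: an uncommitted version carries its writer's private transaction
// ID, drawn from a range far above any commit ID, so it can never look
// "committed before my snapshot" to another reader.
constexpr stamp_t kPrivateIdStart = stamp_t(1) << 62;

struct ReaderView {
    stamp_t snapshot_id;    // assigned when the transaction begins
    stamp_t transaction_id; // private ID, >= kPrivateIdStart
};

inline ReaderView MakeReader(stamp_t snapshot_id, stamp_t transaction_id) {
    assert(snapshot_id < kPrivateIdStart && transaction_id >= kPrivateIdStart);
    return ReaderView{snapshot_id, transaction_id};
}

// A version is accepted if it committed before the reader's snapshot,
// or if the reader itself wrote it.
inline bool IsVisible(const ReaderView &reader, stamp_t version_stamp) {
    return version_stamp < reader.snapshot_id || version_stamp == reader.transaction_id;
}
```

Under these assumptions, a reader with snapshot 10 accepts a version stamped 5, rejects one stamped 12 (committed later), and rejects any version still stamped with another transaction's private ID.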
Undo buffer
Every uncommitted change records an undo entry:
| Entry kind | Source | Effect on rollback |
|---|---|---|
| CATALOG_ENTRY | Catalog::CreateEntry etc. | Drop the new entry, restore the old. |
| INSERT_TUPLE | DataTable::Append | Mark the row tombstoned. |
| UPDATE_TUPLE | DataTable::Update | Restore the previous version chain head. |
| DELETE_TUPLE | DataTable::Delete | Restore the row in the version chain. |
UndoBuffer allocates its entries from an arena (undo_buffer_allocator.cpp), so an abort pays no per-entry allocator overhead.
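To make the rollback column concrete, here is a toy undo log over an in-memory map. It is a sketch under loud assumptions: `ToyUndoKind`, `ToyUndoEntry`, `ToyTable`, and `ToyUndoBuffer` are hypothetical, writes are applied in place rather than through version chains, and an insert is undone by erasing the row instead of tombstoning it.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy types; names echo the entry kinds above but are not DuckDB's.
enum class ToyUndoKind { INSERT_TUPLE, UPDATE_TUPLE, DELETE_TUPLE };

struct ToyUndoEntry {
    ToyUndoKind kind;
    int row_id;
    std::string old_value; // previous value; unused for INSERT_TUPLE
};

struct ToyTable {
    std::map<int, std::string> rows;
};

class ToyUndoBuffer {
public:
    void Insert(ToyTable &table, int row_id, std::string value) {
        assert(table.rows.count(row_id) == 0);
        entries_.push_back(ToyUndoEntry{ToyUndoKind::INSERT_TUPLE, row_id, std::string()});
        table.rows[row_id] = std::move(value);
    }
    void Update(ToyTable &table, int row_id, std::string value) {
        entries_.push_back(ToyUndoEntry{ToyUndoKind::UPDATE_TUPLE, row_id, table.rows.at(row_id)});
        table.rows[row_id] = std::move(value);
    }
    void Delete(ToyTable &table, int row_id) {
        entries_.push_back(ToyUndoEntry{ToyUndoKind::DELETE_TUPLE, row_id, table.rows.at(row_id)});
        table.rows.erase(row_id);
    }
    // Revert newest-first so stacked changes to one row unwind correctly.
    void Rollback(ToyTable &table) {
        for (auto it = entries_.rbegin(); it != entries_.rend(); ++it) {
            switch (it->kind) {
            case ToyUndoKind::INSERT_TUPLE:
                table.rows.erase(it->row_id); // toy stand-in for tombstoning
                break;
            case ToyUndoKind::UPDATE_TUPLE:
            case ToyUndoKind::DELETE_TUPLE:
                table.rows[it->row_id] = it->old_value;
                break;
            }
        }
        entries_.clear();
    }
    std::size_t Size() const { return entries_.size(); }

private:
    std::vector<ToyUndoEntry> entries_;
};
```

Walking the entries in reverse matters when one row is changed more than once: two stacked updates must be unwound newest-first to land on the original value.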
Commit path
CommitState walks the undo buffer and:
- Validates constraints / indexes against the live state.
- Writes WAL records via WALWriteState.
- Installs the new versions in DataTable and the catalog.
- Records the commit_id on each version.
If any step fails, the transaction aborts and RollbackState undoes anything already partially applied.
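The commit steps and the abort path can be sketched as one function that applies the steps in the order listed and reverts on failure. Everything here is hypothetical (`PendingWrite`, `ToyWal`, `CommitSketch`, and the `violates_constraint` flag standing in for a real constraint or index check); it shows the shape of the control flow only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <exception>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for undo entries and the write-ahead log.
struct PendingWrite {
    std::string description;  // what the WAL record would describe
    bool violates_constraint; // stand-in for a failed constraint/index check
    std::uint64_t stamp;      // writer's private transaction ID until commit
};

struct ToyWal {
    std::vector<std::string> records;
};

// Returns true if the transaction committed, false if it had to abort.
inline bool CommitSketch(std::vector<PendingWrite> &undo, std::uint64_t transaction_id,
                         std::uint64_t commit_id, ToyWal &wal) {
    const std::size_t wal_mark = wal.records.size();
    std::size_t stamped = 0;
    try {
        // 1. Validate every pending write before touching shared state.
        for (const auto &write : undo) {
            if (write.violates_constraint) {
                throw std::runtime_error("constraint violation: " + write.description);
            }
        }
        // 2. Emit one WAL record per undo entry.
        for (const auto &write : undo) {
            wal.records.push_back(write.description);
        }
        // 3 and 4. Install the versions by stamping them with the commit ID.
        for (auto &write : undo) {
            write.stamp = commit_id;
            ++stamped;
        }
        return true;
    } catch (const std::exception &) {
        // Abort: drop WAL records written so far and restore private stamps.
        wal.records.resize(wal_mark);
        for (std::size_t i = 0; i < stamped; ++i) {
            undo[i].stamp = transaction_id;
        }
        return false;
    }
}
```

In this toy, validation runs first, so a failed check leaves nothing to revert; the catch block is kept to show where partially applied work would be undone if a later step threw.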
Cleanup
CleanupState is called periodically by DuckTransactionManager to garbage-collect committed versions that no live snapshot can see. It walks version chains and prunes obsolete versions, returning blocks to the buffer manager.
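One way to picture the pruning decision: find the oldest live snapshot, then cut each version chain just behind the newest version that snapshot can see. The sketch below assumes chains are stored newest-first and that a version is visible when its commit ID is strictly lower than the snapshot ID; `Version`, `VersionChain`, `LowestActiveSnapshot`, and `PruneChain` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical model of one row's committed versions, newest first.
struct Version {
    std::uint64_t commit_id;
    int value;
};
using VersionChain = std::vector<Version>;

// Oldest snapshot still held by a running transaction; "infinity" if none.
inline std::uint64_t LowestActiveSnapshot(const std::vector<std::uint64_t> &active) {
    if (active.empty()) {
        return std::numeric_limits<std::uint64_t>::max();
    }
    return *std::min_element(active.begin(), active.end());
}

// Drops versions that no live snapshot can reach; returns how many were dropped.
inline std::size_t PruneChain(VersionChain &chain, std::uint64_t lowest_active) {
    assert(std::is_sorted(chain.begin(), chain.end(),
                          [](const Version &a, const Version &b) { return a.commit_id > b.commit_id; }));
    for (std::size_t i = 0; i < chain.size(); ++i) {
        if (chain[i].commit_id < lowest_active) {
            // chain[i] is visible to the oldest live snapshot, hence to all of
            // them, so nothing behind it can ever be read again.
            const std::size_t pruned = chain.size() - (i + 1);
            chain.erase(chain.begin() + static_cast<std::ptrdiff_t>(i + 1), chain.end());
            return pruned;
        }
    }
    return 0; // every version is newer than the oldest snapshot; keep all
}
```

For a chain with commit IDs 30, 20, 10 and live snapshots 25 and 40, only the version committed at 10 is dropped: snapshot 25 still needs the version committed at 20.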
Multi-database (MetaTransaction)
DuckDB's ATTACH statement brings multiple databases (native DuckDB files, or external systems via storage extensions) into one connection. A MetaTransaction coordinates the per-database transactions so that they begin, commit, and roll back together. Each attached database can have its own TransactionManager (e.g., a Postgres-storage extension would run a real Postgres transaction).
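The coordination idea can be sketched as a map from database name to that database's transaction, finished as a group. This is an illustrative model only: `ToyTxnState`, `ToyDbTransaction`, and `ToyMetaTransaction` are hypothetical, starting a per-database transaction on first use is an assumption of the sketch, and failure of one database partway through a commit is not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

enum class ToyTxnState { ACTIVE, COMMITTED, ROLLED_BACK };

// Hypothetical per-database transaction; a real one would wrap storage state.
struct ToyDbTransaction {
    explicit ToyDbTransaction(std::string db)
        : database(std::move(db)), state(ToyTxnState::ACTIVE) {}
    std::string database;
    ToyTxnState state;
};

class ToyMetaTransaction {
public:
    // Returns the transaction for `database`, starting it on first use
    // (an assumption of this sketch).
    ToyDbTransaction &GetTransaction(const std::string &database) {
        auto it = transactions_.find(database);
        if (it == transactions_.end()) {
            it = transactions_.emplace(database, ToyDbTransaction(database)).first;
        }
        return it->second;
    }
    void Commit() { Finish(ToyTxnState::COMMITTED); }
    void Rollback() { Finish(ToyTxnState::ROLLED_BACK); }
    std::size_t Size() const { return transactions_.size(); }

private:
    // Moves every participating transaction to the same final state.
    void Finish(ToyTxnState final_state) {
        for (auto &entry : transactions_) {
            assert(entry.second.state == ToyTxnState::ACTIVE);
            entry.second.state = final_state;
        }
    }
    std::map<std::string, ToyDbTransaction> transactions_;
};
```

A query touching two attached databases would call `GetTransaction` once per database; a single `Commit` or `Rollback` on the meta-transaction then finishes both.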
Isolation level
DuckDB offers snapshot isolation: a transaction sees a consistent snapshot of all data as of its start, and write-write conflicts (two concurrent transactions writing the same row) are detected, so one of the transactions fails rather than silently overwriting the other. Snapshot isolation is the only level available; there is no separate read uncommitted, repeatable read, or serializable mode.
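A write-write conflict check under snapshot isolation can be expressed as: a writer may only modify a row whose latest version it can see. The sketch below is hypothetical (`WriterView`, `HasWriteConflict`, and `CheckWrite` are illustrative names) and says nothing about when DuckDB runs the check; it assumes private transaction IDs sit above all commit IDs.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical view of the writing transaction.
struct WriterView {
    std::uint64_t snapshot_id;    // taken at transaction start
    std::uint64_t transaction_id; // private ID, assumed above all commit IDs
};

// A conflict exists when the row's latest version was written by someone else
// and was not yet committed when the writer took its snapshot.
inline bool HasWriteConflict(const WriterView &writer, std::uint64_t latest_stamp) {
    const bool committed_before_snapshot = latest_stamp < writer.snapshot_id;
    const bool own_write = latest_stamp == writer.transaction_id;
    return !committed_before_snapshot && !own_write;
}

// Fails the write instead of overwriting a version the writer never saw.
inline void CheckWrite(const WriterView &writer, std::uint64_t latest_stamp) {
    if (HasWriteConflict(writer, latest_stamp)) {
        throw std::runtime_error("write-write conflict");
    }
}
```

With snapshot 10 and private ID 1000, a row last stamped 5 or 1000 is writable, while a row stamped 12 (committed after the snapshot) or 1001 (another transaction's uncommitted write) is a conflict.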
Integration points
- Storage: Versioned reads come from DataTable via LocalStorage and the version chains in row groups (src/storage/).
- Catalog: Catalog mutations go through the same undo buffer as data writes (entries of kind CATALOG_ENTRY). See catalog.
- WAL: Commit produces WAL records via wal_write_state.cpp. Replay on startup reapplies them. See storage.
- Client: ClientContext::BeginTransaction, Commit, Rollback route through TransactionContext.
Entry points for modification
- Adding a new undo entry kind: extend UndoFlags, add a Process* method in each of CommitState, RollbackState, CleanupState, and an emitter in the operator that produces the change.
- Hooking a custom transaction into an attached database: implement a storage extension that supplies its own Transaction and TransactionManager subclasses; MetaTransaction will coordinate them.
- Tuning when cleanup runs: see DuckTransactionManager::Checkpoint / CleanupState. The trigger condition is in duck_transaction_manager.cpp.
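For the first item, the pattern of "one enum value, one case per visitor" might look like the following. All names are hypothetical (`ToyUndoFlags`, `ToyEntry`, `ToyCommitState`, `WalkEntries`, and the made-up `SEQUENCE_BUMP` kind); the real UndoFlags values and visitor signatures should be taken from the source files.

```cpp
#include <cassert>
#include <vector>

// Hypothetical entry kinds; SEQUENCE_BUMP plays the role of the new kind.
enum class ToyUndoFlags { CATALOG_ENTRY, INSERT_TUPLE, UPDATE_TUPLE, DELETE_TUPLE, SEQUENCE_BUMP };

struct ToyEntry {
    ToyUndoFlags type;
    int payload;
};

// Stand-in for one of the visitor states; each state gets its own case.
struct ToyCommitState {
    int entries_committed = 0;
    int sequence_total = 0;

    void ProcessEntry(const ToyEntry &entry) {
        // No default label: compilers with -Wswitch can then flag a visitor
        // that was not updated for a newly added enum value.
        switch (entry.type) {
        case ToyUndoFlags::CATALOG_ENTRY:
        case ToyUndoFlags::INSERT_TUPLE:
        case ToyUndoFlags::UPDATE_TUPLE:
        case ToyUndoFlags::DELETE_TUPLE:
            break; // existing kinds: handling elided in this sketch
        case ToyUndoFlags::SEQUENCE_BUMP:
            sequence_total += entry.payload; // behaviour for the new kind
            break;
        }
        ++entries_committed;
    }
};

// Generic walk: any state exposing ProcessEntry can visit the entries.
template <class State>
void WalkEntries(const std::vector<ToyEntry> &entries, State &state) {
    for (const auto &entry : entries) {
        state.ProcessEntry(entry);
    }
}
```

A rollback or cleanup state would follow the same shape with its own case for the new kind, which is why the item above lists all three visitors.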
Key source files
| File | Purpose |
|---|---|
| src/transaction/duck_transaction_manager.cpp | Snapshot/commit ID issuance, active-transaction tracking. |
| src/transaction/duck_transaction.cpp | Per-transaction state and read-version checks. |
| src/transaction/undo_buffer.cpp | The undo log. |
| src/transaction/commit_state.cpp | Commit application. |
| src/transaction/rollback_state.cpp | Rollback application. |
| src/transaction/cleanup_state.cpp | Garbage collection. |
| src/transaction/wal_write_state.cpp | WAL emission during commit. |
| src/transaction/meta_transaction.cpp | Multi-DB coordination. |
Continue to storage for the layer that the transaction system reads/writes against, or catalog for how DDL is versioned.
Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.