ggml-org/llama.cpp
Development workflow
Active contributors: Georgi Gerganov, Johannes Gäßler, Sigbjørn Skjæret
This page distills the day-to-day flow of "I want to land a change in llama.cpp."
Branch
llama.cpp uses a single master branch. Fork the repo, create a feature branch, and push. There is no long-lived release branch — releases are tags cut from master (see .github/workflows/release.yml).
git clone https://github.com/<you>/llama.cpp
cd llama.cpp
git checkout -b feat/whatever
Build
CMake is the only supported build path. The Makefile in the root no longer builds the project; it only stops with a message that points to the CMake build instructions.
cmake -B build -DLLAMA_BUILD_TESTS=ON
cmake --build build --config Release -j
Backend-specific flags and presets live in ggml/CMakeLists.txt and CMakePresets.json. See Getting started for the common flags.
common/build-info.cpp.in is configured at build time to embed git commit info into binaries. If you regenerate the build directory, that file is refreshed.
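If you switch between debug and release builds often, separate build directories avoid reconfiguring each time. The following is a minimal sketch, not something shipped with the repository: configure_and_build is a hypothetical helper name, and the build-debug directory in the usage note is an assumption. It only wraps the two cmake commands shown above.

```shell
# configure_and_build is a hypothetical helper, not part of the repository.
# Usage: configure_and_build [build_type] [build_dir]
configure_and_build() (
    build_type="${1:-Release}"
    build_dir="${2:-build}"
    # CMAKE_BUILD_TYPE is used by single-config generators (Makefiles, Ninja);
    # --config is used by multi-config generators (Visual Studio, Xcode).
    cmake -B "$build_dir" -DCMAKE_BUILD_TYPE="$build_type" -DLLAMA_BUILD_TESTS=ON &&
    cmake --build "$build_dir" --config "$build_type" -j
)
```

Calling configure_and_build Debug build-debug would configure and build a debug tree in build-debug and leave the release tree in build untouched.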
Iterate
Most subsystems can be exercised with one of the in-tree tools:
| Subsystem you changed | Quick smoke test |
|---|---|
| Tokenizer (src/llama-vocab.cpp) | ./build/bin/llama-tokenize -m model.gguf -p "your text" |
| Sampler (src/llama-sampler.cpp) | ./build/bin/llama-cli -m model.gguf -p "..." --top-k 40 --temp 0.7 ... |
| Grammar (src/llama-grammar.cpp) | ./build/bin/llama-cli ... --grammar-file grammars/json.gbnf |
| Chat template (src/llama-chat.cpp or common/chat.cpp) | ./build/bin/llama-cli -cnv -m model.gguf |
| GGUF reader/writer (ggml/src/gguf.cpp) | ./build/bin/llama-gguf-split, ./build/bin/llama-gguf |
| Quantization (src/llama-quant.cpp) | ./build/bin/llama-quantize in.gguf out.gguf Q4_K_M |
| Server endpoints (tools/server/) | ./build/bin/llama-server -m model.gguf then hit :8080 |
| Backend op (ggml/src/ggml-*/) | ./build/bin/test-backend-ops |
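The table above can also be read as a lookup from changed path to smoke-test tool. The sketch below encodes that mapping; smoke_tool_for and smoke_tools are hypothetical helper names, not scripts from the repository, and the mapping covers only the paths listed in the table.

```shell
# Hypothetical helpers; the path-to-tool mapping mirrors the table above.

# Print the smoke-test binary for one changed path, or fail if none is listed.
smoke_tool_for() {
    case "$1" in
        src/llama-vocab.cpp)   echo llama-tokenize ;;
        src/llama-sampler.cpp|src/llama-grammar.cpp) echo llama-cli ;;
        src/llama-chat.cpp|common/chat.cpp)          echo llama-cli ;;
        ggml/src/gguf.cpp)     echo llama-gguf ;;
        src/llama-quant.cpp)   echo llama-quantize ;;
        tools/server/*)        echo llama-server ;;
        ggml/src/ggml-*/*)     echo test-backend-ops ;;
        *)                     return 1 ;;
    esac
}

# Read changed paths from stdin and print the distinct tools worth running.
smoke_tools() {
    while IFS= read -r path; do
        smoke_tool_for "$path" || true   # paths without a listed tool are skipped
    done | sort -u
}
```

One way to use it is git diff --name-only master | smoke_tools, which lists the tools relevant to the files changed on your branch. The arguments for each tool still come from the table.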
Test
ctest --test-dir build --output-on-failure
Detailed list in Testing. For long-form CI see ci/README.md.
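While iterating, running the full suite is often more than you need. ctest's standard -R option selects tests by name regex. The wrapper below is a sketch under that assumption; run_tests is a hypothetical helper name, and the pattern in the usage note assumes a registered test whose name contains backend-ops.

```shell
# run_tests is a hypothetical wrapper around the ctest command shown above.
# Usage: run_tests [name_regex] [build_dir]
run_tests() (
    pattern="${1:-}"
    build_dir="${2:-build}"
    if [ -n "$pattern" ]; then
        # -R restricts the run to tests whose names match the regex.
        ctest --test-dir "$build_dir" --output-on-failure -R "$pattern"
    else
        ctest --test-dir "$build_dir" --output-on-failure
    fi
)
```

For example, run_tests backend-ops would run only the matching tests, and run_tests with no arguments behaves like the plain command.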
Format
The repo enforces basic style with .clang-format, .editorconfig, and pre-commit. Install the hooks once:
pip install pre-commit
pre-commit install
pre-commit run --all-files reproduces the CI check.
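Running every hook on every file can be slow on a large tree. pre-commit can limit a run to the files changed between two refs with --from-ref and --to-ref. The helper below is a sketch: lint_changed is a hypothetical name, and using master as the default base ref is an assumption about your local setup.

```shell
# lint_changed is a hypothetical helper, not part of the repository.
# Usage: lint_changed [base_ref]
lint_changed() (
    base_ref="${1:-master}"
    # Run the configured hooks only on files that differ between base_ref and HEAD.
    pre-commit run --from-ref "$base_ref" --to-ref HEAD
)
```

This is a faster local check only; the full pre-commit run --all-files remains the closer match to CI.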
Commit and push
Squash logically. The maintainers will squash again on merge. Use the commit format described in CONTRIBUTING.md:
<module> : <short title> (#<issue_number>)
Examples from real history: common : check for null getpwuid in hf-cache (#22550), ggml : fix bug in some quants (#xxxx).
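A title can be checked against this format before pushing. The function below is a minimal sketch, not a check the repository is known to run: check_commit_title is a hypothetical name, and the regex assumes the module contains no spaces or colons and the number is all digits.

```shell
# check_commit_title is a hypothetical helper, not part of the repository.
# Succeeds for "<module> : <short title> (#<number>)" and fails otherwise.
check_commit_title() {
    # module (no spaces or colons), " : ", a non-empty title, then " (#digits)".
    printf '%s\n' "$1" | grep -Eq '^[^ :]+ : .+ \(#[0-9]+\)$'
}
```

With this pattern, the first example above passes, while a title with a placeholder such as (#xxxx) or with no space before the colon is rejected.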
Pull request
- Open against master.
- Fill in .github/pull_request_template.md.
- Search existing PRs and issues first.
- Limit yourself to one PR at a time if you are a new contributor.
- Allow maintainers to push to your branch when reasonable — it speeds up review.
After merge
- Watch for follow-up issues mentioning your change.
- If the change touched a CODEOWNERS path, expect to be looped in on related future PRs.
- If you maintain a model architecture you added, consider adding yourself to CODEOWNERS.