ggml-org/llama.cpp
Tooling
Active contributors: Georgi Gerganov, Daniel Bevenius
The build, lint, and codegen tools shipped with llama.cpp.
Build
- CMake is the only supported build system. Top-level CMakeLists.txt, with sub-projects in ggml/CMakeLists.txt, src/CMakeLists.txt, common/CMakeLists.txt, tools/<tool>/CMakeLists.txt, tests/CMakeLists.txt, and examples/<example>/CMakeLists.txt.
- CMake presets in CMakePresets.json define canned configurations (Debug/Release × backend combinations).
- Makefile at the repo root is a convenience wrapper that calls into CMake. The legacy hand-rolled Makefile from early in the project is gone.
- cmake/ holds the project's installable CMake config (llama-config.cmake.in, llama.pc.in) for downstream consumers using find_package(llama).
- flake.nix provides Nix users a reproducible build env.
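A typical configure-and-build cycle with the CMake layout above can be sketched as follows. This is a minimal sketch, not a script from the repository: the `build_llama` helper name, the `build` directory, and the Release default are illustrative assumptions. `cmake --list-presets` is the standard CMake command for listing what CMakePresets.json defines.

```shell
#!/usr/bin/env bash
# Sketch: configure and build llama.cpp out-of-tree with CMake.
# build_llama is a hypothetical helper name, not something shipped in the repo.
build_llama() {
    local src_dir="${1:-.}"          # checkout containing the top-level CMakeLists.txt
    local build_dir="${2:-build}"    # out-of-tree build directory (illustrative default)
    local build_type="${3:-Release}" # Debug or Release

    # Configure: generate the build tree.
    cmake -S "$src_dir" -B "$build_dir" -DCMAKE_BUILD_TYPE="$build_type" || return 1

    # Build: --config only matters for multi-config generators such as Visual Studio.
    cmake --build "$build_dir" --config "$build_type"
}

# Usage (from the repository root):
#   cmake --list-presets          # show the presets defined in CMakePresets.json
#   build_llama . build Release
```

Passing both `-DCMAKE_BUILD_TYPE` and `--config` keeps the same call working for single-config and multi-config generators.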
Linting and formatting
| Tool | Config | What it checks |
|---|---|---|
| clang-format | .clang-format | C/C++ style; brackets, alignment, indentation |
| clang-tidy | .clang-tidy | Static analysis subset that maintainers care about |
| editorconfig-checker | .editorconfig, .ecrc | Whitespace/trailing rules |
| flake8 | .flake8 | Python style (mostly conversion scripts) |
| mypy | mypy.ini | Python types in gguf-py/ |
| pyright | pyrightconfig.json | Stricter Python type checks |
| ty | ty.toml | Type checker config (used in CI) |
| pre-commit | .pre-commit-config.yaml | Local + CI runner that wires the above into git hooks |
Install once:
pip install pre-commit
pre-commit install
pre-commit run --all-files
Code generation
- convert_hf_to_gguf_update.py: regenerates lookup tables (mostly tokenizer pre-tokenizer hashes) used by convert_hf_to_gguf.py. Run it when adding a new tokenizer.
- examples/gen-docs/: generates docs/ops.md from the live ggml_op enum so the op coverage table stays accurate.
- scripts/gen-*: assorted small generators (gen-authors.sh, gen-build-info.sh, etc.).
- scripts/sync-ggml*: bidirectional sync with the standalone ggml-org/ggml repository.
- common/build-info.cpp.in: configured at build time to embed the git commit and build flags.
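To illustrate what "configured at build time" means for a `.in` template, the sketch below fills `@PLACEHOLDER@` tokens with `sed`. It only illustrates the idea: in the real build the substitution is done by the build system, and the placeholder names (`@BUILD_COMMIT@`, `@BUILD_NUMBER@`) and the `render_build_info` helper are hypothetical, not taken from common/build-info.cpp.in.

```shell
#!/usr/bin/env bash
# Sketch: fill @PLACEHOLDER@ tokens in a template, the way a *.in file gets configured.
# Placeholder names and the render_build_info helper are hypothetical.
render_build_info() {
    local commit="$1"   # e.g. the output of: git rev-parse --short HEAD
    local number="$2"   # e.g. the output of: git rev-list --count HEAD
    # Reads the template on stdin and writes the configured source to stdout.
    sed -e "s/@BUILD_COMMIT@/${commit}/g" \
        -e "s/@BUILD_NUMBER@/${number}/g"
}

# Usage: configure a two-line template with example values.
printf '%s\n' 'int build_number = @BUILD_NUMBER@;' \
              'const char * build_commit = "@BUILD_COMMIT@";' \
    | render_build_info "abc1234" "4242"
# prints:
#   int build_number = 4242;
#   const char * build_commit = "abc1234";
```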
Web UI build
tools/server/webui/ is a separate JavaScript project (npm-based) bundled into the llama-server binary at build time. See its own package.json and the relevant CI workflow for how it's tested. The maintainers responsible for the WebUI are listed under ggml-org/llama-webui in CODEOWNERS.
CI
- .github/workflows/build.yml: matrix build on Linux/macOS/Windows × multiple backends.
- .github/workflows/server.yml: server-specific test pipeline.
- .github/workflows/release.yml: produces the binaries attached to GitHub releases.
- .github/workflows/python-*: runs flake8, mypy, pyright on the Python code.
- .github/workflows/docker.yml: builds and pushes the images defined under .devops/.
- ci/run.sh: entry point used by the self-hosted ggml-ci runners (long-form, multi-backend, multi-GPU).
The .github/actions/ directory contains reusable composite actions used by the workflows.
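The ggml-ci entry point can also be run on a local machine. The sketch below assumes ci/run.sh takes an output directory and a mount/cache directory as its two positional arguments; check ci/README.md before relying on that. The `run_local_ci` helper and the `./tmp/...` paths are illustrative, not part of the repository.

```shell
#!/usr/bin/env bash
# Sketch: run the ggml-ci entry point locally.
# Assumption: ci/run.sh expects <output-dir> <mnt-dir>; run_local_ci is a hypothetical helper.
run_local_ci() {
    local out_dir="${1:-./tmp/results}"  # where logs and results are written
    local mnt_dir="${2:-./tmp/mnt}"      # scratch area for downloaded models and caches

    mkdir -p "$out_dir" "$mnt_dir" || return 1
    bash ./ci/run.sh "$out_dir" "$mnt_dir"
}

# Usage (from the repository root):
#   run_local_ci ./tmp/results ./tmp/mnt
```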
Profiling helpers
- examples/eval-callback/: registers a callback that runs after every tensor eval.
- examples/llama-bench/ (now tools/llama-bench): throughput benchmarking.
- examples/gguf-hash/: verify a GGUF file's tensor data hasn't changed.
- pocs/vdot/: proof-of-concept dot-product microbenchmarks.
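As a rough illustration of the throughput benchmark, the sketch below wraps a llama-bench invocation. The binary location (`build/bin/llama-bench`), the `bench_model` helper, and the model path are assumptions for illustration; `-m`, `-p`, `-n`, and `-o` are the tool's model, prompt-length, generation-length, and output-format options.

```shell
#!/usr/bin/env bash
# Sketch: run a throughput benchmark with llama-bench.
# The default binary path and the bench_model helper are assumptions; adjust to your build tree.
LLAMA_BENCH="${LLAMA_BENCH:-build/bin/llama-bench}"

bench_model() {
    local model="$1"                 # path to a GGUF model file
    local prompt_tokens="${2:-512}"  # prompt-processing test length
    local gen_tokens="${3:-128}"     # text-generation test length
    # -o md prints the results as a markdown table.
    "$LLAMA_BENCH" -m "$model" -p "$prompt_tokens" -n "$gen_tokens" -o md
}

# Usage:
#   bench_model models/my-model.gguf 512 128
```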
Docker
.devops/ holds Dockerfiles, one per backend variant: cpu.Dockerfile, cuda.Dockerfile, vulkan.Dockerfile, intel.Dockerfile, rocm.Dockerfile, musa.Dockerfile, plus a Nix-based image. See docs/docker.md for usage.
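Building one of these images locally might look like the sketch below. It assumes the per-backend file naming listed above (`.devops/<variant>.Dockerfile`); the `build_image` helper and the `local/llama.cpp:<variant>` tag are illustrative choices, and docs/docker.md remains the reference for the supported targets and tags.

```shell
#!/usr/bin/env bash
# Sketch: build a llama.cpp image from one of the Dockerfiles under .devops/.
# build_image and the default tag are hypothetical; run from the repository root.
build_image() {
    local variant="${1:-cpu}"                     # cpu, cuda, vulkan, intel, rocm, musa
    local tag="${2:-local/llama.cpp:${variant}}"  # illustrative image tag

    # The trailing "." makes the repository root the build context.
    docker build -t "$tag" -f ".devops/${variant}.Dockerfile" .
}

# Usage:
#   build_image cuda
#   build_image cpu my-registry/llama.cpp:cpu
```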
Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.