ggml-org/llama.cpp

Dependencies

llama.cpp is intentionally lean. The CONTRIBUTING guide tells contributors to "avoid adding third-party dependencies." Most "dependencies" are vendored single-header libraries under `vendor/`. The rest are system-level packages, and apart from the build toolchain itself (CMake and a C++17 compiler) they are optional, most of them enabled through a CMake flag.

System (build-time, optional)

| Dependency | Required when | Notes |
| --- | --- | --- |
| CMake ≥ 3.14 | Always | Build system |
| C++17 compiler | Always | gcc / clang / MSVC / Apple clang |
| Python ≥ 3.9 | Building Python tooling | `convert_*.py`, `gguf-py` |
| CUDA Toolkit | `-DGGML_CUDA=ON` | NVIDIA backend |
| HIP / ROCm | `-DGGML_HIP=ON` | AMD backend |
| Metal SDK | macOS by default | Apple backend |
| Vulkan SDK | `-DGGML_VULKAN=ON` | `glslc` for shader compilation |
| Intel oneAPI / DPC++ | `-DGGML_SYCL=ON` | SYCL backend |
| MUSA SDK | `-DGGML_MUSA=ON` | Moore Threads |
| OpenCL | `-DGGML_OPENCL=ON` | Adreno path |
| Qualcomm Hexagon SDK | `-DGGML_HEXAGON=ON` | DSP path |
| Huawei CANN | `-DGGML_CANN=ON` | Ascend NPU |
| OpenVINO Runtime | `-DGGML_OPENVINO=ON` | Intel CPU/iGPU/NPU |
| WebGPU runtime (Dawn / wgpu-native) | `-DGGML_WEBGPU=ON` | WebGPU |
| OpenMP | `-DGGML_OPENMP=ON` | Optional CPU threading |
| BLAS (OpenBLAS / MKL / Accelerate) | `-DGGML_BLAS=ON` | Prompt-processing matmul |
| libcurl | `-DLLAMA_CURL=ON` | `-hf <repo>` downloads |
| OpenSSL | `-DLLAMA_SERVER_SSL=ON` | HTTPS server |
| zDNN library | `-DGGML_ZDNN=ON` | IBM Z |
| ZenDNN library | `-DGGML_ZENDNN=ON` | AMD Zen CPU |

Only CMake and a C++17 compiler are needed for a fully working CPU build; every other row is optional.
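The flags in the table compose on a single CMake configure line. The following is a minimal sketch of that composition, assuming a hypothetical helper named `configure_command` (it is not part of llama.cpp) and a build directory called `build`; the option names are taken from the table.

```python
import shlex


def configure_command(options, build_dir="build", build_type="Release"):
    """Return the argv list for a CMake configure step.

    `options` is a list of CMake option names (for example "GGML_CUDA")
    that should be switched ON. This helper is illustrative only.
    """
    cmd = ["cmake", "-B", build_dir, f"-DCMAKE_BUILD_TYPE={build_type}"]
    cmd += [f"-D{name}=ON" for name in options]
    return cmd


if __name__ == "__main__":
    # Example: NVIDIA backend plus libcurl support for `-hf <repo>` downloads.
    print(shlex.join(configure_command(["GGML_CUDA", "LLAMA_CURL"])))
```

The returned list can be passed to `subprocess.run` from the repository root. An empty option list corresponds to the plain CPU build.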

Vendored (`vendor/`)

These ship in-tree as single-header or minimal-source libraries. None require external installation.

| Library | Purpose |
| --- | --- |
| `vendor/nlohmann/json.hpp` | JSON parsing/printing for the server, autoparser, and tooling |
| `vendor/cpp-httplib/httplib.h` | Single-header HTTP server used by `tools/server` and `tools/rpc` |
| `vendor/minja/minja.hpp` | Jinja2-compatible template engine used by `common/chat.cpp` |
| `vendor/stb/stb_image.h` | Image loading for `tools/mtmd` |
| `vendor/miniaudio/miniaudio.h` | Audio I/O for `tools/tts` and `tools/mtmd-audio` |
| `vendor/llguidance/` | Optional Rust grammar engine (build-gated by `LLAMA_LLGUIDANCE`) |
| `vendor/cpp-jsonschema/` | JSON Schema validation helpers |

Third-party licenses are recorded in licenses/.

Python (`requirements*.txt`, `pyproject.toml`)

Python tooling lives in `gguf-py/` and the top-level `convert_*.py` scripts. Requirements are split into focused files under `requirements/` and aggregated by `requirements.txt` at the repo root. `gguf-py/pyproject.toml` is managed with Poetry.

Top-level packages (subset):

numpy — tensor math during conversion.
torch — used by some conversion paths to load HF checkpoints.
safetensors — read modern HF checkpoints.
sentencepiece, tokenizers — reference tokenizers used during conversion / vocab generation.
protobuf — sentencepiece model files.
huggingface_hub — used by some conversion helpers.
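Before running a conversion script, it can help to check that the interpreter and the packages listed above are available. This is a rough sketch using only the standard library; `missing_modules` and `python_ok` are hypothetical names, and the import names are assumptions about how each package is imported (`protobuf` is imported as `google.protobuf`).

```python
import importlib.util
import sys

# Import names for the packages listed above (assumed, not read from
# the requirements files).
CONVERSION_MODULES = [
    "numpy", "torch", "safetensors", "sentencepiece",
    "tokenizers", "google.protobuf", "huggingface_hub",
]


def missing_modules(names):
    """Return the subset of `names` that cannot be located for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. `google`) is absent
            # or a loaded module has no spec.
            found = False
        if not found:
            missing.append(name)
    return missing


def python_ok(version_info=sys.version_info, minimum=(3, 9)):
    """True if the interpreter meets the Python >= 3.9 requirement."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    print("python ok:", python_ok())
    print("missing:", missing_modules(CONVERSION_MODULES))
```

Not every conversion path needs every package, so a non-empty result is a hint rather than an error.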

For development:

pytest (server tests, gguf-py tests).
flake8, mypy, pyright, ty — linters/type checkers.
pre-commit — local hook runner.

Build artifacts

A standard Release build produces (with default flags):

build/bin/llama-cli
build/bin/llama-server
build/bin/llama-quantize
build/bin/llama-bench
build/bin/llama-imatrix
build/bin/llama-perplexity
build/bin/llama-tokenize
build/bin/llama-mtmd-cli
build/bin/llama-gguf-split
build/bin/test-backend-ops (with -DLLAMA_BUILD_TESTS=ON)
... plus example binaries and any backend plugins built as shared libraries
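To confirm that a build produced the binaries listed above, a short check along these lines may be useful. It is a sketch only: `missing_binaries` is a hypothetical helper, the directory `build/bin` is taken from the list, and accepting an `.exe` suffix for Windows builds is an assumption.

```python
from pathlib import Path

# Binary names taken from the artifact list above.
EXPECTED_BINARIES = [
    "llama-cli", "llama-server", "llama-quantize", "llama-bench",
    "llama-imatrix", "llama-perplexity", "llama-tokenize",
    "llama-mtmd-cli", "llama-gguf-split",
]


def missing_binaries(bin_dir, names=EXPECTED_BINARIES):
    """Return the names in `names` that have no matching file in `bin_dir`."""
    bin_dir = Path(bin_dir)
    missing = []
    for name in names:
        # Accept the bare name or the same name with an .exe suffix.
        candidates = (bin_dir / name, bin_dir / (name + ".exe"))
        if not any(path.is_file() for path in candidates):
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("missing:", missing_binaries("build/bin"))
```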

The CMake `install` target (for example `cmake --build build --target install`) installs `libllama`, `libggml`, headers, and the package config under `${CMAKE_INSTALL_PREFIX}`.

Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.
