ggml-org/llama.cpp

By the numbers

Data collected on 2026-04-30 from master at commit beb42fffa.

Size

Approximate non-blank source line counts, computed by wc -l on tracked files (excluding .git/ and vendor/). The codebase is overwhelmingly C++ on top of plain C in ggml/, with a substantial Python conversion stack.

xychart-beta horizontal
    title "Lines of code by language"
    x-axis ["C++ (.cpp)", "Python (.py)", "C (.c)", "C/C++ headers (.h)", "CUDA (.cu)", "C++ headers (.hpp)", "Metal (.metal)"]
    y-axis "Lines" 0 --> 350000
    bar [319369, 53402, 52509, 49937, 21365, 16217, 10627]

| Language | Lines |
| --- | --- |
| C++ (.cpp) | 319,369 |
| Python (.py) | 53,402 |
| C (.c) | 52,509 |
| C/C++ headers (.h) | 49,937 |
| CUDA (.cu) | 21,365 |
| C++ headers (.hpp) | 16,217 |
| Metal (.metal) | 10,627 |
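
A per-extension tally like the one above could be reproduced along these lines. This is a minimal sketch, not the script used to generate this page: the function names are hypothetical, tracked files are enumerated with `git ls-files -z`, and the `skip_blank` switch is there because plain `wc -l` counts every newline, blank lines included.

```python
import subprocess
from collections import Counter
from pathlib import Path, PurePosixPath


def count_lines(data: bytes, skip_blank: bool = False) -> int:
    """Count lines in a file's contents.

    With skip_blank=False this matches `wc -l` (number of newline bytes).
    With skip_blank=True only lines containing non-whitespace are counted.
    """
    if not skip_blank:
        return data.count(b"\n")
    return sum(1 for line in data.splitlines() if line.strip())


def tally_by_extension(files, exclude=("vendor/",), skip_blank=False) -> Counter:
    """Sum line counts per file extension, skipping excluded path prefixes.

    `files` maps a repository-relative path to that file's bytes.
    """
    totals = Counter()
    for path, data in files.items():
        if path.startswith(tuple(exclude)):
            continue
        ext = PurePosixPath(path).suffix or "(none)"
        totals[ext] += count_lines(data, skip_blank)
    return totals


def read_tracked_files(repo: str) -> dict:
    """Load every tracked regular file in a checkout (hypothetical helper)."""
    out = subprocess.run(
        ["git", "-C", repo, "ls-files", "-z"],
        check=True, capture_output=True,
    ).stdout
    paths = [p.decode() for p in out.split(b"\0") if p]
    root = Path(repo)
    return {p: (root / p).read_bytes() for p in paths if (root / p).is_file()}
```

Calling `tally_by_extension(read_tracked_files("llama.cpp")).most_common(7)` on a local checkout would give the seven largest extensions; the exact numbers depend on the commit and on which counting convention is chosen.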

The five largest individual source files give a sense of where complexity concentrates.

| File | Bytes |
| --- | --- |
| convert_hf_to_gguf.py | 651,450 |
| src/llama-model.cpp | 546,186 |
| ggml/src/ggml.c | 247,981 |
| ggml/src/ggml-quants.c | 222,579 |
| common/arg.cpp | 188,905 |
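
One way to obtain a ranking like this is to parse `git ls-tree -r -l`, which prints each blob's size in bytes. The sketch below is illustrative only: the helper names are hypothetical, it applies no path filtering, and it is not necessarily how the figures above were gathered.

```python
import subprocess


def parse_ls_tree(output: str) -> list:
    """Parse `git ls-tree -r -l` output into (path, size_in_bytes) pairs."""
    entries = []
    for line in output.splitlines():
        meta, _, path = line.partition("\t")
        fields = meta.split()
        # Expected fields: mode, type, object id, size ("-" for submodules).
        if len(fields) == 4 and fields[1] == "blob" and fields[3].isdigit():
            entries.append((path, int(fields[3])))
    return entries


def largest_files(entries, n: int = 5) -> list:
    """Return the n largest (path, size) pairs, biggest first."""
    return sorted(entries, key=lambda e: e[1], reverse=True)[:n]


def ls_tree(repo: str, rev: str = "HEAD") -> str:
    """Run `git ls-tree -r -l <rev>` in a checkout (hypothetical helper)."""
    return subprocess.run(
        ["git", "-C", repo, "ls-tree", "-r", "-l", rev],
        check=True, capture_output=True, text=True,
    ).stdout
```

With a local checkout, `largest_files(parse_ls_tree(ls_tree("llama.cpp")))` would list the five largest tracked blobs at `HEAD`.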

Activity

The repository has moved at a steady, high pace since its first commit on 2023-03-10 ("Initial release"). A snapshot of selected metrics:

| Metric | Value |
| --- | --- |
| Total commits on master | 8,991 |
| Unique authors (all-time) | 1,600 |
| Commits in the last 90 days | 1,096 |
| Daily commit count, last 30 days | typically 8–19 |
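
As a rough cross-check, 1,096 commits over 90 days averages about 12 per day, which sits inside the 8–19 daily range. The sketch below shows how such figures might be derived from the output of `git log --pretty=%ad --date=short`; the function names are hypothetical, and days with no commits do not appear in the resulting counter.

```python
from collections import Counter


def daily_commit_counts(log_output: str) -> Counter:
    """Count commits per day from one YYYY-MM-DD date per line."""
    return Counter(
        line.strip() for line in log_output.splitlines() if line.strip()
    )


def mean_commits_per_day(total_commits: int, days: int) -> float:
    """Average commits per calendar day over a window."""
    return total_commits / days


# Figures taken from the table above.
print(round(mean_commits_per_day(1096, 90), 1))
```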

Top commit-count contributors (all-time, derived from git log --pretty=%an):

| Author | Commits |
| --- | --- |
| Georgi Gerganov | 1,731 |
| Johannes Gäßler | 370 |
| Xuan-Son Nguyen | 302 |
| Jeff Bolz | 270 |
| Sigbjørn Skjæret | 266 |
| Daniel Bevenius | 253 |
| slaren | 214 |
| Diego Devesa | 141 |

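These are raw commit counts keyed on the author-name string, so a contributor who has committed under more than one name appears in more than one row. They imply no judgment about who is "best". For ownership, see Maintainers.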
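
A tally of this kind can be sketched as a counter over the lines printed by `git log --pretty=%an`. The function name and the optional `aliases` mapping below are hypothetical; the mapping only illustrates how differently spelled names could be merged before counting, and the names in the usage note are placeholders.

```python
from collections import Counter


def top_authors(log_output: str, n: int = 8, aliases=None) -> list:
    """Rank authors by commit count from one author name per line.

    `aliases` optionally maps an alternate spelling to a canonical name.
    """
    aliases = aliases or {}
    names = (line.strip() for line in log_output.splitlines())
    counts = Counter(aliases.get(name, name) for name in names if name)
    return counts.most_common(n)
```

For example, `top_authors("Ann\nBob\nann2\nAnn\n", aliases={"ann2": "Ann"})` credits three commits to `Ann` and one to `Bob`. Git's `%aN` format, which applies `.mailmap`, is another way to normalize names at the source.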

Bot-attributed commits

A grep over the last 90 days of git history for [bot] co-author trailers and bot author names finds 0 bot-attributed commits. This is consistent with the project's AI policy, which forbids fully AI-generated submissions and AI-written PR descriptions or commit messages. Inline AI assistance is permitted but leaves no trace in git history, so the count is only a lower bound on AI-assisted work.
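
The check described above might look roughly like the following. The exact patterns used for this page are not stated, so the two regular expressions are assumptions: a literal `[bot]` in the author name, or in a `Co-authored-by:` trailer of the commit message.

```python
import re

# Assumed patterns; the real search may have matched more bot names.
BOT_MARKER = re.compile(r"\[bot\]", re.IGNORECASE)
CO_AUTHOR = re.compile(r"^Co-authored-by:\s*(.+)$", re.IGNORECASE | re.MULTILINE)


def is_bot_attributed(author: str, message: str) -> bool:
    """True if the author name or any co-author trailer carries a [bot] marker."""
    if BOT_MARKER.search(author):
        return True
    return any(BOT_MARKER.search(who) for who in CO_AUTHOR.findall(message))


def count_bot_commits(commits) -> int:
    """Count bot-attributed commits in an iterable of (author, message) pairs."""
    return sum(1 for author, message in commits if is_bot_attributed(author, message))
```

A `[bot]` string that appears only in a commit subject or body, outside a trailer, is deliberately not counted.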

Complexity

A few code-volume signals worth noting (all line counts are from wc -l on tracked files):

| Subsystem | Notable concentration |
| --- | --- |
| src/llama-model.cpp | ~10.7k LOC; a single file containing tensor allocation for every supported text architecture |
| src/models/ | ~70 per-architecture graph builders, one file per LLM family |
| ggml/src/ggml.c | ~22k LOC; core CPU compute kernels and graph machinery |
| ggml/src/ggml-quants.c | ~13k LOC; reference quantization kernels for every ggml_type |
| ggml/src/ggml-cuda/ | Largest backend by file count; per-op kernels in dozens of files |
| tools/server/server-context.cpp | ~5k LOC; scheduler and slot loop driving the HTTP server |
| convert_hf_to_gguf.py | ~16k LOC; a switch over every supported HuggingFace model class |

Test surface

Tests live under tests/. They run via ctest and cover:

  • Tokenizer round-trips for BPE, SPM, WPM, UGM, RWKV
  • GGUF reader/writer
  • Sampling chains and grammar
  • Chat parser, PEG parser, autoparser
  • A backend-ops conformance suite (tests/test-backend-ops.cpp) that compares each backend's kernel implementation against the CPU reference for every ggml op

See Testing for how to run them.
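
The idea behind a backend conformance check can be illustrated with a small sketch: run the same op through a reference implementation and a candidate one, then compare the outputs under an error tolerance. Everything here is hypothetical, including the function names, the use of normalized mean squared error as the metric, and the threshold; it is a conceptual model, not code from tests/test-backend-ops.cpp.

```python
def nmse(reference, candidate) -> float:
    """Normalized mean squared error: sum((c - r)^2) / sum(r^2)."""
    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same length")
    err = sum((c - r) ** 2 for r, c in zip(reference, candidate))
    norm = sum(r * r for r in reference)
    if norm == 0.0:
        # All-zero reference: only an exact match passes.
        return 0.0 if err == 0.0 else float("inf")
    return err / norm


def op_conforms(reference_fn, backend_fn, inputs, max_nmse: float = 1e-7) -> bool:
    """Check that a backend's output stays within tolerance of the reference."""
    return nmse(reference_fn(inputs), backend_fn(inputs)) <= max_nmse


# Toy "op": scale every element by 2.
def reference_scale(xs):
    return [x * 2.0 for x in xs]


def backend_scale(xs):
    # Stand-in for a backend kernel with tiny rounding differences.
    return [x * 2.0 + 1e-9 for x in xs]


print(op_conforms(reference_scale, backend_scale, [1.0, 2.0, 3.0]))
```

A tolerance-based comparison is used here rather than exact equality because different backends may round floating-point results differently.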

Dependencies

llama.cpp keeps its dependencies minimal. The headers under vendor/ are single-file third-party libraries (stb_image.h, nlohmann/json.hpp, httplib.h, minja.hpp, etc.). System dependencies are all optional: they are enabled through CMake and gated by GGML_* and LLAMA_* flags, for instance LLAMA_CURL for HuggingFace downloads, GGML_BLAS for system BLAS, and GGML_CUDA/GGML_HIP/GGML_METAL for GPU backends. See Reference → Dependencies.

Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.
