llvm/llvm-project
By the numbers
Data collected on 2026-04-30 from the main branch at commit 2fdb09cf65e6.
This page is a quantitative snapshot of the llvm/llvm-project monorepo. It covers size, activity, and complexity. Numbers are approximate where rough scans (e.g., find | wc -l, wc -l) were used.
Size
- Total source files across all top-level subprojects: roughly 186,000 files (everything tracked by git plus tests, includes, configs).
- Total git commits: 578,857 as of this snapshot.
- Tagged releases: 316 (LLVM tags from llvmorg-1.0.0 in 2003 through llvmorg-20.x in 2025).
- First commit: 2001-06-06 ("New repository initialized by cvs2svn.").
Source lines by subproject
Approximate non-test, non-unittest source lines (.c, .cpp, .cc, .h, .hpp, .inc):
| Subproject | Approx. source lines |
|---|---|
| clang | 1,893,005 |
| llvm | 873,010 |
| lldb | 752,745 |
| mlir | 744,573 |
| polly | 390,638 |
| flang | 334,658 |
| compiler-rt | 238,030 |
| clang-tools-extra | 208,635 |
| libcxx | 207,808 |
| openmp | 120,561 |
| lld | 109,511 |
| bolt | 99,376 |
| offload | 38,586 |
| flang-rt | 38,001 |
| libc | 26,349 |
| libunwind | 20,341 |
| libclc | 18,422 |
| libcxxabi | 15,066 |
| orc-rt | 7,680 |
| libsycl | 4,550 |
xychart-beta horizontal
title "Source lines by subproject (thousands)"
x-axis ["clang", "llvm", "lldb", "mlir", "polly", "flang", "compiler-rt", "clang-tools-extra", "libcxx", "openmp", "lld", "bolt", "offload", "flang-rt", "libc", "libunwind", "libclc", "libcxxabi", "orc-rt", "libsycl"]
y-axis "Lines (thousands)" 0 --> 1900
bar [1893, 873, 753, 745, 391, 335, 238, 209, 208, 121, 110, 99, 39, 38, 26, 20, 18, 15, 8, 5]

Only Clang reaches seven figures (about 1.9 million lines); LLVM core and LLDB follow in the high six figures. LLDB is the surprise: it ships a lot of platform-specific debug-info handling, plus DWARF/PDB parsers, plus a Python plugin layer.
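The per-subproject line counts above could be approximated with a scan along the following lines. This is a minimal sketch, not the tool that produced the table: the function name `count_source_lines` and the set of skipped directory names (`test`, `tests`, `unittests`) are assumptions made for illustration, and a different exclusion rule would give different totals.

```python
import os
from collections import Counter

# Extensions listed in the section above.
SOURCE_EXTS = {".c", ".cpp", ".cc", ".h", ".hpp", ".inc"}
# Assumption: directories with these names hold tests and are excluded.
SKIP_DIRS = {"test", "tests", "unittests"}


def count_source_lines(repo_root):
    """Return a Counter mapping each top-level directory to its source line count."""
    totals = Counter()
    for dirpath, dirnames, filenames in os.walk(repo_root):
        # Prune test and hidden directories in place so os.walk does not descend into them.
        dirnames[:] = [
            d for d in dirnames if d not in SKIP_DIRS and not d.startswith(".")
        ]
        rel = os.path.relpath(dirpath, repo_root)
        if rel == ".":
            continue  # files directly in the repo root belong to no subproject
        subproject = rel.split(os.sep)[0]
        for name in filenames:
            if os.path.splitext(name)[1] in SOURCE_EXTS:
                with open(os.path.join(dirpath, name), "rb") as fh:
                    # Count physical lines, including blanks and comments.
                    totals[subproject] += sum(1 for _ in fh)
    return totals
```

Calling `count_source_lines("/path/to/llvm-project").most_common()` would list subprojects from largest to smallest. Because the scan counts blank and comment lines and prunes by directory name only, its output should be read as a rough figure, in the same spirit as the approximate numbers in the table.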
File counts by subproject
| Subproject | Files (incl. tests) |
|---|---|
| llvm | 78,731 |
| clang | 33,584 |
| libcxx | 12,296 |
| lldb | 9,241 |
| mlir | 7,247 |
| libc | 6,931 |
| flang | 5,431 |
| compiler-rt | 4,716 |
| lld | 4,137 |
| clang-tools-extra | 3,912 |
| polly | 2,925 |
| bolt | 1,190 |
| libclc | 1,018 |
| openmp | 851 |
| offload | 828 |
| third-party | 752 |
| cross-project-tests | 311 |
| flang-rt | 254 |
| libcxxabi | 162 |
| utils | 133 |
| orc-rt | 129 |
| libunwind | 76 |
| libsycl | 73 |
| runtimes | 12 |
| llvm-libgcc | 4 |
LLVM core has the most files because it carries every target backend — the llvm/lib/Target/ tree alone holds dozens of architectures with their own TableGen, instruction-info, and codegen passes.
Activity
Commits per year
xychart-beta
title "Commits per year (since 2001)"
x-axis ["'01", "'02", "'03", "'04", "'05", "'06", "'07", "'08", "'09", "'10", "'11", "'12", "'13", "'14", "'15", "'16", "'17", "'18", "'19", "'20", "'21", "'22", "'23", "'24", "'25"]
y-axis "Commits" 0 --> 45000
bar [1442, 3557, 4677, 6928, 5027, 7691, 9938, 12624, 23163, 23376, 20960, 20613, 24501, 24961, 29554, 32019, 28846, 28824, 33392, 35132, 32864, 37461, 37532, 37498, 41345]

The repo has been on a long, slow upward ramp since 2001, with a step change in 2009 (commit volume roughly doubled when many tools migrated under the LLVM umbrella) and another, smaller one in 2015 (when the project's contributor base broadened significantly). 2025 set a new annual record at 41,345 commits, and the project shows no sign of slowing.
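The step changes described above can be checked with simple year-over-year ratios. The sketch below uses the commit counts from the chart verbatim; the helper name `year_over_year` is introduced here for illustration only.

```python
# Commit counts per year, copied from the chart above (2001 through 2025).
COMMITS_PER_YEAR = dict(zip(range(2001, 2026), [
    1442, 3557, 4677, 6928, 5027, 7691, 9938, 12624, 23163, 23376,
    20960, 20613, 24501, 24961, 29554, 32019, 28846, 28824, 33392, 35132,
    32864, 37461, 37532, 37498, 41345,
]))


def year_over_year(commits):
    """Return {year: commits[year] / commits[year - 1]} for every year with a predecessor."""
    ratios = {}
    for year in sorted(commits):
        previous = commits.get(year - 1)
        if previous:
            ratios[year] = commits[year] / previous
    return ratios


ratios = year_over_year(COMMITS_PER_YEAR)
record_year = max(COMMITS_PER_YEAR, key=COMMITS_PER_YEAR.get)

for year in (2009, 2015):
    print(f"{year}: {ratios[year]:.2f}x the previous year")
print(f"record year: {record_year} ({COMMITS_PER_YEAR[record_year]:,} commits)")
```

On this data the 2009 ratio comes out at about 1.83 and the 2015 ratio at about 1.18, so the 2009 jump is close to a doubling while the 2015 step is considerably smaller.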
Recent activity
- Commits in the last 90 days: 11,414.
- Unique authors in the last 90 days: 1,412.
- Unique authors in the last 365 days: 2,578.
- Total unique authors all time: 7,200.
Churn hotspots (last 90 days)
The areas with the most touched files in the trailing 90 days:
| Directory | Files touched (90d) |
|---|---|
| llvm/test | 15,717 |
| llvm/lib | 8,150 |
| clang/test | 6,578 |
| clang/lib | 3,867 |
| lldb/source | 1,981 |
| libclc/clc | 1,912 |
| mlir/test | 1,838 |
| mlir/lib | 1,540 |
| libcxx/test | 1,504 |
| libc/src | 1,482 |
| llvm/include | 1,399 |
| clang/include | 1,384 |
| lldb/test | 1,217 |
| flang/test | 1,163 |
| libclc/opencl | 867 |
Tests drive the top of the list because every behavior change ships with regression coverage — that ratio is one of the project's strongest cultural signals.
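A table like the one above can be derived by grouping the paths touched in recent commits by their first two directory components. The following is a sketch under stated assumptions: it treats "files touched" as the number of distinct files per directory (the page does not say whether repeated touches are counted), and the helper names `list_touched_paths` and `touched_by_directory` are hypothetical.

```python
import subprocess
from collections import defaultdict


def list_touched_paths(repo, since="90 days ago"):
    """Return every path named by commits since the given date (duplicates included)."""
    out = subprocess.run(
        ["git", "-C", repo, "log", f"--since={since}",
         "--name-only", "--pretty=format:"],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.splitlines()


def touched_by_directory(paths, depth=2):
    """Count distinct files per directory prefix of the given depth, largest first."""
    files = defaultdict(set)
    for path in paths:
        path = path.strip()
        if not path:
            continue  # blank separator lines between commits
        parts = path.split("/")
        if len(parts) <= depth:
            continue  # the file sits above the grouping depth
        files["/".join(parts[:depth])].add(path)
    return sorted(
        ((directory, len(members)) for directory, members in files.items()),
        key=lambda item: (-item[1], item[0]),
    )
```

A typical call would be `touched_by_directory(list_touched_paths("/path/to/llvm-project"))[:15]`. Counting touches instead of distinct files would only require replacing the set with a counter.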
Bot-attributed commits
Out of 578,857 total commits, 8,018 (≈ 1.4%) carry an email matching [bot]@, noreply.github.com, dependabot, or github-actions — heuristically, bot or automation activity. The last 90 days alone show 0 explicit Co-authored-by: trailers from bots. This is a lower bound; inline AI tools (Copilot, etc.) leave no trace in commit metadata, and the project's strict commit-message style minimizes machine-generated trailers. Treat this number as an indicator of automation pipelines, not of AI-assisted authorship.
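The email heuristic described above amounts to a substring match. The sketch below is one possible reading of it, not the exact rule used to produce the figure; the names `BOT_MARKERS`, `is_bot_email`, and `bot_share` are introduced here for illustration.

```python
# Substrings named in the paragraph above; a match marks the commit as bot-attributed.
BOT_MARKERS = ("[bot]@", "noreply.github.com", "dependabot", "github-actions")


def is_bot_email(email):
    """Return True if the author email contains any of the bot markers."""
    lowered = email.lower()
    return any(marker in lowered for marker in BOT_MARKERS)


def bot_share(emails):
    """Return (bot_count, percentage) over an iterable of author emails."""
    emails = list(emails)
    if not emails:
        return 0, 0.0
    bots = sum(is_bot_email(e) for e in emails)
    return bots, 100.0 * bots / len(emails)


# Reproduce the percentage quoted in the text from its two totals.
quoted_share = 100.0 * 8018 / 578857
print(f"{quoted_share:.1f}%")
```

Note that a plain substring match on `noreply.github.com` also matches human contributors who commit with a GitHub-provided private address, so a stricter variant might require the `[bot]` marker as well.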
Complexity
- Targets in llvm/lib/Target/: dozens (X86, AArch64, ARM, RISCV, AMDGPU, NVPTX, WebAssembly, PowerPC, Mips, Hexagon, Sparc, SystemZ, BPF, AVR, Lanai, MSP430, VE, XCore, M68k, LoongArch, CSKY, DirectX, SPIRV, Xtensa, ARC, plus several test-only targets).
- Subprojects directly under the repo root: 25 (counting bolt, clang, clang-tools-extra, cross-project-tests, compiler-rt, flang, flang-rt, libc, libclc, libcxx, libcxxabi, libsycl, libunwind, lld, lldb, llvm, llvm-libgcc, mlir, offload, openmp, orc-rt, polly, runtimes, third-party, utils).
- Top-level lib/ directories in LLVM core: 56 (Analysis, AsmParser, BinaryFormat, Bitcode, CAS, CodeGen, DebugInfo, ExecutionEngine, IR, MC, Object, Support, TableGen, Target, Transforms, plus many more; see llvm/lib/).
Reading these numbers
These are scope indicators, not quality metrics. LLDB and Polly look large mostly because they carry duplicated tables (LLDB ships per-platform debug-info parsers; Polly imports parts of the Integer Set Library). Clang's line count includes generated headers from TableGen. The interesting numbers are the activity ones — they show a project with a wide and steady contributor base, sustained release cadence, and a healthy test-to-code ratio.
Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.