llvm

The llvm/ subproject is the LLVM core: the intermediate representation, the optimizer, the code generator, every target backend, and the support libraries that the rest of the monorepo depends on. Almost every other subproject in this repository links against llvm.

Purpose

llvm provides a complete static-compilation pipeline at the IR level: it can parse, build, and verify LLVM IR, run analyses and transforms over it, lower it through machine IR to a target ISA, and emit assembly or object code. It also provides a JIT framework (ORC), a link-time optimizer (LTO), and the support layer (data structures, command-line parsing, file I/O, debug info) that the rest of LLVM uses.

Directory layout

llvm/
├── include/llvm/         # Public C++ headers
├── lib/                  # Implementation, mirrored to include/
├── tools/                # Drivers: opt, llc, llvm-mc, llvm-objdump, ...
├── utils/                # llvm-tblgen, lit, FileCheck, update_*_test_checks.py, ...
├── docs/                 # Sphinx documentation source
├── examples/             # Standalone examples (Kaleidoscope tutorial, JIT examples)
├── test/                 # lit regression tests
├── unittests/            # Google-Test unit tests
├── cmake/                # CMake helper modules
├── benchmarks/           # Microbenchmarks
├── Maintainers.md        # Area maintainers
└── CMakeLists.txt

Inside llvm/lib/ the directories that matter most:

| Directory | What lives there |
| --- | --- |
| IR/ | LLVM IR types, instructions, modules, the verifier |
| Analysis/ | IR analyses: dominators, alias analysis, scalar evolution, value tracking, target transform info |
| Transforms/ | IR optimization passes (Scalar/, Vectorize/, IPO/, InstCombine/, Coroutines/, Instrumentation/, ObjCARC/) |
| Passes/ | Pass registry and PassBuilder, the entry point for the new pass manager |
| CodeGen/ | Target-independent codegen (SelectionDAG, GlobalISel, register allocation, scheduling) |
| MC/ | Machine-code layer: assembler, disassembler, object-file emission |
| Target/<Name>/ | Per-target backends |
| Object/, BinaryFormat/, ObjectYAML/, ObjCopy/ | Object-file formats |
| Bitcode/, Bitstream/ | Serialized IR |
| LTO/, DTLTO/ | Link-time optimization |
| ExecutionEngine/ | JITs: MCJIT and ORC (the current ORC APIs are often called ORCv2) |
| DebugInfo/, DWARFCFIChecker/, DWARFLinker/, DWP/, Debuginfod/ | Debug info |
| Support/, ADT/ (headers under include/llvm/ADT/), TargetParser/ | Foundational utilities |
| TableGen/ | TableGen interpreter; utils/TableGen/ houses the backends |
| Demangle/ | Symbol demangling |
| Remarks/ | Optimization-remark serialization |
| XRay/, ProfileData/, CGData/ | Profile and instrumentation support |
| MCA/ | Machine-Code Analyzer (CPU-pipeline simulator) |
| SandboxIR/ | Sandbox IR, an experimental IR layer on top of LLVM IR |

Key abstractions

| Type | File | Role |
| --- | --- | --- |
| llvm::Module | llvm/include/llvm/IR/Module.h | Compilation unit at IR level |
| llvm::Function | llvm/include/llvm/IR/Function.h | An IR function |
| llvm::Instruction | llvm/include/llvm/IR/Instruction.h | Single SSA instruction |
| llvm::PassManager<> | llvm/include/llvm/IR/PassManager.h | The new pass manager |
| llvm::PassBuilder | llvm/include/llvm/Passes/PassBuilder.h | Pipeline construction |
| llvm::TargetMachine | llvm/include/llvm/Target/TargetMachine.h | Target description and entry point to the codegen pipeline |
| llvm::MachineFunction | llvm/include/llvm/CodeGen/MachineFunction.h | Codegen-side IR (post-isel) |
| llvm::MCStreamer / llvm::MCContext | llvm/include/llvm/MC/ | Assembly and object-file emission |
| llvm::SelectionDAG | llvm/include/llvm/CodeGen/SelectionDAG.h | DAG-based ISel |
| llvm::InstructionSelector | llvm/include/llvm/CodeGen/GlobalISel/ | GlobalISel ISel |
| llvm::raw_ostream | llvm/include/llvm/Support/raw_ostream.h | Streaming output (replaces std::ostream) |
| llvm::Error / llvm::Expected<T> | llvm/include/llvm/Support/Error.h | Typed error handling |
| ADT containers | llvm/include/llvm/ADT/ | SmallVector, DenseMap, StringRef, Twine, etc. |
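
To give a feel for the "typed error handling" role of llvm::Expected<T>, here is a toy, standard-library-only sketch of the same idea: a return type that carries either a value or an error and is tested before use. ExpectedSketch and parseUnsigned are hypothetical names made up for this illustration; this is not LLVM's implementation, which lives in llvm/include/llvm/Support/Error.h and has a richer error model.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Toy stand-in for the Expected<T> idea: holds either a T or an error
// message. Hypothetical type for illustration, not LLVM's class.
template <typename T> class ExpectedSketch {
  bool HasValue;
  T Value{};
  std::string Err;

public:
  // Success: implicitly constructible from a value.
  ExpectedSketch(T V) : HasValue(true), Value(std::move(V)) {}

  // Failure: built through a named factory so it cannot be confused
  // with a successful string value.
  static ExpectedSketch failure(std::string Msg) {
    ExpectedSketch E{T{}};
    E.HasValue = false;
    E.Err = std::move(Msg);
    return E;
  }

  // True when a value is present, so the object can be tested in an if.
  explicit operator bool() const { return HasValue; }

  const T &get() const {
    assert(HasValue && "no value present");
    return Value;
  }
  const std::string &error() const { return Err; }
};

// Parses a non-negative decimal integer (no overflow check in this sketch).
inline ExpectedSketch<int> parseUnsigned(const std::string &Text) {
  if (Text.empty())
    return ExpectedSketch<int>::failure("empty input");
  int Result = 0;
  for (char C : Text) {
    if (C < '0' || C > '9')
      return ExpectedSketch<int>::failure("not a digit: " + std::string(1, C));
    Result = Result * 10 + (C - '0');
  }
  return Result;
}
```

A caller would write something like `if (auto V = parseUnsigned(Input)) use(V.get()); else report(V.error());`, where use and report are placeholders. The point of the pattern is that the failure path is part of the return type rather than an out-parameter or an exception.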

How it works

graph TD
    src["Front-end IR<br/>(.bc / .ll / in-memory)"] --> mod[llvm::Module]
    mod --> verify[Verifier]
    verify --> ir_passes["IR pass pipeline<br/>(InstCombine, GVN, Inliner, ...)"]
    ir_passes --> codegen[CodeGen / TargetMachine]
    codegen --> isel["Instruction selection<br/>(SelectionDAG or GlobalISel)"]
    isel --> mi[MachineFunction / MachineInstr]
    mi --> regalloc[Register allocation]
    regalloc --> sched[Scheduling]
    sched --> mc[MC layer]
    mc --> obj[".o / .s output"]

A typical static compilation:

  1. The front end (Clang, Flang, ...) produces a Module.
  2. The PassBuilder constructs a pass pipeline (e.g., default<O2>) and runs it.
  3. The TargetMachine for the requested triple adds its codegen passes to a legacy::PassManager (via addPassesToEmitFile; codegen still runs mainly on the legacy pass manager), which carries the IR through codegen.
  4. SelectionDAG or GlobalISel converts IR into MachineInstrs.
  5. Register allocation (greedy, fast, pbqp, or basic), instruction scheduling, and prolog/epilog insertion run over the MachineInstrs.
  6. The MC streamer emits assembly or object bytes.
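
The steps above can be mimicked in a toy, standard-library-only model: an IR-level pass, a lowering step to machine-style instructions, and text emission. This is only a sketch of the shape of the pipeline. ToyInst, ToyModule, ToyMachineInst, and every function name are hypothetical and none of it is LLVM's API. Step 3 (the TargetMachine and its pass manager) is not modelled, and the register "allocator" is deliberately trivial.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Step 1 stand-in: a "module" is straight-line code. Instruction i defines
// value i. Op 'c' is a constant (A holds the literal); '+' and '*' read the
// earlier values numbered A and B.
struct ToyInst {
  char Op;
  int A;
  int B;
};
using ToyModule = std::vector<ToyInst>;

// Step 2 stand-in: one IR pass. Rewrites '+' or '*' into a constant when
// both operands are constants.
inline void foldConstants(ToyModule &M) {
  for (std::size_t Idx = 0; Idx < M.size(); ++Idx) {
    ToyInst &I = M[Idx];
    if (I.Op == 'c')
      continue;
    assert(I.A >= 0 && static_cast<std::size_t>(I.A) < Idx &&
           I.B >= 0 && static_cast<std::size_t>(I.B) < Idx &&
           "operands must name earlier values");
    const ToyInst L = M[I.A], R = M[I.B];
    if (L.Op == 'c' && R.Op == 'c')
      I = ToyInst{'c', I.Op == '+' ? L.A + R.A : L.A * R.A, 0};
  }
}

// Step 4 stand-in: "instruction selection" into machine-style instructions
// that still refer to virtual register numbers.
struct ToyMachineInst {
  std::string Opcode;
  int Dst;
  int Src1;
  int Src2;
};

inline std::vector<ToyMachineInst> selectInstructions(const ToyModule &M) {
  std::vector<ToyMachineInst> Out;
  for (std::size_t Idx = 0; Idx < M.size(); ++Idx) {
    const ToyInst &I = M[Idx];
    const int Dst = static_cast<int>(Idx);
    if (I.Op == 'c')
      Out.push_back({"li", Dst, I.A, 0}); // load immediate
    else
      Out.push_back({I.Op == '+' ? "add" : "mul", Dst, I.A, I.B});
  }
  return Out;
}

// Steps 5 and 6 stand-in: virtual register N is simply given physical
// register rN (no reuse, no spilling), then the result is printed as text.
inline std::string emitAssembly(const std::vector<ToyMachineInst> &MIs) {
  std::string Asm;
  for (const ToyMachineInst &MI : MIs) {
    Asm += MI.Opcode + " r" + std::to_string(MI.Dst) + ", ";
    if (MI.Opcode == "li")
      Asm += std::to_string(MI.Src1);
    else
      Asm += "r" + std::to_string(MI.Src1) + ", r" + std::to_string(MI.Src2);
    Asm += "\n";
  }
  return Asm;
}

// Whole toy pipeline: IR pass, then lowering, then emission.
inline std::string compile(ToyModule M) {
  foldConstants(M);
  return emitAssembly(selectInstructions(M));
}
```

For a module computing 2 + 3, running selectInstructions without the fold yields two li instructions and an add, while compile replaces the add with a single li of the folded value. The two original constants are left behind as dead code, which a real pipeline would also delete.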

opt (llvm/tools/opt/), llc (llvm/tools/llc/), and llvm-mc (llvm/tools/llvm-mc/) are the canonical drivers exposing the IR pipeline, codegen pipeline, and MC layer respectively.

Targets

llvm/lib/Target/ holds dozens of architectures. Each backend follows broadly the same internal layout: <Name>ISelLowering, <Name>InstrInfo, <Name>RegisterInfo, <Name>Subtarget, <Name>FrameLowering, <Name>TargetMachine, <Name>AsmPrinter, plus *.td files describing the ISA. To add a new target, start with the "Writing an LLVM Backend" guide in the LLVM documentation; small existing targets (the early RISCV commits, BPF) are useful reading material.

Tools

The llvm/tools/ directory holds dozens of executables: opt, llc, llvm-mc, llvm-objdump, llvm-readobj, llvm-nm, llvm-symbolizer, llvm-dwarfdump, llvm-pdbutil, llvm-profdata, llvm-cov, llvm-cxxfilt, llvm-bcanalyzer, llvm-link, llvm-extract, llvm-as, llvm-dis, bugpoint, llvm-reduce, llvm-mca, and many more.

Integration points

Entry points for modification

  • Adding a transform pass: write a PassInfoMixin<>-based class in llvm/lib/Transforms/, register it in llvm/lib/Passes/PassRegistry.def, and add an IR test under llvm/test/Transforms/.
  • Adding a codegen pass: add the pass to llvm/lib/CodeGen/ and register it in llvm/lib/CodeGen/CodeGen.cpp (in initializeCodeGen).
  • Adding a backend feature: most changes live in llvm/lib/Target/<Name>/. The .td files declare the ISA; the .cpp files implement the C++ glue.
  • Touching the IR: changes to Instruction.h, Constants.h, the Verifier, or bitcode formats require corresponding test updates and a release-note entry.
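
The shape described in the first bullet (a class that derives from a CRTP mixin, exposes a run method, and is added to a pass manager under a name) can be sketched without LLVM's headers. The following is a toy model under stated assumptions: ToyFunction, PassInfoMixinSketch, DropNopsPass, and ToyPassManager are hypothetical names invented for illustration. In LLVM itself a function pass's run method takes a Function and a FunctionAnalysisManager and returns PreservedAnalyses, and PassInfoMixin<> derives the pass name from the type rather than from a hand-written hook.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an IR function: a name plus a list of opcode names.
struct ToyFunction {
  std::string Name;
  std::vector<std::string> Insts;
};

// CRTP mixin in the spirit of PassInfoMixin<>: gives every pass a static
// name() without virtual functions. In this sketch each pass supplies a
// passName() hook.
template <typename DerivedT> struct PassInfoMixinSketch {
  static std::string name() { return DerivedT::passName(); }
};

// A transform pass: removes "nop" instructions and reports whether it
// changed the function.
struct DropNopsPass : PassInfoMixinSketch<DropNopsPass> {
  static const char *passName() { return "drop-nops"; }

  bool run(ToyFunction &F) {
    const auto OldSize = F.Insts.size();
    F.Insts.erase(std::remove_if(F.Insts.begin(), F.Insts.end(),
                                 [](const std::string &Op) {
                                   return Op == "nop";
                                 }),
                  F.Insts.end());
    return F.Insts.size() != OldSize;
  }
};

// Minimal pass manager: stores type-erased passes and runs them in order.
class ToyPassManager {
  struct Entry {
    std::string Name;
    std::function<bool(ToyFunction &)> Run;
  };
  std::vector<Entry> Passes;

public:
  template <typename PassT> void addPass(PassT Pass) {
    Passes.push_back({PassT::name(), [Pass](ToyFunction &F) mutable {
                        return Pass.run(F);
                      }});
  }

  // Runs every pass; returns the names of the passes that changed F.
  std::vector<std::string> run(ToyFunction &F) {
    std::vector<std::string> Changed;
    for (Entry &P : Passes) {
      assert(P.Run && "pass entry must be callable");
      if (P.Run(F))
        Changed.push_back(P.Name);
    }
    return Changed;
  }
};
```

The mixin keeps each pass a plain value type: the manager only needs a name and a callable, so no common virtual base class is required in this sketch.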

Reference

Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.
