AUTOMATIC1111/stable-diffusion-webui

Architecture

This page explains how the program starts, where the main subsystems live, and how a single image-generation request flows through the codebase.

High-level shape

graph TD
    subgraph Boot
        Launch[launch.py] --> LU[modules/launch_utils.py<br/>prepare_environment]
        LU -->|venv + pip install| Webui[webui.py]
    end

    Webui -->|webui()| UI[modules/ui.py<br/>create_ui]
    Webui -->|api_only()| FAPI[FastAPI app]
    UI --> Gradio[Gradio Blocks]
    Gradio -->|HTTP| FAPI
    FAPI -->|Api class| APIRoutes[modules/api/api.py]

    UI -->|user clicks Generate| Pipe[modules/processing.py<br/>process_images]
    APIRoutes -->|POST /sdapi/v1/txt2img| Pipe

    Pipe --> Models[modules/sd_models.py]
    Pipe --> Samplers[modules/sd_samplers*.py]
    Pipe --> Scripts[modules/scripts.py<br/>ScriptRunner]
    Scripts --> Exts[extensions/ + extensions-builtin/]
    Models -->|loads .safetensors| Disk[(models/Stable-diffusion)]
    Pipe --> Images[modules/images.py<br/>save + grids]

Boot sequence

The user runs python launch.py (or webui.sh / webui.bat, which create a venv and then invoke launch.py). Boot proceeds in three stages.

  1. Environment preparation. launch.py imports modules/launch_utils.py and calls prepare_environment(). That function pip-installs PyTorch (a CUDA, ROCm, MPS, IPEX or CPU build, depending on flags), the packages pinned in requirements_versions.txt, and xFormers if requested. It then clones four upstream repos (stable-diffusion-stability-ai, BLIP, k-diffusion, generative-models) into repositories/.

  2. Module imports and version checks. webui.py calls initialize.imports() and initialize.check_versions() from modules/initialize.py. This pre-imports torch/gradio/transformers and warns if installed package versions don't match the locked versions.

  3. Application startup. Either webui() (with UI) or api_only() is called. Both call initialize.initialize(), which loads command-line options, sets devices, scans extensions, loads scripts, sets up samplers, loads the default checkpoint, and registers signal handlers. After that, either the Gradio Blocks UI is built (ui.create_ui()) or a bare FastAPI app is created. The API routes are then registered and the server starts listening (the Gradio UI defaults to port 7860).

A startup-time profiler (modules/timer.py) records each phase. The summary printed on stdout (Startup time: ...) comes from startup_timer.summary().
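To make the staged boot and its timing summary more concrete, here is a minimal sketch of a per-phase profiler driving three stages in order. The PhaseTimer class, the boot function, the phase labels and the three stage callables are hypothetical names chosen for illustration; this is not the code in modules/timer.py, only a model of the idea that each record call attributes the time since the previous call to a named phase.

```python
import time


class PhaseTimer:
    """Hypothetical stand-in for a startup profiler; not the real modules/timer.py."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self._start = clock()
        self._last = self._start
        self.records = {}  # phase name -> seconds spent in that phase

    def record(self, phase):
        """Attribute the time elapsed since the previous call to `phase`."""
        now = self._clock()
        self.records[phase] = self.records.get(phase, 0.0) + (now - self._last)
        self._last = now

    def summary(self):
        """Return 'total (phase: secs, ...)' with phases in the order recorded."""
        total = self._last - self._start
        parts = ", ".join(f"{name}: {secs:.1f}s" for name, secs in self.records.items())
        return f"{total:.1f}s ({parts})"


def boot(timer, prepare_environment, import_modules, start_app):
    """Run the three boot stages in order, recording each one.

    The three callables are assumptions standing in for the real stages.
    """
    prepare_environment()
    timer.record("prepare environment")
    import_modules()
    timer.record("imports")
    start_app()
    timer.record("app startup")
    return "Startup time: " + timer.summary()
```

Passing the clock in as a parameter keeps the sketch easy to test with fixed timestamps; a real profiler would simply read the system clock.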

The processing pipeline

When the user clicks Generate, or a client POSTs to /sdapi/v1/txt2img, control flows through:

sequenceDiagram
    participant Client
    participant UI as ui.py / api.py
    participant Q as call_queue.queue_lock
    participant P as processing.process_images
    participant S as ScriptRunner
    participant Sampler
    participant SDModel as sd_model
    participant Images as images.save_image

    Client->>UI: prompt, params, scripts
    UI->>Q: acquire lock (mutex)
    UI->>P: StableDiffusionProcessingTxt2Img(...)
    P->>S: process(p)
    S->>S: run alwayson scripts (ControlNet, ADetailer, ...)
    P->>SDModel: encode prompt
    P->>Sampler: sample(steps, seed, ...)
    Sampler-->>P: latent
    P->>SDModel: decode_first_stage
    P->>S: postprocess_image(p, image)
    P->>Images: save_image(image, path)
    P-->>UI: Processed (images, infotexts)
    UI->>Q: release lock
    UI-->>Client: response

Two key abstractions live in modules/processing.py:

  • StableDiffusionProcessing: the dataclass that captures every input parameter for a generation job. Its subclasses StableDiffusionProcessingTxt2Img and StableDiffusionProcessingImg2Img cover the two generation modes.
  • Processed: the result object returned to the caller (images, infotexts, seeds, comments).

process_images_inner() is the actual loop: it runs scripts at every callback point, does the prompt encoding (with modules/prompt_parser.py), creates noise, runs the sampler, applies face restoration and upscaling, saves images, and returns.
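The shape of that loop can be sketched in a few lines. Everything below is a heavily reduced model under stated assumptions: Job, Result, Hooks and run_job are hypothetical names, the encode, sample, decode and save steps are passed in as plain callables, and only two hook points (process and postprocess_image, as named in the sequence diagram) are modelled. It is not the code in modules/processing.py.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """Hypothetical, heavily reduced stand-in for StableDiffusionProcessing."""
    prompt: str
    steps: int = 20
    seed: int = 0
    batch_size: int = 1


@dataclass
class Result:
    """Hypothetical stand-in for Processed: what the caller gets back."""
    images: list = field(default_factory=list)
    infotexts: list = field(default_factory=list)


class Hooks:
    """Stand-in for a script runner: scripts are objects with optional hook methods."""

    def __init__(self, scripts):
        self.scripts = scripts

    def call(self, hook, *args):
        # Invoke `hook` on every script that defines it, in registration order.
        for script in self.scripts:
            method = getattr(script, hook, None)
            if method is not None:
                method(*args)


def run_job(job, hooks, encode, sample, decode, save):
    """Encode, sample, decode, postprocess and save, with hooks in between."""
    result = Result()
    hooks.call("process", job)  # scripts may modify the job before work starts
    cond = encode(job.prompt)
    for i in range(job.batch_size):
        seed = job.seed + i  # assumption of this sketch: consecutive seeds per image
        latent = sample(cond, job.steps, seed)
        holder = {"image": decode(latent)}
        hooks.call("postprocess_image", job, holder)  # scripts may swap the image
        save(holder["image"])
        result.images.append(holder["image"])
        result.infotexts.append(f"{job.prompt}\nSteps: {job.steps}, Seed: {seed}")
    return result
```

Passing the image inside a mutable holder is one simple way to let a hook replace it; the design point is that the pipeline owns the ordering while scripts only see well-defined callback points.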

Subsystems

The codebase doesn't use a hard "layers" architecture. Instead it has a flat set of cooperating modules under modules/. The wiki groups them into the following categories:

| Category | What it owns | Wiki page |
| --- | --- | --- |
| UI | Gradio Blocks construction, tabs, components, settings page | systems/ui.md |
| Processing | Txt2img/img2img main pipeline | systems/processing.md |
| Models | Checkpoint discovery, loading, switching, hashing, VAE | systems/models.md |
| Samplers & schedulers | Wrappers around k-diffusion, LCM, timestep samplers | systems/samplers-and-schedulers.md |
| SD hijack | Monkey-patches the upstream LDM model for prompt syntax, attention optimisations | systems/sd-hijack.md |
| Scripts & extensions | The plugin system; alwayson, txt2img-only, img2img-only scripts | systems/scripts-and-extensions.md |
| Script callbacks | The 30+ named hooks extensions can subscribe to | systems/script-callbacks.md |
| Extra networks | Lora/embedding/hypernetwork browser, prompt syntax <lora:name:weight> | systems/extra-networks.md |
| Postprocessing | Upscalers, face restoration, the Extras tab pipeline | systems/postprocessing.md |
| Training | Textual inversion and hypernetwork training | features/training.md |
| API | FastAPI routes and Pydantic models | api/index.md |

Threading and concurrency

The app is mostly single-threaded for GPU work:

  • A FIFO queue_lock (a FIFOLock from modules/fifo_lock.py, acquired by wrap_gradio_gpu_call in modules/call_queue.py) serialises all generation jobs.
  • Gradio runs its own worker threads for HTTP handling, which is why the lock exists.
  • Some side tasks run asynchronously: modules/extensions.py reads commit info via threads, and modules/progress.py polls a task queue. With --api, API calls can interleave with UI requests, but the lock still applies to them.
  • Restart and reload are implemented by closing shared.demo, re-running initialize.initialize_rest(), and re-creating the Gradio UI in a while 1 loop in webui.py — the process never exits unless the user requests a stop.
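The serialisation described in the first bullet can be illustrated with a small first-in, first-out mutex and a decorator that holds it around each job. This is a sketch under assumptions, not the implementation in modules/fifo_lock.py: TicketLock and gpu_call are hypothetical names, and the ticket-counter technique used here is simply one common way to guarantee arrival-order fairness, which a plain threading.Lock does not promise.

```python
import threading


class TicketLock:
    """Hypothetical FIFO mutex: waiters are served strictly in arrival order."""

    def __init__(self):
        self._cond = threading.Condition()
        self._next_ticket = 0   # ticket handed to the next arriving thread
        self._now_serving = 0   # ticket currently allowed to hold the lock

    def acquire(self):
        with self._cond:
            ticket = self._next_ticket
            self._next_ticket += 1
            # Sleep until every earlier ticket holder has released.
            while ticket != self._now_serving:
                self._cond.wait()

    def release(self):
        with self._cond:
            self._now_serving += 1
            self._cond.notify_all()  # wake waiters; only the next ticket proceeds

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.release()


queue_lock = TicketLock()


def gpu_call(func):
    """Decorator sketch: run `func` only while holding the shared lock."""
    def wrapped(*args, **kwargs):
        with queue_lock:
            return func(*args, **kwargs)
    return wrapped
```

Any number of HTTP worker threads can call a function wrapped this way; at most one body runs at a time, and queued requests complete in the order they arrived.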

Configuration plumbing

There are three levels of configuration, all exposed through modules/shared.py:

  1. Command-line flags (shared.cmd_opts) — defined in modules/cmd_args.py. Examples: --medvram, --xformers, --api, --listen. About 130 flags.
  2. User options (shared.opts) — declared in modules/shared_options.py, persisted to config.json in the data dir, edited from the Settings tab. Around 400 settings.
  3. UI state (ui-config.json) — captures slider/dropdown defaults; managed by modules/ui_loadsave.py.

Extensions can register their own settings by calling shared.options_templates.update(...) in their on_ui_settings callback.
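As a rough model of the first two levels, the sketch below parses a handful of flags with argparse and keeps user options as declared defaults plus a JSON file of overrides. The Options class, its methods, and the option names save_images and my_ext_strength are hypothetical; only the flag names and the config.json filename come from the text above. The third level (ui-config.json) is not modelled.

```python
import argparse
import json
import os


def parse_flags(argv):
    """Level 1: command-line flags, fixed for the life of the process."""
    parser = argparse.ArgumentParser()
    for flag in ("--medvram", "--xformers", "--api", "--listen"):
        parser.add_argument(flag, action="store_true")
    return parser.parse_args(argv)


class Options:
    """Level 2 (hypothetical class): declared defaults plus persisted overrides."""

    def __init__(self, templates):
        self.templates = dict(templates)  # option name -> default value
        self.data = {}                    # values the user has changed

    def get(self, name):
        # A stored value wins; otherwise fall back to the declared default.
        if name in self.data:
            return self.data[name]
        return self.templates[name]

    def set(self, name, value):
        if name not in self.templates:
            raise KeyError(f"unknown option: {name}")
        self.data[name] = value

    def save(self, path):
        with open(path, "w", encoding="utf-8") as f:
            json.dump(self.data, f, indent=4)

    def load(self, path):
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                self.data = json.load(f)
```

In this model an extension would add its own defaults with something like opts.templates.update({"my_ext_strength": 0.5}), after which the new option can be read, changed and saved like any built-in one.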

For a deeper view of any individual subsystem, follow the links above or jump to the systems/ index.

Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.
