AUTOMATIC1111/stable-diffusion-webui

Processing pipeline

Active contributors: AUTOMATIC1111, w-e-w, catboxanon, Kohaku-Blueleaf, light-and-ray

Purpose

The processing pipeline is the core of the application: it takes a fully-populated StableDiffusionProcessing instance, runs the diffusion sampler, applies any post-processing, saves the resulting images, and returns a Processed object. Both the UI handlers (txt2img.py, img2img.py) and the API endpoints (/sdapi/v1/txt2img, /sdapi/v1/img2img) funnel into the same code in modules/processing.py.
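As an illustration of the API route into this pipeline, the sketch below builds (but does not send) a POST request for /sdapi/v1/txt2img. The host, port, and the two payload keys are assumptions for the example, and build_txt2img_request is a hypothetical helper, not part of the WebUI.

```python
import json
import urllib.request

# Assumption: a local WebUI with its API enabled, listening on port 7860.
BASE_URL = "http://127.0.0.1:7860"


def build_txt2img_request(prompt, steps=20, base_url=BASE_URL):
    """Build, but do not send, a POST request for /sdapi/v1/txt2img."""
    payload = {"prompt": prompt, "steps": steps}  # assumed payload keys
    return urllib.request.Request(
        url=f"{base_url}/sdapi/v1/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_txt2img_request("a lighthouse at dusk")

# Sending the request needs a running server, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Whatever the payload contains, the server side ends up constructing a processing object and handing it to the same code path the UI uses.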

Directory layout

modules/
├── processing.py                   # ~1,800 lines: pipeline + dataclasses
├── processing_scripts/             # alwayson built-ins
│   ├── seed.py                     # the seed/subseed UI and infotext registration
│   ├── sampler.py                  # the sampler/scheduler UI
│   ├── refiner.py                  # SDXL refiner mid-generation switch
│   └── comments.py                 # extracts # comments from prompts
├── prompt_parser.py                # Lark grammar for attention/scheduling/AND
├── sd_samplers*.py                 # individual sampler implementations
├── rng.py / rng_philox.py          # CPU + Philox seedable noise
├── lowvram.py                      # weight-shuffling for low-VRAM mode
├── extra_networks.py               # parses <lora:foo:1.0> tokens out of the prompt
└── images.py                       # save_image, grid building, infotext writing

Key abstractions

| Type | File | Description |
| --- | --- | --- |
| StableDiffusionProcessing | modules/processing.py | Dataclass with every parameter for a generation job. ~50 fields. |
| StableDiffusionProcessingTxt2Img | same | Adds hires-fix fields (enable_hr, hr_scale, hr_upscaler, …). |
| StableDiffusionProcessingImg2Img | same | Adds img2img fields (init_images, mask, denoising_strength, inpaint_full_res, …). |
| Processed | same | Result; carries images, infotexts, all_seeds, all_subseeds, comments. |
| process_images(p) | same | Public entry point; wraps process_images_inner(p) with model swap and lock. |
| process_images_inner(p) | same | The actual loop; ~290 lines covering script hooks, sampling, post-processing, saving. |
| create_infotext(...) | same | Builds the metadata string saved into PNG/EXIF. |
| decode_latent_batch(...) | same | VAE decode with NaN handling and lowvram-aware batching. |
| apply_overlay(...), apply_color_correction(...) | same | Post-sample hooks for img2img masking and tonal matching. |
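To make the relationship between the parameter object, the result object, and the infotext string concrete, here is a heavily reduced sketch. MiniProcessing, MiniProcessed, and mini_infotext are hypothetical stand-ins with only a handful of fields; the real classes carry far more state, and the exact infotext layout shown here is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class MiniProcessing:
    """Hypothetical, reduced stand-in for StableDiffusionProcessing."""
    prompt: str = ""
    negative_prompt: str = ""
    steps: int = 20
    seed: int = -1
    width: int = 512
    height: int = 512


@dataclass
class MiniProcessed:
    """Hypothetical stand-in for Processed: images plus per-image metadata."""
    images: list = field(default_factory=list)
    infotexts: list = field(default_factory=list)
    all_seeds: list = field(default_factory=list)


def mini_infotext(p, seed):
    """Build a metadata string in the spirit of create_infotext()."""
    params = {"Steps": p.steps, "Seed": seed, "Size": f"{p.width}x{p.height}"}
    lines = [p.prompt]
    if p.negative_prompt:
        lines.append(f"Negative prompt: {p.negative_prompt}")
    lines.append(", ".join(f"{key}: {value}" for key, value in params.items()))
    return "\n".join(lines)


p = MiniProcessing(prompt="a cat", negative_prompt="blurry", seed=42)
text = mini_infotext(p, seed=42)
result = MiniProcessed(images=["<image>"], infotexts=[text], all_seeds=[42])
```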

How it works

graph TD
    Caller[txt2img.py / api.py] -->|p| PI[process_images]
    PI -->|swap model if needed| SW[sd_models.reload_model_weights]
    PI --> PII[process_images_inner]

    PII -->|fix_seed| Seed
    PII -->|extra_networks.parse_prompts| EN[extra_networks.activate]
    EN --> Loras[Lora / hypernet / TI patches]
    PII -->|setup_conds| Cond[prompt_parser + CLIP]
    PII -->|p.scripts.process| Scripts1[alwayson scripts: process]

    PII --> Loop{for batch in batches}
    Loop -->|create noise| RNG
    Loop -->|p.sample| Sampler
    Sampler --> CFGDenoiser
    CFGDenoiser --> UNet
    Loop -->|p.scripts.process_batch / before_hr| Scripts2[scripts: per-batch hooks]
    Loop -->|decode_first_stage| VAE
    Loop -->|p.scripts.postprocess_image| Scripts3[scripts: postprocess_image]
    Loop -->|face restoration / color correction / overlay| Post
    Loop -->|images.save_image| Save
    Loop --> Loop

    PII -->|p.scripts.postprocess| Scripts4[scripts: postprocess]
    PII -->|extra_networks.deactivate| END
    PII --> R[Processed]
    R --> Caller

The full loop, with line references in modules/processing.py:

  1. Setup: process_images() (line ~819) saves the current sd_model_checkpoint and sd_vae settings, and ensures the right model is loaded if the request specifies an override.
  2. Inner pipeline: process_images_inner() (line ~863) runs the following steps:
    1. Set seed (fix_seed), build comments dict, set state.job_count.
    2. Parse extra-network tokens out of the prompt with extra_networks.parse_prompts(). This rewrites prompt to remove <lora:foo:1.0> and produces an activation list passed to extra_networks.activate(p, ...) later.
    3. Encode the prompt and negative prompt into conditioning with setup_conds(). This handles attention syntax, prompt scheduling ([a:b:0.5]), and AND-composable diffusion via modules/prompt_parser.py.
    4. Call p.scripts.process(p) — alwayson scripts get a chance to mutate p before sampling.
    5. For each batch (for n in range(p.n_iter)):
      • Generate noise via modules/rng.py.
      • Call p.sample() (subclass-specific). This in turn calls create_sampler() from modules/sd_samplers.py and runs the chosen sampler.
      • Hires fix (Txt2Img only): if enable_hr, upscale latents (or decode/re-encode for tile-based upscalers), then run a second pass with the hires sampler.
      • Refiner (SDXL): if a refiner model is set, switch checkpoints at the configured step. See modules/processing_scripts/refiner.py.
      • VAE decode → clamp → convert to PIL.
      • Run face restoration if enabled.
      • Apply color correction (img2img with apply_color_correction).
      • Apply mask overlay for inpainting (apply_overlay).
      • p.scripts.postprocess_image(p, image) — alwayson scripts can edit the final image.
      • images.save_image(image, ...) — write the file with infotext metadata.
    6. Build Processed with infotexts (one per image) and return.
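The ordering of the steps above can be sketched with every stage stubbed out. This is not the WebUI code: run_pipeline, parse_extra_networks, the hooks dictionary, and the regular expression are all hypothetical and exist only to show where token parsing and the script hooks sit relative to the batch loop.

```python
import re

# Hypothetical pattern for tokens such as <lora:foo:1.0>.
EXTRA_NETWORK_RE = re.compile(r"<(\w+):([^>]+)>")


def parse_extra_networks(prompt):
    """Strip <kind:args> tokens; return (clean_prompt, activation list)."""
    activations = [
        (match.group(1), match.group(2).split(":"))
        for match in EXTRA_NETWORK_RE.finditer(prompt)
    ]
    return EXTRA_NETWORK_RE.sub("", prompt).strip(), activations


def run_pipeline(prompt, n_iter, hooks):
    """Walk the stages in the listed order, with each stage stubbed."""
    trace = []
    prompt, activations = parse_extra_networks(prompt)    # step 2
    trace.append(("parse", prompt, activations))
    trace.append(("setup_conds", prompt))                 # step 3 (stub)
    hooks.get("process", lambda: None)()                  # step 4
    images = []
    for n in range(n_iter):                               # step 5
        latent = f"latent[{n}]"                           # noise + sampling (stub)
        image = f"decoded({latent})"                      # VAE decode (stub)
        image = hooks.get("postprocess_image", lambda img: img)(image)
        images.append(image)                              # saving would happen here
    return images, trace                                  # step 6 (stub result)


images, trace = run_pipeline(
    "a cat <lora:foo:1.0>",
    n_iter=2,
    hooks={"postprocess_image": lambda img: img.upper()},
)
```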

Hires fix

The hires-fix path is internal to StableDiffusionProcessingTxt2Img.sample(). It runs the regular sampler at low res, then either:

  • Latent upscale — bicubic upscale the latent and run the sampler again at the new size (cheap, can blur).
  • Pixel upscale — VAE-decode, run the chosen upscaler (ESRGAN, SwinIR, …), VAE-encode, and run the sampler again (slower, sharper).

Alwayson scripts can hook the gap between the two passes: their before_hr method is called after the low-res pass and before the hires pass.
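A minimal sketch of the two-pass control flow, assuming each stage is supplied by the caller as a plain function. hires_sample and its parameters are hypothetical names; the stubs only record the order of calls, which is the point of the example.

```python
def hires_sample(sample, upscale_latent, decode, upscale_image, encode,
                 before_hr, latent_mode):
    """Two-pass hires flow; every callable is a caller-supplied stand-in."""
    latent = sample(None)                        # first pass at low resolution
    before_hr()                                  # script hook between the passes
    if latent_mode:
        latent = upscale_latent(latent)          # resize directly in latent space
    else:
        image = upscale_image(decode(latent))    # decode, then run the upscaler
        latent = encode(image)                   # re-encode for the second pass
    return sample(latent)                        # second pass at the new size


calls = []


def stub(name):
    """Return a stage that records its own name when called."""
    def stage(_value=None):
        calls.append(name)
        return name
    return stage


hires_sample(
    sample=stub("sample"),
    upscale_latent=stub("upscale_latent"),
    decode=stub("decode"),
    upscale_image=stub("upscale_image"),
    encode=stub("encode"),
    before_hr=stub("before_hr"),
    latent_mode=False,
)
```

With latent_mode=True the decode, upscale_image, and encode stages are skipped, which is why the latent path is cheaper.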

Refiner

The SDXL refiner is implemented as an "alwayson script" in modules/processing_scripts/refiner.py. It registers before_sampling and process_before_every_sampling hooks; when the configured refiner_switch_at is reached, the script swaps shared.sd_model to the refiner checkpoint and lets the rest of the loop continue. This can be expensive in memory; the --medvram-sdxl flag exists specifically for this case.
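The switch point can be pictured as a fraction of the total step count. The sketch below assumes refiner_switch_at is a value strictly between 0 and 1 and that the threshold is truncated to a whole step; both the rounding rule and the model_for_step helper are assumptions for illustration, not the WebUI's code.

```python
def model_for_step(step, total_steps, switch_at, base="base", refiner="refiner"):
    """Pick the checkpoint name for a 0-based step, given a switch fraction."""
    if 0 < switch_at < 1 and step >= int(total_steps * switch_at):
        return refiner
    return base


# With 20 steps and a switch fraction of 0.5, the second half uses the refiner.
schedule = [model_for_step(step, 20, 0.5) for step in range(20)]
```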

Prompt parsing and AND-composable diffusion

The Lark grammar in modules/prompt_parser.py handles three syntaxes:

  • Attention: (word) → 1.1×, [word] → 1/1.1×, (word:1.5) → explicit weight. Implemented as token-weight pairs that the CLIP hijack uses to scale embeddings.
  • Prompt scheduling: [from:to:0.5] switches from the prompt "from" to the prompt "to" halfway through sampling. Implemented as a list of (end_step, prompt) pairs; setup_conds re-encodes when steps cross a boundary.
  • AND-composable diffusion: prompt1 AND prompt2 :1.2 runs the model on both prompts and blends the predicted noise. Implemented in sd_samplers_cfg_denoiser.py.
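The first two syntaxes reduce to small calculations. The sketch below uses regular expressions and arithmetic rather than the Lark grammar, handles only a single bracketed token, and treats a scheduling value below 1 as a fraction of the step count; attention_multiplier and schedule are hypothetical helpers.

```python
import re


def attention_multiplier(token):
    """Weight for one bracketed token: (w), ((w)), [w], or (w:1.5)."""
    explicit = re.fullmatch(r"\(([^():]+):([0-9.]+)\)", token)
    if explicit:
        return float(explicit.group(2))           # (word:1.5) -> 1.5
    round_depth = len(token) - len(token.lstrip("("))
    square_depth = len(token) - len(token.lstrip("["))
    return 1.1 ** round_depth / 1.1 ** square_depth


def schedule(before, after, when, steps):
    """[before:after:when] as a list of (end_step, prompt) pairs."""
    switch = int(when * steps) if when < 1 else int(when)
    switch = max(0, min(switch, steps))            # keep the boundary in range
    return [(switch, before), (steps, after)]


pairs = schedule("from", "to", 0.5, steps=20)
weights = [attention_multiplier(t) for t in ("(word)", "[word]", "(word:1.5)")]
```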

These are independent and can be combined. Negative prompts share the same syntax.
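For AND-composable diffusion, one common way to blend the predictions is to add each prompt's weighted offset from the unconditional prediction, scaled by the CFG scale. The sketch below shows that formulation on plain Python lists; whether it matches sd_samplers_cfg_denoiser.py term for term is an assumption, and combine_denoised here is an illustrative function, not the module's.

```python
def combine_denoised(uncond, conds, weights, cfg_scale):
    """Return uncond + cfg_scale * sum_i weight_i * (cond_i - uncond)."""
    out = list(uncond)
    for cond, weight in zip(conds, weights):
        for i, (c, u) in enumerate(zip(cond, uncond)):
            out[i] += (c - u) * weight * cfg_scale
    return out


# Two prompts joined by AND, the second with half the weight of the first.
blended = combine_denoised(
    uncond=[0.0, 0.0],
    conds=[[1.0, 0.0], [0.0, 1.0]],
    weights=[1.0, 0.5],
    cfg_scale=2.0,
)
```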

Integration points

  • Scripts hook nearly every step via the ScriptRunner — see scripts-and-extensions.md for the full list.
  • cfg_denoiser and cfg_denoised callbacks let extensions modify the noise prediction at every step — see script-callbacks.md.
  • extra_noise callback lets extensions inject custom noise before sampling.
  • process_before_every_sampling (added v1.10) lets refiner-style switches run on each sub-call (hires fix counts as a separate sampling).
  • image_saved and before_image_saved callbacks let extensions write sidecar files or change the PNG-info dict.
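The callback pattern behind these integration points can be sketched as a small registry. The on and fire helpers and the parameter dictionary are hypothetical; the WebUI exposes its own registration functions, documented in script-callbacks.md.

```python
from collections import defaultdict

# Hypothetical registry: event name -> list of callables.
_callbacks = defaultdict(list)


def on(event, fn):
    """Register fn for an event such as "cfg_denoiser" or "image_saved"."""
    _callbacks[event].append(fn)


def fire(event, params):
    """Call each registered callback in order; callbacks mutate params."""
    for fn in _callbacks[event]:
        fn(params)
    return params


def halve_noise(params):
    """Example extension callback: scale the tensor stand-in by 0.5."""
    params["x"] = [value * 0.5 for value in params["x"]]


on("cfg_denoiser", halve_noise)
result = fire("cfg_denoiser", {"x": [2.0, 4.0], "step": 0})
```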

Entry points for modification

  • A new generation parameter — add a field to StableDiffusionProcessing, append a (component, key) to the relevant tab's paste_fields, and update create_infotext() to include it. The API will pick it up automatically because modules/api/models.py is generated from StableDiffusionProcessing introspection.
  • A new sampler — see samplers-and-schedulers.md.
  • A new pipeline stage (e.g., a new mid-sample transformation): register a script callback (cfg_denoiser, process_before_every_sampling, etc.) rather than editing process_images_inner. The built-in refiner already uses these hooks, so this is an established path.
  • Image saving — extend images.save_image() (modules/images.py). Filename patterns are documented in images.FilenameGenerator.
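The idea that the API picks up a new parameter through introspection can be illustrated with the standard dataclasses module. MiniParams, my_new_param, and api_fields are hypothetical, and the mechanism in modules/api/models.py may differ in detail; this only shows how adding one field is enough for a derived schema to include it.

```python
from dataclasses import dataclass, fields


@dataclass
class MiniParams:
    """Hypothetical reduced parameter class."""
    prompt: str = ""
    steps: int = 20
    my_new_param: float = 0.0    # hypothetical new generation parameter


def api_fields(cls):
    """Derive {name: (type, default)} from the dataclass definition."""
    return {f.name: (f.type, f.default) for f in fields(cls)}


schema = api_fields(MiniParams)
```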

Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.
