AUTOMATIC1111/stable-diffusion-webui
Text-to-image (txt2img)
Active contributors: AUTOMATIC1111, w-e-w, catboxanon, missionfloyd
Purpose
The flagship workflow: user supplies a prompt, model generates an image from pure noise. txt2img is the simplest path through the pipeline; everything else is a variation on top.
Where the code lives
| Concern | File |
|---|---|
| The Generate button handler | modules/txt2img.py |
| The tab UI construction | modules/ui.py (the with gr.Tab("txt2img") block) |
| The processing dataclass | StableDiffusionProcessingTxt2Img in modules/processing.py |
| The API route | text2imgapi in modules/api/api.py |
| Pydantic request/response | StableDiffusionTxt2ImgProcessingAPI, TextToImageResponse in modules/api/models.py |
Request lifecycle
sequenceDiagram
participant User
participant Tab as txt2img tab
participant Handler as txt2img.txt2img()
participant Lock as queue_lock
participant P as StableDiffusionProcessingTxt2Img
participant Pipe as process_images
participant Save as images.save_image
User->>Tab: click Generate
Tab->>Handler: wrap_gradio_gpu_call wrapper
Handler->>Lock: acquire
Handler->>P: build with all controls
Handler->>Pipe: process_images(p)
Pipe->>Save: per image
Save-->>Pipe: file path
Pipe-->>Handler: Processed
Handler->>Lock: release
Handler-->>Tab: gallery + infotexts + JSON

modules/txt2img.py is small (~120 lines). It receives the long *args tuple from Gradio, slices off the script-args portion, builds a StableDiffusionProcessingTxt2Img and calls process_images(p).
The API endpoint does not go through this handler. Instead, text2imgapi constructs the same dataclass directly from the Pydantic request body and passes it to the same process_images(p) call.
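The handler pattern described above can be sketched as follows. This is a minimal illustration, not the code in modules/txt2img.py: Txt2ImgParams, Processed, N_FIXED and the stub process_images are hypothetical stand-ins, and the real handler takes many more controls than the three used here.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Txt2ImgParams:
    """Hypothetical stand-in for StableDiffusionProcessingTxt2Img."""
    prompt: str
    negative_prompt: str
    steps: int
    script_args: tuple = ()


@dataclass
class Processed:
    """Hypothetical stand-in for the result returned by process_images."""
    images: list = field(default_factory=list)
    infotexts: list = field(default_factory=list)


def process_images(p: Txt2ImgParams) -> Processed:
    # Stub pipeline: records what would have been generated.
    return Processed(
        images=[f"<image for {p.prompt!r}>"],
        infotexts=[f"{p.prompt}\nSteps: {p.steps}"],
    )


# Number of leading positional controls in this sketch (an assumption).
N_FIXED = 3


def txt2img(*args: Any):
    # Gradio passes every control value positionally; the tail belongs to scripts.
    prompt, negative_prompt, steps = args[:N_FIXED]
    script_args = tuple(args[N_FIXED:])
    p = Txt2ImgParams(prompt, negative_prompt, int(steps), script_args)
    processed = process_images(p)
    return processed.images, processed.infotexts
```

Calling txt2img("a cat", "", 20, "script value", 1) in this sketch treats the last two values as script arguments and returns the gallery contents together with the infotexts.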
What txt2img-specific fields exist
StableDiffusionProcessingTxt2Img (in modules/processing.py around line 1166) adds these fields on top of the base StableDiffusionProcessing:
- enable_hr, denoising_strength, hr_scale, hr_upscaler, hr_second_pass_steps, hr_resize_x, hr_resize_y, hr_sampler_name, hr_scheduler, hr_prompt, hr_negative_prompt, hr_cfg, hr_distilled_cfg: the hires-fix knobs. See hires-fix.md.
- firstphase_width / firstphase_height: legacy aliases for the low-res pass dimensions.
- The sample() method override, which runs the sampler once at low res and (if enabled) again at high res.
The base class fields (prompt, sampler_name, steps, cfg_scale, seed, batch_size, n_iter, width, height, scripts, …) are documented inline in processing.py.
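The two-pass behaviour of sample() can be pictured with a small sketch. HiresSketch and passes() are hypothetical names, and the sketch assumes that hr_scale simply multiplies the low-res dimensions; it ignores hr_resize_x / hr_resize_y and everything else the real method does.

```python
from dataclasses import dataclass


@dataclass
class HiresSketch:
    """Hypothetical, heavily reduced model of the txt2img processing fields."""
    width: int = 512
    height: int = 512
    enable_hr: bool = False
    hr_scale: float = 2.0

    def passes(self):
        """Return the (width, height) of each sampler pass, in order."""
        sizes = [(self.width, self.height)]  # the low-res pass always runs
        if self.enable_hr:
            # Assumption: the second pass scales both dimensions by hr_scale.
            sizes.append((int(self.width * self.hr_scale),
                          int(self.height * self.hr_scale)))
        return sizes
```

With the defaults, passes() yields a single 512x512 pass; setting enable_hr=True adds a second pass at the scaled size.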
What the user sees
The txt2img tab layout is built in modules/ui.py:
| Section | Source |
|---|---|
| Prompt + negative prompt + Generate button | Toprow from modules/ui_toprow.py |
| Sampler + scheduler dropdowns | rendered by the sampler alwayson script in modules/processing_scripts/sampler.py |
| Steps, CFG scale, width, height, batch | direct Gradio sliders in ui.py |
| Seed and subseed controls | rendered by processing_scripts/seed.py |
| Refiner accordion | processing_scripts/refiner.py |
| Hires fix accordion | inline in ui.py, controls listed above |
| Script dropdown | the ScriptRunner for txt2img |
| Always-on script panels | each script's ui() is rendered below |
| Output gallery + Save / Send to img2img / etc. | create_output_panel in modules/ui_common.py |
Infotext round-trip
When the user pastes generation parameters (a string like prompt\nNegative prompt: ...\nSteps: 20, Sampler: Euler a, ...), modules/infotext_utils.py parses each Key: value pair and assigns it to the matching (component, key) pair from the txt2img tab's paste_fields. The paste_fields list is registered alongside the controls in ui.py. Extensions can append to this list to participate in the round-trip — most do via Script.infotext_fields = [(component, "My Setting"), ...].
script_callbacks.on_infotext_pasted(callback) is the catch-all hook for "infotext arrived; do something with it before the components are filled".
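To make the round-trip concrete, here is a simplified parser for the infotext format shown above, followed by the (component, key) lookup. It is a sketch only and not the parser in modules/infotext_utils.py: parse_infotext, PARAM_RE and the component names in paste_fields are hypothetical, and edge cases the real parser handles are ignored.

```python
import re

# Matches `Key: value` pairs separated by commas; a value may be double-quoted.
PARAM_RE = re.compile(r'\s*([\w ]+):\s*("(?:\\.|[^"\\])*"|[^,]+)(?:,|$)')


def parse_infotext(text):
    """Split an infotext string into prompt, negative prompt, and settings."""
    lines = text.strip().split("\n")
    result = {}

    # The last line holds the comma-separated "Key: value" settings.
    last = lines.pop()
    for key, value in PARAM_RE.findall(last):
        value = value.strip()
        if value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        result[key.strip()] = value

    # Everything before it is the prompt, then the negative prompt.
    prompt, negative = [], []
    in_negative = False
    for line in lines:
        if line.startswith("Negative prompt:"):
            in_negative = True
            line = line[len("Negative prompt:"):]
        (negative if in_negative else prompt).append(line.strip())

    result["Prompt"] = "\n".join(prompt)
    result["Negative prompt"] = "\n".join(negative)
    return result


# Hypothetical paste_fields: component names are plain strings in this sketch.
paste_fields = [("steps_slider", "Steps"), ("sampler_dropdown", "Sampler")]

params = parse_infotext(
    "a cat\nNegative prompt: blurry\nSteps: 20, Sampler: Euler a, CFG scale: 7"
)
updates = {component: params.get(key) for component, key in paste_fields}
```

Here updates maps each registered component to the value found under its key; a key missing from the pasted text maps to None, which a real UI would treat as "leave the control unchanged".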
Special operations
- Skip / Interrupt: the toprow buttons set shared.state.skipped / shared.state.interrupted. The sampler checks these every step and exits early.
- Live preview: every N steps, the latents are decoded with the approximate VAE (modules/sd_vae_approx.py) or TAESD (modules/sd_vae_taesd.py) and pushed to the UI through progress.py.
- Extra noise: opts.eta_noise_seed_delta and the corresponding extra_noise callback let extensions inject custom noise (used by some adversarial tools).
- Generate forever: a UI-only loop driven by JavaScript that re-clicks Generate until the user clicks Cancel.
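The Skip / Interrupt and live-preview behaviour amounts to a sampling loop that polls shared flags and emits a preview every N steps. The sketch below illustrates that control flow only; State, run_sampler, preview_every and on_step are hypothetical names, and no latents are actually decoded.

```python
from dataclasses import dataclass


@dataclass
class State:
    """Hypothetical stand-in for shared.state."""
    skipped: bool = False
    interrupted: bool = False


def run_sampler(state, steps, preview_every=0, on_step=None):
    """Run up to `steps` iterations.

    Returns the number of completed steps and the step counts at which a
    preview would have been produced.
    """
    previews = []
    done = 0
    for i in range(steps):
        # Checked before every step, so a button press ends the loop early.
        if state.skipped or state.interrupted:
            break
        if on_step is not None:
            on_step(i, state)  # e.g. a UI event arriving during this step
        done += 1
        if preview_every and done % preview_every == 0:
            previews.append(done)  # the real code would decode latents here
    return done, previews


def interrupt_during_fifth_step(i, state):
    # Simulates the user clicking Interrupt while step index 4 is running.
    if i == 4:
        state.interrupted = True


completed, previews = run_sampler(
    State(), steps=20, preview_every=2, on_step=interrupt_during_fifth_step
)
```

In this run the flag is set during the fifth step, that step still finishes, and the loop exits before the sixth.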
API parity
POST /sdapi/v1/txt2img accepts the same parameters as the UI plus a few extras (script_args, script_name, alwayson_scripts). The default response includes the generated images base64-encoded plus a stringified parameters for round-tripping. The API and UI feed into the exact same process_images(p) call, so behaviour is identical.
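A request to this endpoint can be built with the standard library alone. The sketch below assumes a local server at http://127.0.0.1:7860 with the API enabled and assumes the response stores the base64 strings under an "images" key; build_txt2img_request and decode_images are hypothetical helper names, and the default values are arbitrary.

```python
import base64
import json
import urllib.request


def build_txt2img_request(base_url, prompt, **overrides):
    """Build (but do not send) a POST request for /sdapi/v1/txt2img."""
    payload = {"prompt": prompt, "steps": 20, "width": 512, "height": 512}
    payload.update(overrides)  # e.g. seed=1234 or cfg_scale=7
    return urllib.request.Request(
        f"{base_url}/sdapi/v1/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decode_images(response_body):
    """Decode the base64-encoded images from a parsed JSON response."""
    # Assumption: images are listed under the "images" key.
    return [base64.b64decode(s) for s in response_body.get("images", [])]


request = build_txt2img_request("http://127.0.0.1:7860", "a cat", steps=30)
```

Sending the request is then a matter of passing it to urllib.request.urlopen, parsing the body with json.load, and handing the result to decode_images.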
The Pydantic models in modules/api/models.py are partly auto-derived from StableDiffusionProcessingTxt2Img field annotations — adding a new field to the dataclass usually exposes it through the API automatically.
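The idea of deriving API fields from dataclass annotations can be illustrated with the standard library. This is not the generator used in modules/api/models.py, which builds Pydantic models; ProcessingSketch, Extended, my_new_field and derive_api_fields are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, fields


@dataclass
class ProcessingSketch:
    """Hypothetical, reduced processing dataclass."""
    prompt: str = ""
    steps: int = 20
    enable_hr: bool = False


def derive_api_fields(cls):
    """Map each dataclass field name to its (type, default) pair."""
    return {f.name: (f.type, f.default) for f in fields(cls)}


@dataclass
class Extended(ProcessingSketch):
    # A newly added field shows up in the derived mapping with no extra work.
    my_new_field: float = 0.5


api_fields = derive_api_fields(Extended)
```

Because the mapping is computed from the class itself, my_new_field appears in api_fields without any change to the derivation code, which mirrors the "usually exposed automatically" behaviour described above.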
Entry points for modification
- Tweak the default sampler/steps: Settings → Sampler parameters / Defaults. Or set them in webui-user.sh via the default_* script callbacks.
- Add a control to the tab: register an alwayson script (see scripts-and-extensions.md) rather than editing ui.py.
- Validate the input differently: extend the setup hook on a script.
- Change the underlying loop: the loop is in process_images_inner (see systems/processing.md). Most "I want to do X every step" features should use cfg_denoiser callbacks instead.
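As a rough illustration of the alwayson-script route for adding a control, the sketch below defines a script with one slider. ExampleControl and the "Example strength" label are hypothetical, and the fallback stub exists only so the file can run outside the webui process; inside the webui, the script would subclass the real scripts.Script.

```python
from types import SimpleNamespace

try:
    from modules import scripts  # available only inside the webui process
except ImportError:
    # Fallback stubs so this sketch can be run on its own (assumption).
    class _Script:
        pass

    scripts = SimpleNamespace(Script=_Script, AlwaysVisible=object())


class ExampleControl(scripts.Script):
    def title(self):
        return "Example control"

    def show(self, is_img2img):
        # Returning AlwaysVisible makes this an alwayson script.
        return scripts.AlwaysVisible

    def ui(self, is_img2img):
        import gradio as gr  # imported lazily; only needed when the UI is built

        strength = gr.Slider(minimum=0.0, maximum=1.0, value=0.5,
                             label="Example strength")
        return [strength]

    def process(self, p, strength):
        # Record the value so it appears in the generated infotext.
        p.extra_generation_params["Example strength"] = strength
```

The values of the components returned from ui() are passed back to process() as extra positional arguments, which is how the slider value reaches the processing object in this sketch.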
Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.