AUTOMATIC1111/stable-diffusion-webui

Text-to-image (txt2img)

Active contributors: AUTOMATIC1111, w-e-w, catboxanon, missionfloyd

Purpose

The flagship workflow: the user supplies a prompt and the model generates an image starting from pure noise. txt2img is the simplest path through the pipeline; the other workflows are variations on it.

Where the code lives

Concern	File
The Generate button handler	`modules/txt2img.py`
The tab UI construction	`modules/ui.py` (the `with gr.Tab("txt2img")` block)
The processing dataclass	`StableDiffusionProcessingTxt2Img` in `modules/processing.py`
The API route	`text2imgapi` in `modules/api/api.py`
Pydantic request/response	`StableDiffusionTxt2ImgProcessingAPI`, `TextToImageResponse` in `modules/api/models.py`

Request lifecycle

sequenceDiagram
    participant User
    participant Tab as txt2img tab
    participant Handler as txt2img.txt2img()
    participant Lock as queue_lock
    participant P as StableDiffusionProcessingTxt2Img
    participant Pipe as process_images
    participant Save as images.save_image

    User->>Tab: click Generate
    Tab->>Handler: wrap_gradio_gpu_call wrapper
    Handler->>Lock: acquire
    Handler->>P: build with all controls
    Handler->>Pipe: process_images(p)
    Pipe->>Save: per image
    Save-->>Pipe: file path
    Pipe-->>Handler: Processed
    Handler->>Lock: release
    Handler-->>Tab: gallery + infotexts + JSON

modules/txt2img.py is small (~120 lines). It receives the long *args tuple from Gradio, slices off the script-args portion, builds a StableDiffusionProcessingTxt2Img and calls process_images(p).

The API endpoint does not call this handler. Instead, text2imgapi constructs the same dataclass directly from the Pydantic request body, so the Generate button and the API converge on the same process_images(p) call.
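The argument slicing described above can be sketched as follows. This is a minimal illustration, not the actual handler: `Txt2ImgJob`, `N_BUILTIN` and `build_job` are hypothetical names, and the real function takes many more positional controls than the five shown here.

```python
from dataclasses import dataclass, field


@dataclass
class Txt2ImgJob:
    """Hypothetical stand-in for StableDiffusionProcessingTxt2Img."""
    prompt: str
    negative_prompt: str
    steps: int
    width: int
    height: int
    script_args: tuple = field(default_factory=tuple)


# Number of leading values that belong to built-in controls in this sketch
# (an assumption for illustration; the real handler has many more).
N_BUILTIN = 5


def build_job(*args):
    """Split a flat Gradio-style argument tuple into fields and script args."""
    builtin, script_args = args[:N_BUILTIN], args[N_BUILTIN:]
    prompt, negative_prompt, steps, width, height = builtin
    return Txt2ImgJob(prompt, negative_prompt, steps, width, height, script_args)


job = build_job("a lighthouse at dusk", "blurry", 20, 512, 512, 0, True, "extra")
print(job.script_args)
```

Everything after the built-in controls is passed through untouched, which is what lets scripts add their own inputs without the handler knowing about them.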

What txt2img-specific fields exist

StableDiffusionProcessingTxt2Img (in modules/processing.py around line 1166) adds these fields on top of the base StableDiffusionProcessing:

enable_hr, denoising_strength, hr_scale, hr_upscaler, hr_second_pass_steps, hr_resize_x, hr_resize_y, hr_sampler_name, hr_scheduler, hr_prompt, hr_negative_prompt, hr_cfg, hr_distilled_cfg — the hires-fix knobs. See hires-fix.md.
firstphase_width / firstphase_height — legacy aliases for the low-res pass dimensions.
The sample() method override, which runs the sampler once at low res and (if enabled) again at high res.

The base class fields (prompt, sampler_name, steps, cfg_scale, seed, batch_size, n_iter, width, height, scripts, …) are documented inline in processing.py.
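The two-pass behaviour of the sample() override can be pictured with a small planning function. This is a sketch under simplifying assumptions: `plan_passes` is a hypothetical helper, and the high-res size is derived from hr_scale alone, ignoring hr_resize_x / hr_resize_y and any cropping the real code may perform.

```python
def plan_passes(width, height, enable_hr=False, hr_scale=2.0):
    """Return the (width, height) of each sampler pass, in order."""
    passes = [(width, height)]  # the low-res pass always runs
    if enable_hr:
        # Second pass at the upscaled size (assumption: scale factor only).
        passes.append((int(width * hr_scale), int(height * hr_scale)))
    return passes


print(plan_passes(512, 768))
print(plan_passes(512, 768, enable_hr=True, hr_scale=1.5))
```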

What the user sees

The txt2img tab layout is built in modules/ui.py:

Section	Source
Prompt + negative prompt + Generate button	`Toprow` from `modules/ui_toprow.py`
Sampler + scheduler dropdowns	rendered by the `sampler` alwayson script in `modules/processing_scripts/sampler.py`
Steps, CFG scale, width, height, batch	direct Gradio sliders in `ui.py`
Seed and subseed controls	rendered by `processing_scripts/seed.py`
Refiner accordion	`processing_scripts/refiner.py`
Hires fix accordion	inline in `ui.py`, controls listed above
Script dropdown	the `ScriptRunner` for txt2img
Always-on script panels	each script's `ui()` is rendered below
Output gallery + Save / Send to img2img / etc.	`create_output_panel` in `modules/ui_common.py`

Infotext round-trip

When the user pastes generation parameters (a string like prompt\nNegative prompt: ...\nSteps: 20, Sampler: Euler a, ...), modules/infotext_utils.py parses each Key: value pair, looks the key up among the (component, key) pairs in the txt2img tab's paste_fields, and writes the value into the matching component. The paste_fields list is registered alongside the controls in ui.py. Extensions can append to this list to participate in the round-trip; most do so via Script.infotext_fields = [(component, "My Setting"), ...].

script_callbacks.on_infotext_pasted(callback) is the catch-all hook for "infotext arrived; do something with it before the components are filled".
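A simplified version of the parsing step might look like the sketch below. It is not the code in infotext_utils.py: `parse_infotext`, the regular expression, and the string component names in `paste_fields` are all illustrative assumptions, and real infotext has more edge cases than this handles.

```python
import re

# A key made of word characters and spaces, followed by either a quoted
# value or everything up to the next comma.
PAIR_RE = re.compile(r'\s*([\w ]+):\s*("(?:\\.|[^\\"])+"|[^,]*)(?:,|$)')


def parse_infotext(text):
    """Split infotext into prompt, negative prompt and Key: value settings."""
    *head, last = text.strip().split("\n")
    prompt_lines, negative_lines = [], []
    in_negative = False
    for line in head:
        if line.startswith("Negative prompt:"):
            in_negative = True
            line = line[len("Negative prompt:"):].strip()
        (negative_lines if in_negative else prompt_lines).append(line)
    params = {
        "Prompt": "\n".join(prompt_lines),
        "Negative prompt": "\n".join(negative_lines),
    }
    # The last line carries the comma-separated settings.
    for key, value in PAIR_RE.findall(last):
        params[key.strip()] = value.strip().strip('"')
    return params


params = parse_infotext(
    "a cat\nNegative prompt: blurry\nSteps: 20, Sampler: Euler a, CFG scale: 7"
)

# Hypothetical paste_fields: component stand-ins (plain strings) paired with keys.
paste_fields = [("steps_slider", "Steps"), ("sampler_dropdown", "Sampler")]
updates = {comp: params[key] for comp, key in paste_fields if key in params}
print(updates)
```

Values come out as strings here; converting them to the type each component expects is left out of the sketch.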

Special operations

Skip / Interrupt: the toprow buttons set shared.state.skipped / shared.state.interrupted. The sampler checks these every step and exits early.
Live preview: every N steps, the latents are decoded with the approximate VAE (modules/sd_vae_approx.py) or TAESD (modules/sd_vae_taesd.py) and pushed to the UI through progress.py.
Extra noise: the extra_noise callback (registered with script_callbacks.on_extra_noise) lets extensions inject custom noise (used by some adversarial tools). The related setting is opts.img2img_extra_noise; opts.eta_noise_seed_delta is a separate option that offsets the seed used for the sampler's eta noise.
Generate forever: a UI-only loop driven by JavaScript that re-clicks Generate until the user clicks Cancel.
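The Skip / Interrupt mechanism amounts to a cooperative cancellation flag that the loop polls. The sketch below shows the pattern in isolation; `State` and `run_sampler` are hypothetical stand-ins, and the real sampler may signal early exit differently (for example by raising an exception) rather than with a plain `break`.

```python
from dataclasses import dataclass


@dataclass
class State:
    """Hypothetical stand-in for shared.state; only the two flags used here."""
    skipped: bool = False
    interrupted: bool = False


def run_sampler(state, steps, on_step=None):
    """Run up to `steps` iterations, stopping early once a flag is set."""
    completed = 0
    for i in range(steps):
        if state.skipped or state.interrupted:
            break
        completed += 1  # stand-in for one denoising step
        if on_step is not None:
            on_step(i, state)
    return completed


def press_interrupt_after_fifth_step(i, state):
    # Simulates the user clicking Interrupt while step index 4 is running.
    if i == 4:
        state.interrupted = True


done = run_sampler(State(), 20, press_interrupt_after_fifth_step)
print(done)
```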

API parity

POST /sdapi/v1/txt2img accepts the same parameters as the UI plus a few extras (script_args, script_name, alwayson_scripts). The default response includes the generated images base64-encoded, the parameters that were used, and a JSON-stringified info field for round-tripping. The API and UI feed into the same process_images(p) call, so behaviour should match for the same inputs.
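A client for this endpoint could be sketched with the standard library alone. The helper names (`build_txt2img_payload`, `post_txt2img`, `decode_images`) are hypothetical, the base URL is whatever address the server listens on (commonly http://127.0.0.1:7860 on a local install, which is an assumption here), and the API has to be enabled when the server is launched.

```python
import base64
import json
import urllib.request


def build_txt2img_payload(prompt, steps=20, width=512, height=512, **extra):
    """Serialise a request body; `extra` can carry fields such as alwayson_scripts."""
    payload = {"prompt": prompt, "steps": steps, "width": width, "height": height}
    payload.update(extra)
    return json.dumps(payload).encode("utf-8")


def post_txt2img(base_url, body):
    """Send the request and return the raw response bytes (needs a running server)."""
    req = urllib.request.Request(
        f"{base_url}/sdapi/v1/txt2img",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()


def decode_images(response_body):
    """Turn the base64 strings in the `images` list into raw image bytes."""
    data = json.loads(response_body)
    return [base64.b64decode(img) for img in data.get("images", [])]


# Offline demonstration with a fabricated response body, not real server output.
fake_response = json.dumps({"images": [base64.b64encode(b"not-a-real-png").decode()]})
print(len(decode_images(fake_response)))
```

With a server running, `decode_images(post_txt2img(base_url, build_txt2img_payload("a cat")))` would yield the image bytes, ready to be written to disk.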

The Pydantic models in modules/api/models.py are partly auto-derived from StableDiffusionProcessingTxt2Img field annotations — adding a new field to the dataclass usually exposes it through the API automatically.
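The auto-derivation idea can be illustrated by reading a constructor signature with `inspect`. This is a sketch of the general technique rather than the generator in modules/api/models.py: `Txt2ImgLike` and `derive_fields` are hypothetical, and the real class has far more parameters.

```python
import inspect


class Txt2ImgLike:
    """Hypothetical stand-in for the processing class."""

    def __init__(self, prompt: str = "", steps: int = 50,
                 enable_hr: bool = False, hr_scale: float = 2.0):
        self.prompt = prompt
        self.steps = steps
        self.enable_hr = enable_hr
        self.hr_scale = hr_scale


def derive_fields(cls):
    """Map each constructor parameter to an (annotation, default) pair."""
    fields = {}
    for name, param in inspect.signature(cls.__init__).parameters.items():
        if name == "self":
            continue
        fields[name] = (param.annotation, param.default)
    return fields


fields = derive_fields(Txt2ImgLike)
print(fields["steps"])
```

A mapping in this shape matches what pydantic's `create_model` accepts as keyword arguments, which is one plausible way a new parameter could surface in the request schema without further changes.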

Entry points for modification

Tweak the default sampler/steps: Settings → Sampler parameters / Defaults, or edit the corresponding entries in ui-config.json. (webui-user.sh only sets launch options and does not define UI defaults.)
Add a control to the tab: register an alwayson script (see scripts-and-extensions.md) rather than editing ui.py.
Validate the input differently: implement the setup hook on a script, which runs when the processing object is set up and before processing starts.
Change the underlying loop: the loop is in process_images_inner (see systems/processing.md). Most "I want to do X every step" features should use cfg_denoiser callbacks instead.

Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.