AUTOMATIC1111/stable-diffusion-webui

Hires fix and upscaling workflow

Active contributors: AUTOMATIC1111, Kohaku-Blueleaf, w-e-w

Purpose

Hires fix is a two-pass txt2img workflow: generate at a smaller "native" resolution, then upscale and run a second sampling pass to add detail. It is the primary trick for getting good 2K+ images out of a 512×512-trained model.

The Extras tab and the /sdapi/v1/extra-single-image and /sdapi/v1/extra-batch-images endpoints provide the standalone upscale workflow: pure post-processing on existing images. They share the upscaler implementations with hires fix but otherwise have no overlap.

Two passes

graph LR
    Prompt --> Pass1[Sampler at native res<br/>e.g. 832×512]
    Pass1 -->|latent or pixel upscale| Up[Upscale to target res<br/>e.g. 1664×1024]
    Up --> Encode["VAE encode<br/>(pixel upscaler only)"]
    Encode --> Pass2[Sampler at target res<br/>denoising_strength steps]
    Pass2 --> Decode[VAE decode]
    Decode --> Image

The implementation is in StableDiffusionProcessingTxt2Img.sample() (modules/processing.py). Latent upscale skips the VAE round-trip; pixel upscale (ESRGAN, SwinIR, etc.) goes through it.

Hires-fix parameters

Field	Meaning
`enable_hr`	Master toggle.
`hr_scale`	Multiplier on width/height for the second pass (default 2.0).
`hr_resize_x`, `hr_resize_y`	Absolute target dimensions; if non-zero, override `hr_scale`.
`hr_upscaler`	Which upscaler to use between passes. Entries whose name starts with "Latent" interpolate the latents directly (plain "Latent" is bilinear); every other entry upscales in pixel space and needs a VAE round-trip.
`hr_second_pass_steps`	Steps for the second sampling pass; default 0 means "use main steps".
`denoising_strength`	How much the second pass changes the image. Lower = more faithful to first pass; higher = more detail / more drift. 0.4–0.6 is the typical range.
`hr_sampler_name`, `hr_scheduler`	Override the sampler/scheduler for pass two; default "Use same sampler/scheduler".
`hr_prompt`, `hr_negative_prompt`	Optional override prompts for pass two.
`hr_cfg`, `hr_distilled_cfg`	CFG scale override for pass two (defaults to the main `cfg_scale`).
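
How `hr_scale` and `hr_resize_x` / `hr_resize_y` interact can be illustrated with a small sketch. This is a simplified model, not the code in modules/processing.py: the function name `hires_target_size` is hypothetical, the aspect-ratio handling when only one side is given is an assumption, and any cropping the real implementation performs when both sides are set is ignored.

```python
def hires_target_size(width, height, hr_scale=2.0, hr_resize_x=0, hr_resize_y=0):
    """Return the (width, height) the second pass would run at (simplified)."""
    if hr_resize_x == 0 and hr_resize_y == 0:
        # No absolute size given: scale both sides by hr_scale.
        return int(width * hr_scale), int(height * hr_scale)
    if hr_resize_y == 0:
        # Assumption: only the width is fixed, so keep the first-pass aspect ratio.
        return hr_resize_x, hr_resize_x * height // width
    if hr_resize_x == 0:
        # Assumption: only the height is fixed, so keep the first-pass aspect ratio.
        return hr_resize_y * width // height, hr_resize_y
    # Both sides given: absolute dimensions override hr_scale entirely.
    return hr_resize_x, hr_resize_y


print(hires_target_size(832, 512))                    # (1664, 1024)
print(hires_target_size(832, 512, hr_resize_x=1248))  # (1248, 768)
```

The first call reproduces the 832×512 to 1664×1024 example from the diagram.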

In the API, the hires-fix parameters listed in the table are top-level fields on StableDiffusionTxt2ImgProcessingAPI.
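
A minimal sketch of a txt2img request that enables hires fix might look like the following. The hires field names come from the table; the base URL, the `/sdapi/v1/txt2img` path, the `prompt` / `width` / `height` / `steps` fields, and the helper names are assumptions about a local instance started with `--api`, so check them against your own setup.

```python
import json
import urllib.request


def build_hires_payload(prompt, width=832, height=512, steps=20):
    """Build a txt2img request body with hires fix switched on."""
    return {
        "prompt": prompt,
        "width": width,              # first-pass ("native") resolution
        "height": height,
        "steps": steps,
        "enable_hr": True,           # master toggle
        "hr_scale": 2.0,             # second pass runs at 2x width/height
        "hr_upscaler": "Latent",     # latent upscale, no VAE round-trip
        "hr_second_pass_steps": 0,   # 0 means "use main steps"
        "denoising_strength": 0.5,
    }


def post_txt2img(payload, base_url="http://127.0.0.1:7860"):
    """Send the payload to a running webui (assumed URL) and return the JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/sdapi/v1/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


payload = build_hires_payload("a lighthouse at dusk")
print(json.dumps(payload, indent=2))
```

Only `build_hires_payload` runs here; `post_txt2img` needs a live server and is left for the caller to invoke.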

Hooks

Two callbacks are specific to hires fix:

before_hr — Script.before_hr(p, *args) is called between the two passes. Used by extensions that want to alter p before the second sample (changing the prompt, switching ControlNet weights).
process_before_every_sampling — added in v1.10. Called each time a sampler is created, including pass two. The refiner uses this to swap models.
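
As a rough illustration of the two callbacks, the sketch below defines a script that records when each hook fires. The class name, the fallback base class, and the use of `p.extra_generation_params` are illustrative choices; the `process_before_every_sampling` signature is an assumption to verify against modules/scripts.py. Outside the webui process the `modules` import fails, so a stand-in base class is used to keep the example runnable.

```python
from types import SimpleNamespace

try:
    # Only importable inside a running webui process.
    from modules import scripts
    ScriptBase = scripts.Script
    ALWAYS_VISIBLE = scripts.AlwaysVisible
except ImportError:
    # Stand-ins so the sketch can be exercised on its own.
    ScriptBase = object
    ALWAYS_VISIBLE = True


class HiresHookLogger(ScriptBase):
    """Hypothetical always-on script that notes when the hires hooks run."""

    def title(self):
        return "Hires hook logger (example)"

    def show(self, is_img2img):
        return ALWAYS_VISIBLE

    def before_hr(self, p, *args):
        # Runs once, between pass one and pass two.
        p.extra_generation_params["Hires hook"] = "before_hr ran"

    def process_before_every_sampling(self, p, *args, **kwargs):
        # Runs each time a sampler is created, so twice with hires fix.
        self.sampler_setups = getattr(self, "sampler_setups", 0) + 1


if __name__ == "__main__":
    fake_p = SimpleNamespace(extra_generation_params={})
    hook = HiresHookLogger()
    hook.process_before_every_sampling(fake_p)  # pass one
    hook.before_hr(fake_p)                      # between passes
    hook.process_before_every_sampling(fake_p)  # pass two
    print(fake_p.extra_generation_params, hook.sampler_setups)
```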

Latent vs pixel upscalers

Upscaler entry	Behaviour
`Latent`	Bilinear interpolation of the latents. Cheapest but blurry. Increase `denoising_strength` to compensate.
`Latent (antialiased)`, `Latent (bicubic)`, `Latent (bicubic antialiased)`, `Latent (nearest)`, `Latent (nearest-exact)`	Variants on the latent interpolation.
`ESRGAN`, `RealESRGAN`, `SwinIR`, `ScuNET`, `LDSR`, `DAT`, `HAT`	Pixel-domain. Decode latents → upscale image → re-encode. Sharper but ~30–50% slower per pass.
`4x-UltraSharp`, etc.	These are individual ESRGAN model files; the entries are populated from `models/ESRGAN/`.

The list is generated from shared.latent_upscale_modes + [x.name for x in shared.sd_upscalers]. Extensions can register more by adding to shared.sd_upscalers (see systems/postprocessing.md).
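
The way the dropdown is assembled, and how an entry's name decides whether a VAE round-trip is needed, can be sketched with stand-in data. `FakeUpscaler`, the sample entries, and `needs_vae_round_trip` are hypothetical; the real `shared.latent_upscale_modes` and `shared.sd_upscalers` may be structured differently.

```python
from dataclasses import dataclass


@dataclass
class FakeUpscaler:
    """Stand-in for an entry in shared.sd_upscalers; only the name matters here."""
    name: str


# Stand-ins for shared.latent_upscale_modes and shared.sd_upscalers.
latent_upscale_modes = ["Latent", "Latent (antialiased)", "Latent (bicubic)"]
sd_upscalers = [FakeUpscaler("ESRGAN"), FakeUpscaler("SwinIR"), FakeUpscaler("4x-UltraSharp")]

# Same shape as the expression in the text: latent modes first, then model upscalers.
hr_upscaler_choices = latent_upscale_modes + [x.name for x in sd_upscalers]


def needs_vae_round_trip(upscaler_name):
    """Pixel-domain upscalers decode and re-encode; latent modes do not."""
    return upscaler_name not in latent_upscale_modes


for name in hr_upscaler_choices:
    print(f"{name}: VAE round-trip = {needs_vae_round_trip(name)}")
```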

VAE round-trip cost

For pixel upscalers, the second pass does:

1. VAE decode pass-one latents (~one second for SDXL).
2. Upscaler forward pass (varies by upscaler; LDSR is the slowest at tens of seconds).
3. VAE encode upscaled image.
4. Re-noise using denoising_strength.
5. Sampler at target resolution.

Step 5 is the dominant cost on most GPUs. --medvram will move the upscaler off-GPU between phases to keep VRAM available for the second sampler.
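
A back-of-the-envelope calculation shows why the second sampling pass tends to dominate. The sketch assumes the usual 8× spatial downscale and 4 latent channels of the SD 1.x / SDXL VAEs; other model families may differ, and the function names are illustrative.

```python
def latent_shape(width, height, channels=4, factor=8):
    """Latent tensor shape (C, H, W) for an image size, under the stated assumptions."""
    return (channels, height // factor, width // factor)


def latent_elements(width, height):
    """Number of values the sampler has to denoise at this resolution."""
    c, h, w = latent_shape(width, height)
    return c * h * w


first_pass = latent_elements(832, 512)     # 4 * 64 * 104
second_pass = latent_elements(1664, 1024)  # 4 * 128 * 208

print(latent_shape(832, 512), first_pass)     # (4, 64, 104) 26624
print(latent_shape(1664, 1024), second_pass)  # (4, 128, 208) 106496
print(second_pass / first_pass)               # 4.0
```

With `hr_scale` 2.0 each second-pass step processes four times as many latent values, and attention layers typically scale worse than linearly with that count.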

Standalone upscale (Extras tab)

The Extras tab is independent of hires fix. It loads an image (no diffusion model required), runs one or two upscalers in sequence, optionally chains face restoration and other postprocessors, and saves. See systems/postprocessing.md.

POST /sdapi/v1/extra-single-image and POST /sdapi/v1/extra-batch-images are the API equivalents. Their request shapes are ExtrasSingleImageRequest and ExtrasBatchImagesRequest in modules/api/models.py.
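
A request body for the single-image endpoint could be assembled roughly as follows. The field names `image`, `upscaling_resize`, `upscaler_1`, and `upscaler_2` are assumptions recalled from the API model and should be checked against ExtrasSingleImageRequest in modules/api/models.py; the helper name and the sample bytes are purely illustrative.

```python
import base64
import json


def build_extras_payload(image_bytes, upscaler, scale=2.0, second_upscaler="None"):
    """Build a body for POST /sdapi/v1/extra-single-image (field names assumed)."""
    return {
        # The API takes the image as a base64 string rather than raw bytes.
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "upscaling_resize": scale,
        "upscaler_1": upscaler,
        "upscaler_2": second_upscaler,  # optional second upscaler in the chain
    }


# Placeholder bytes stand in for a real PNG read from disk.
payload = build_extras_payload(b"not-a-real-png", upscaler="4x-UltraSharp")
print(json.dumps(payload, indent=2))
```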

Common pitfalls

Low denoising_strength + Latent upscale = blurry slop. Latent upscaling needs aggressive denoising to look sharp; pixel upscalers need less.
Hires fix is incompatible with some samplers (LCM, DPM Adaptive). The UI just runs them anyway; behaviour is undefined.
hr_resize_x / hr_resize_y values that were not multiples of 64 were rejected in older versions; v1.4 relaxed the requirement to multiples of 8.
OOM on hires — switch from latent to a tile-based upscaler (ESRGAN/SwinIR), reduce hr_scale, or pass --medvram.
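
For the dimension pitfall, a caller can snap requested sizes to a multiple of 8 before sending them. This is a client-side sketch; `snap_down` is a hypothetical helper, not something the webui exposes.

```python
def snap_down(value, multiple=8):
    """Round value down to the nearest multiple, never going below one multiple."""
    return max(multiple, (value // multiple) * multiple)


print(snap_down(1003))  # 1000
print(snap_down(1664))  # 1664
print(snap_down(5))     # 8
```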

Entry points for modification

Add a hires-pass behaviour — register before_hr and process_before_every_sampling callbacks; these run at the right point.
Add a custom upscaler — see systems/postprocessing.md. The new upscaler will appear in the hires-fix dropdown automatically.
Change the noise mix between passes — the relevant code is around setup_img2img_steps in processing.py; bear in mind every change here affects both regular img2img and hires-pass-two.
