AUTOMATIC1111/stable-diffusion-webui
SD hijack
Active contributors: AUTOMATIC1111, brkirch, Kohaku-Blueleaf, Aarni Koskela
Purpose
Monkey-patches the upstream ldm (Latent Diffusion Model) and sgm (generative-models) packages at runtime so the webui's prompt syntax, attention optimisations, embeddings, and clip-skip work without forking those repositories. Every loaded model passes through model_hijack.hijack(sd_model) before it's used.
Directory layout
modules/
├── sd_hijack.py # the master hijack: applies/undoes patches; central StableDiffusionModelHijack
├── sd_hijack_clip.py # FrozenCLIPEmbedderWithCustomWords + attention parsing
├── sd_hijack_clip_old.py # legacy emphasis behaviour for compatibility
├── sd_hijack_open_clip.py # OpenCLIP variant (used by SDXL/SD2)
├── sd_hijack_xlmr.py # XLM-R encoder (Alt-Diffusion m9)
├── sd_hijack_unet.py # UNet upcasting and dtype shenanigans
├── sd_hijack_optimizations.py # cross-attention optimisation dispatcher
├── sd_hijack_ip2p.py # InstructPix2Pix-specific patches
├── sd_hijack_utils.py # CondFunc — conditional patcher utility
├── sd_hijack_checkpoint.py # nn.Module checkpoint hijack for memory savings
├── sd_emphasis.py # emphasis-as-multiplier helpers
├── sd_disable_initialization.py # skips default-weight init when loading
└── xlmr.py / xlmr_m18.py # XLM-R embedding pieces for Alt-Diffusion
Key abstractions
| Type | File | Description |
|---|---|---|
| StableDiffusionModelHijack | modules/sd_hijack.py | Holds the embedding database, CLIP wrappers, and optimization state. Public methods: hijack() / undo_hijack() / apply_optimizations(). |
| model_hijack | same | Module-level singleton instance. |
| FrozenCLIPEmbedderWithCustomWords | modules/sd_hijack_clip.py | Replaces ldm's CLIP wrapper; parses (emphasis), [neg], (word:1.2), BREAK and chunked prompts. |
| EmbeddingDatabase | modules/textual_inversion/textual_inversion.py | Owned by model_hijack; maps trained embedding names to vectors. |
| SdOptimization (subclasses) | modules/sd_hijack_optimizations.py | Per-strategy class: xformers, sdp, sdp_no_mem, sub_quad, v1, InvokeAI, doggettx, none. |
| CondFunc | modules/sd_hijack_utils.py | Helper to swap a function for one that calls a different impl when a runtime predicate is true. |
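The conditional-patching idea behind CondFunc can be illustrated with a small stand-alone sketch. This is not the real CondFunc code: `patch_if`, `Encoder`, and the convention of passing the original function as the first argument to both the predicate and the replacement are assumptions made for illustration.

```python
import functools


def patch_if(owner, name, replacement, predicate):
    """Replace owner.<name> with a wrapper that calls `replacement` only when
    `predicate` is true, and the original otherwise.

    Hypothetical helper for illustration; returns the original attribute so
    the caller can restore it later.
    """
    original = getattr(owner, name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        if predicate(original, *args, **kwargs):
            return replacement(original, *args, **kwargs)
        return original(*args, **kwargs)

    setattr(owner, name, wrapper)
    return original


class Encoder:  # stand-in for an upstream class we do not want to fork
    def forward(self, x):
        return x + 1


use_alternative = {"enabled": False}  # runtime switch read by the predicate

original_forward = patch_if(
    Encoder,
    "forward",
    replacement=lambda orig, self, x: orig(self, x) * 10,
    predicate=lambda orig, self, x: use_alternative["enabled"],
)

enc = Encoder()
result_default = enc.forward(1)      # predicate is false: original path
use_alternative["enabled"] = True
result_patched = enc.forward(1)      # predicate is true: replacement path
Encoder.forward = original_forward   # undo the patch
result_restored = enc.forward(1)
```

Because the predicate is evaluated on every call, the patch can stay installed while the alternative implementation is switched on and off at runtime.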
What the hijack does
When a checkpoint is loaded, model_hijack.hijack(sd_model) runs in sd_models.load_model_weights():
- Replace CLIP / OpenCLIP: swaps the model's text-encoder wrapper with one of the *WithCustomWords classes. These understand the prompt syntax and integrate the embedding database.
- Apply attention optimization: picks the best SdOptimization based on flags (--xformers, --opt-sdp-attention, …) and patches the relevant forward methods.
- UNet dtype patches: for fp16 + xformers + certain GPUs, modules/sd_hijack_unet.py upcasts a few problem layers to fp32.
- Embedding database refresh: scans embeddings/ and models/embeddings/ and registers each .pt / .safetensors file. Their tokens become recognisable in prompts.
- Disable default init: wraps nn.Linear / nn.Conv2d constructors to skip Kaiming init since the weights will be overwritten anyway. This is what --disable-model-loading-ram-optimization turns off.
model_hijack.undo_hijack(sd_model) reverses all of the above when the model is unloaded or replaced.
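The hijack/undo pairing can be pictured with a toy model. The sketch below is not the webui code: `ToyModel`, `ToyTextEncoder`, `WrappedEncoder`, and `ToyHijack` are hypothetical stand-ins, and keeping the original encoder on a `wrapped` attribute is an assumption chosen to make the reversal trivial.

```python
class ToyTextEncoder:
    """Stand-in for an upstream text-encoder wrapper."""

    def encode(self, text):
        return text.lower()


class ToyModel:
    """Stand-in for a loaded checkpoint with a text-encoder attribute."""

    def __init__(self):
        self.cond_stage_model = ToyTextEncoder()


class WrappedEncoder:
    """Replacement encoder that keeps the original so it can be restored."""

    def __init__(self, wrapped):
        self.wrapped = wrapped

    def encode(self, text):
        # Extra behaviour goes here; the original encoder does the real work.
        return "[custom words] " + self.wrapped.encode(text)


class ToyHijack:
    def hijack(self, model):
        # Swap in the wrapper; the original stays reachable via .wrapped.
        model.cond_stage_model = WrappedEncoder(model.cond_stage_model)

    def undo_hijack(self, model):
        # Only unwrap what we wrapped, so a repeated undo is harmless.
        if isinstance(model.cond_stage_model, WrappedEncoder):
            model.cond_stage_model = model.cond_stage_model.wrapped


model = ToyModel()
original_encoder = model.cond_stage_model
hijacker = ToyHijack()

hijacker.hijack(model)
patched_output = model.encode("A Cat") if hasattr(model, "encode") else model.cond_stage_model.encode("A Cat")

hijacker.undo_hijack(model)
restored_output = model.cond_stage_model.encode("A Cat")
```

Storing the original object on the wrapper, rather than in a separate table, keeps each patch and its reversal next to each other.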
CLIP wrapper details
graph LR
Prompt --> Parse[prompt_parser:<br/>(word:1.2), [a:b:0.5], …]
Parse --> Tokens[token list + per-token weights]
Tokens --> Chunks[75-token chunks with BOS/EOS]
Chunks --> EmbDB[EmbeddingDatabase<br/>substitute trained tokens]
EmbDB --> CLIP[CLIP encode_with_transformer]
CLIP --> Multiply[emphasis multiplier per token]
Multiply --> Mean[adjust mean to keep distribution]
Mean --> Out[conditioning tensor]
Three pieces are non-obvious:
- Chunking: prompts longer than 75 tokens are split into chunks; each chunk gets BOS/EOS tokens. The encoder runs once per chunk; the results are concatenated. The BREAK keyword forces a chunk boundary.
- Emphasis: (word:1.2) produces a per-token weight. After encoding, that weight is multiplied into the embedding. To avoid drifting the distribution, the mean is then re-aligned. The exact algorithm has switched once (sd_hijack_clip_old.py is the v1.0-era behaviour, kept for "Use old emphasis implementation" compatibility; see the option in modules/shared_options.py).
- Clip skip: opts.CLIP_stop_at_last_layers > 1 returns the n-th-from-last hidden state instead of the final one. Useful for anime models trained on the second-to-last layer.
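The three behaviours above can be sketched in plain Python. This is a simplified illustration, not the code in sd_hijack_clip.py: the function names are hypothetical, tokens are strings rather than ids, padding short chunks with the end token is an assumption, and the mean re-alignment shown (rescaling so the overall mean matches its pre-emphasis value) is one plausible reading of "re-aligned".

```python
CHUNK_SIZE = 75  # tokens per chunk, excluding BOS/EOS


def split_into_chunks(tokens, chunk_size=CHUNK_SIZE, bos="<bos>", eos="<eos>"):
    """Split tokens into fixed-size chunks; the token "BREAK" forces a boundary.

    Short chunks are padded with `eos` (an assumption for this sketch), so
    every returned chunk has chunk_size + 2 entries.
    """
    chunks, current = [], []

    def flush():
        padded = current + [eos] * (chunk_size - len(current))
        chunks.append([bos] + padded + [eos])
        current.clear()

    for tok in tokens:
        if tok == "BREAK":
            if current:
                flush()
            continue
        current.append(tok)
        if len(current) == chunk_size:
            flush()
    if current or not chunks:
        flush()
    return chunks


def apply_emphasis(embeddings, weights):
    """Scale each token vector by its weight, then rescale everything so the
    overall mean equals the mean before emphasis.

    Assumes the scaled mean is non-zero; a real implementation would need to
    handle that case.
    """

    def mean(rows):
        flat = [v for row in rows for v in row]
        return sum(flat) / len(flat)

    original_mean = mean(embeddings)
    scaled = [[v * w for v in row] for row, w in zip(embeddings, weights)]
    factor = original_mean / mean(scaled)
    return [[v * factor for v in row] for row in scaled]


def clip_skip(hidden_states, stop_at_last_layers):
    """Pick the n-th-from-last hidden state; n = 1 is the final layer."""
    return hidden_states[-stop_at_last_layers]


long_prompt_chunks = split_into_chunks(["a"] * 80)
break_chunks = split_into_chunks(["a", "BREAK", "b"])
emphasised = apply_emphasis([[1.0, 3.0], [2.0, 2.0]], [1.0, 1.5])
skipped = clip_skip(["layer10", "layer11", "layer12"], 2)
```

In the emphasis example the second token is weighted 1.5, so its vector grows relative to the first, while the final rescale keeps the average value of the whole tensor where it started.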
Attention optimisations
modules/sd_hijack_optimizations.py is the dispatcher. Each strategy is an SdOptimization subclass:
| Class | Triggered by | Notes |
|---|---|---|
| SdOptimizationXformers | --xformers and xformers package available | Memory-efficient attention via FAIR's xFormers |
| SdOptimizationSdpNoMem | --opt-sdp-no-mem-attention | PyTorch 2.x SDPA without memory-efficient flag (deterministic) |
| SdOptimizationSdp | --opt-sdp-attention | PyTorch 2.x SDPA |
| SdOptimizationSubQuad | --opt-sub-quad-attention | Birch-san / AminRezaei0x443's chunked attention |
| SdOptimizationV1 | --opt-split-attention-v1 | Older split attention path |
| SdOptimizationInvokeAI | --opt-split-attention-invokeai | InvokeAI/lstein's split-attention |
| SdOptimizationDoggettx | --opt-split-attention | The original Doggettx implementation |
If no flag is passed, SdOptimization.is_available() is consulted in priority order; xFormers > SDP > Doggettx is the typical fallback. Extensions can add more by registering an on_list_optimizers callback.
sd_hijack_optimizations.py is the single largest "math" file in the repo (~677 lines) — most of it is the attention math implementations themselves.
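The selection logic can be approximated as "an explicit flag wins, otherwise take the highest-priority available strategy". The sketch below is a toy model of that rule, not the dispatcher itself: the `Toy*` classes, the numeric priorities, and the callback signature are all assumptions for illustration.

```python
class ToyOptimization:
    """Stand-in base class; attribute names are assumptions for illustration."""

    name = "base"
    cmd_opt = None   # name of the command-line flag that selects this strategy
    priority = 0     # higher wins when no flag is given

    def is_available(self):
        return True


class ToyXformers(ToyOptimization):
    name, cmd_opt, priority = "xformers", "xformers", 100

    def __init__(self, installed):
        self.installed = installed

    def is_available(self):
        # Only usable when the optional package is present.
        return self.installed


class ToySdp(ToyOptimization):
    name, cmd_opt, priority = "sdp", "opt_sdp_attention", 80


class ToyDoggettx(ToyOptimization):
    name, cmd_opt, priority = "Doggettx", "opt_split_attention", 10


def list_optimizers(res, xformers_installed):
    """Callback-style registration: append candidates to a shared list."""
    res.extend([ToyDoggettx(), ToySdp(), ToyXformers(xformers_installed)])


def choose_optimizer(optimizers, enabled_flags):
    """Prefer an available strategy whose flag is set; else highest priority."""
    available = sorted(
        (o for o in optimizers if o.is_available()),
        key=lambda o: o.priority,
        reverse=True,
    )
    for opt in available:
        if opt.cmd_opt in enabled_flags:
            return opt
    return available[0] if available else None


candidates = []
list_optimizers(candidates, xformers_installed=False)
auto_choice = choose_optimizer(candidates, enabled_flags=set())
flag_choice = choose_optimizer(candidates, enabled_flags={"opt_split_attention"})

with_xformers = []
list_optimizers(with_xformers, xformers_installed=True)
xformers_choice = choose_optimizer(with_xformers, enabled_flags=set())
```

With xFormers missing and no flag set, the toy dispatcher falls through to SDP, which mirrors the xFormers > SDP > Doggettx ordering described above.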
Where embeddings live
The EmbeddingDatabase is on model_hijack because it depends on the active text encoder's embedding shape. Loading flow:
- model_hijack.embedding_db.add_embedding_dir(path) registers a directory.
- After hijack(), model_hijack.embedding_db.load_textual_inversion_embeddings() scans those dirs.
- Each embedding becomes an Embedding object with vec (the actual tensor), name (the token), step, sd_checkpoint, etc.
- The CLIP wrapper's tokenizer recognises the name and substitutes the vector at encode time.
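The substitution step at the end of that flow can be sketched as a lookup that swaps a registered name for its trained vectors. This is a word-level simplification with hypothetical names (`ToyEmbedding`, `ToyEmbeddingDatabase`, `encode_words`); the real wrapper works on token ids and tensors rather than words and lists.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyEmbedding:
    """Stand-in for a trained embedding; field names follow the text above."""

    name: str                  # the word typed in the prompt
    vec: List[List[float]]     # one vector per token slot the embedding fills
    step: Optional[int] = None


class ToyEmbeddingDatabase:
    def __init__(self):
        self.word_embeddings = {}

    def register(self, embedding):
        self.word_embeddings[embedding.name] = embedding

    def find(self, word):
        return self.word_embeddings.get(word)


def encode_words(words, db, base_lookup):
    """Return one vector per slot, substituting trained vectors where a word
    matches a registered embedding name."""
    out = []
    for word in words:
        embedding = db.find(word)
        if embedding is not None:
            out.extend(embedding.vec)      # may occupy several slots
        else:
            out.append(base_lookup(word))  # ordinary vocabulary lookup
    return out


db = ToyEmbeddingDatabase()
db.register(ToyEmbedding(name="my-style", vec=[[0.1, 0.2], [0.3, 0.4]], step=500))

# Toy vocabulary lookup: a 2-dimensional vector derived from the word length.
vectors = encode_words(
    ["a", "my-style", "b"],
    db,
    base_lookup=lambda w: [float(len(w)), 0.0],
)
```

Note that a single embedding name can expand to more than one vector, which is why the output here has four slots for three input words.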
/sdapi/v1/embeddings, /sdapi/v1/refresh-embeddings, and the Train tab are the user-facing entry points.
Integration points
- script_callbacks.on_model_loaded: fires after hijack() is done, so callbacks see a fully patched model.
- script_callbacks.on_list_optimizers: register custom attention optimisers.
- script_callbacks.on_list_unets: return alternative UNet implementations the user can pick. Tied to modules/sd_unet.py.
- The --disable-opt-split-attention flag forces the SdOptimizationNone path (no optimisation), which is useful for debugging numeric issues.
Entry points for modification
- Add a new attention optimisation: subclass SdOptimization, implement apply() and undo(), and register it via on_list_optimizers.
- Add a new prompt syntax: extend the Lark grammar in modules/prompt_parser.py and update FrozenCLIPEmbedderWithCustomWords.process_text().
- Hijack a different upstream class: add a function inside StableDiffusionModelHijack.hijack() and a matching reversal in undo_hijack(). Always store originals so the reversal is correct.
- Avoid hijacking when possible: script callbacks (cfg_denoiser, extra_noise, …) cover most "I want to insert behaviour at X step" cases without needing to monkey-patch ldm.
Built by Factory AutoWiki from public repository content. It is a generated preview for codebase exploration, not source-maintained documentation.