FLUX Klein 9B Inference¶

★★★★★ Intermediate

Practical reference for FLUX.2 Klein 9B image generation. Covers optimal sampler settings, multi-pass upscaling, LoRA stacking, anatomy fixes, and prompting patterns.

Distilled 9B (4-Step Model)¶

Recommended Settings¶

Parameter	Value	Notes
Steps	4	Designed for 4-step inference
CFG	1.0	Official recommendation. Never exceed 2.0
Sampler	euler	Most stable for edges and text sharpness
Scheduler	simple	Alternative: Flux2Scheduler (resolution-aware)
Denoise	1.0	For txt2img

CFG above 2.0 produces "deep-fried" artifacts - the distilled model has guidance baked in. Using 50+ steps wastes compute and actively degrades quality.
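
The limits above can be sketched as a small pre-flight check. This is a minimal illustration, not part of any real API: `SamplerSettings` and `check_distilled` are hypothetical names, and the fields simply mirror the table.

```python
from dataclasses import dataclass


@dataclass
class SamplerSettings:
    """Hypothetical container; fields mirror the settings table, not a real API."""
    steps: int = 4
    cfg: float = 1.0
    sampler: str = "euler"
    scheduler: str = "simple"
    denoise: float = 1.0


def check_distilled(settings: SamplerSettings) -> list[str]:
    """Return warnings for values outside the ranges recommended for the distilled model."""
    warnings = []
    if settings.cfg > 2.0:
        warnings.append("cfg above 2.0: expect deep-fried artifacts")
    if settings.steps >= 50:
        warnings.append("50+ steps: wasted compute and degraded quality")
    return warnings


print(check_distilled(SamplerSettings()))                    # defaults pass cleanly
print(check_distilled(SamplerSettings(steps=60, cfg=3.5)))   # both limits violated
```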

Anatomy Fix: Use 8 Steps¶

At 4 steps, complex poses (seated, two-person) often produce extra limbs. Going to 8 steps dramatically improves anatomy without excessive slowdown. Beyond 8 steps the distilled model shows diminishing returns.

Base 9B (Non-Distilled)¶

Recommended Settings¶

Parameter	Value	Notes
Steps	20-24	Sweet spot. Below 15 loses detail.
CFG	3.5-5.0	5.0 gives best prompt adherence
Sampler	euler	Most stable overall
Scheduler	simple	sgm_uniform for upscale passes
Denoise	1.0	For txt2img

Why Base Over Distilled¶

Higher output diversity
Configurable quality/speed tradeoff
Better for LoRA training and custom pipelines
Better input for downstream upscalers

Sampler Deep-Dive¶

Sampler	Scheduler	Use Case
euler	simple	Default gold standard
euler	Flux2Scheduler	Resolution-aware, adapts to aspect ratio
res_2s	simple	Better anatomy (2x compute/step)
res_2m	ddim_uniform	High quality general purpose
dpmpp_2m_ancestral	sgm_uniform	Analog grain, cinematic/film texture
euler_ancestral_cfg++	sgm_uniform	Detail enhancement in upscale passes

Key insight: res_2s at 4 steps costs the same compute as euler at 8 steps, so it fixes anatomy without raising the step count.
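
The compute equivalence can be checked with a toy cost model. The sketch below assumes cost is counted in model evaluations, with euler at one evaluation per step and res_2s at two (the "2x compute/step" figure from the table); `EVALS_PER_STEP` and `model_evals` are illustrative names.

```python
# Assumed cost model: model evaluations per sampling step.
# euler is the 1x baseline; res_2s uses the 2x figure from the sampler table.
EVALS_PER_STEP = {"euler": 1, "res_2s": 2}


def model_evals(sampler: str, steps: int) -> int:
    """Total model evaluations for a run under the assumed cost model."""
    return EVALS_PER_STEP[sampler] * steps


# res_2s at 4 steps and euler at 8 steps cost the same under this model.
print(model_evals("res_2s", 4), model_evals("euler", 8))
```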

Multi-Pass Upscaling¶

Base (Stage 1) + Distilled (Stage 2)¶

[Prompt] -> Klein Base 9B, 1024x1024, 20 steps, CFG 5.0, euler/simple
         -> Upscale Latent 2x -> 2048x2048
         -> Klein Distilled 9B, 4-8 steps, CFG 1.0, euler/simple, denoise 0.4-0.6
         -> VAE Decode -> 2048x2048 output

Fast Pipeline (Distilled Only)¶

[Prompt] -> Klein Distilled 9B, 1024x1024, 8 steps, CFG 1.0, euler/simple
         -> Image Resize -> 2048 longest side
         -> Klein Distilled upscale, 4 steps, euler_ancestral_cfg++,
            sgm_uniform, denoise 0.8, s_noise 1.2
         -> 2048x2048 output

Denoise Values for Second Pass¶

Goal	Denoise	Effect
Tight fidelity	0.3-0.4	Structure lock, minimal change
Balanced	0.5-0.6	Good detail, some creativity
Creative upscale	0.7-0.8	More prompt-driven
Detail enhancement	0.8	Best for realistic fine detail
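
As a rough sketch, the table can be turned into a lookup that produces second-pass settings for the Base + Distilled pipeline. The goal keys and the `second_pass_settings` helper are hypothetical; picking the midpoint of each range and the lower end of the 4-8 step range are assumptions made for illustration.

```python
# Denoise ranges copied from the table above; the goal keys are made-up labels.
DENOISE_RANGES = {
    "tight": (0.3, 0.4),
    "balanced": (0.5, 0.6),
    "creative": (0.7, 0.8),
    "detail": (0.8, 0.8),
}


def second_pass_settings(goal: str) -> dict:
    """Stage 2 (distilled) settings, using the midpoint of the range for the goal."""
    low, high = DENOISE_RANGES[goal]
    return {
        "steps": 4,            # assumption: lower end of the 4-8 range
        "cfg": 1.0,
        "sampler": "euler",
        "scheduler": "simple",
        "denoise": round((low + high) / 2, 2),
    }


print(second_pass_settings("balanced"))
```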

4K+ Output¶

Add a tiled upscale stage: 4x4 grid, 256px tiles with 128px overlap, seam_fix_denoise=1.0, seam_fix_width=32-128px. Running a Florence2 auto-caption before the upscale reduces hallucinations at tile borders.
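
To make the tile/overlap arithmetic concrete, here is a minimal sketch of how overlapping tile positions could be computed along one axis. `tile_origins` and `tile_boxes` are hypothetical helpers, not the API of any tiling node; the defaults use the 256px tile and 128px overlap quoted above.

```python
def tile_origins(length: int, tile: int = 256, overlap: int = 128) -> list[int]:
    """Start offsets of tiles covering [0, length) along one axis."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be in [0, tile)")
    if length <= tile:
        return [0]
    stride = tile - overlap
    origins = list(range(0, length - tile, stride))
    origins.append(length - tile)  # final tile sits flush with the far edge
    return origins


def tile_boxes(width: int, height: int, tile: int = 256, overlap: int = 128):
    """(left, top, right, bottom) boxes in row-major order."""
    return [
        (x, y, x + tile, y + tile)
        for y in tile_origins(height, tile, overlap)
        for x in tile_origins(width, tile, overlap)
    ]


# With these defaults a 640x640 region yields a 4x4 grid of tiles.
print(len(tile_boxes(640, 640)))
```

With a 128px overlap on a 256px tile the stride is 128px, so every interior pixel is covered by at least two tiles, which gives the seam-fix pass material to blend.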

Seam Artifacts¶

Causes: tile boundary context loss, latent space discontinuity, high denoise in second pass.

Fixes: 128px overlap padding, mask blur 12-16px, half-tile reprocessing at boundaries, band-pass filtering (retain high-freq detail, blend low-freq).
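
The overlap-based fixes rely on blending two tiles where they cover the same pixels. The sketch below uses a plain linear cross-fade over one row of overlap values; this is a simpler stand-in for the mask blur and band-pass approaches listed above, not an implementation of them, and both function names are illustrative.

```python
def feather_weights(overlap: int) -> list[float]:
    """Weights for the incoming tile across the overlap, rising from near 0 to near 1."""
    return [(i + 0.5) / overlap for i in range(overlap)]


def blend_overlap(left: list[float], right: list[float]) -> list[float]:
    """Cross-fade two pixel rows that cover the same overlap region."""
    if len(left) != len(right):
        raise ValueError("both rows must span the same overlap width")
    weights = feather_weights(len(left))
    return [(1 - w) * a + w * b for a, b, w in zip(left, right, weights)]


# A black row fading into a white row across a 4-pixel overlap.
print(blend_overlap([0.0] * 4, [1.0] * 4))
```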

LoRA Stacking¶

Capacity and Strength¶

Up to 3 LoRAs can be loaded simultaneously, each with its own weight (range 0-4).

Strength	Effect
0.0-0.3	Subtle, barely visible
0.4-0.75	Sweet spot - balanced texture + coherence
0.73	Recommended default for single LoRA
0.8-1.0	Maximum texture, starts pulling apart
1.0+	Coherence loss, visible artifacts

Multi-LoRA Rules¶

Stack strongest-influence LoRA first (processed sequentially)
Total combined strength should stay under 1.5-2.0
If artifacts appear, reduce weakest LoRA first
Watch for style conflicts (two different color grading LoRAs)
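
The stacking rules can be sketched as a small planner. Everything here is illustrative: the `LoRA` class and helper names are hypothetical, strength is used as a proxy for "strongest influence", and the budget is set to 1.5, the conservative end of the 1.5-2.0 guideline.

```python
from dataclasses import dataclass, replace

MAX_LORAS = 3               # capacity stated above
WEIGHT_RANGE = (0.0, 4.0)   # per-LoRA weight range stated above
TOTAL_BUDGET = 1.5          # assumption: conservative end of 1.5-2.0


@dataclass(frozen=True)
class LoRA:
    name: str
    strength: float


def order_stack(loras: list[LoRA]) -> list[LoRA]:
    """Validate the stack and put the strongest LoRA first."""
    if len(loras) > MAX_LORAS:
        raise ValueError(f"at most {MAX_LORAS} LoRAs")
    for lora in loras:
        if not WEIGHT_RANGE[0] <= lora.strength <= WEIGHT_RANGE[1]:
            raise ValueError(f"{lora.name}: strength outside 0-4")
    return sorted(loras, key=lambda l: l.strength, reverse=True)


def over_budget(loras: list[LoRA]) -> bool:
    return sum(l.strength for l in loras) > TOTAL_BUDGET


def reduce_weakest(loras: list[LoRA], step: float = 0.1) -> list[LoRA]:
    """Lower the weakest LoRA by one step, mirroring the 'reduce weakest first' rule."""
    weakest = min(loras, key=lambda l: l.strength)
    lowered = replace(weakest, strength=round(max(weakest.strength - step, 0.0), 2))
    return [lowered if l is weakest else l for l in loras]


stack = order_stack([LoRA("grain", 0.5), LoRA("style", 0.73), LoRA("color", 0.4)])
while over_budget(stack):
    stack = reduce_weakest(stack)
print(stack)
```

In practice the trigger for reducing a LoRA is visible artifacts, not a numeric budget; the loop only shows the order in which strengths would be lowered.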

Klein vs Dev LoRA Behavior¶

Same LoRA settings produce fundamentally different results:

Klein: heavier grain structure (16mm film look)
Dev: cleaner, more like 35mm film
FP8 Klein: maintains desirable grain; non-FP8 is cleaner

Anatomy Problem Hierarchy¶

From most to least effective:

Increase steps (4 -> 8): strongest lever for complex poses
Simplify pose: standing > seated, single > multi-person, front > twisted
Adjust CFG carefully: 1.0 baseline, 1.2 can fix fused fingers, >1.5 risks new problems
Use res_2s sampler: doubles compute per step, fixes anatomy implicitly
Negative prompting: "distorted features, unnatural proportions, extra limbs" (in standard classifier-free guidance the negative prompt only takes effect when CFG is above 1.0)
Use Base model: 20 steps rarely has anatomy issues (much slower)

Face Blur Fix¶

Increase steps (most effective)
Higher resolution (1536x1920 for portraits)
Flux2Klein-Enhancer node to boost text conditioning magnitude

Flux2Klein-Enhancer Node¶

Custom node for stronger prompt adherence:

Parameter	Range	Default	Purpose
Magnitude	0.0-3.0	1.0	Text embedding scaling
Contrast	-1.0-2.0	0.0	Token difference amplification
Normalize Strength	0.0-1.0	0.0	Token magnitude equalization
Edit Weight	0.0-3.0	1.0	Preservation vs prompt following
Ref Strength	0.0-5.0	1.0	Reference structure lock (0=txt2img)
Blend with Noise	0.0-1.0	0.0	Reference-noise interpolation

Dampening 1.20-1.30 recommended for precise preservation.
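
The ranges and defaults in the table can be captured in a small config helper that clamps overrides to the documented bounds. This is a sketch only: the snake_case keys and `enhancer_config` are hypothetical and do not reflect the node's actual input names, and Dampening is left out because the table gives no range for it.

```python
# (min, max, default) per parameter, copied from the table above.
# Key names are illustrative, not the node's real input names.
ENHANCER_PARAMS = {
    "magnitude": (0.0, 3.0, 1.0),
    "contrast": (-1.0, 2.0, 0.0),
    "normalize_strength": (0.0, 1.0, 0.0),
    "edit_weight": (0.0, 3.0, 1.0),
    "ref_strength": (0.0, 5.0, 1.0),
    "blend_with_noise": (0.0, 1.0, 0.0),
}


def enhancer_config(**overrides: float) -> dict[str, float]:
    """Defaults merged with overrides, each clamped to its documented range."""
    config = {}
    for name, (low, high, default) in ENHANCER_PARAMS.items():
        value = overrides.pop(name, default)
        config[name] = min(max(value, low), high)
    if overrides:
        raise KeyError(f"unknown parameters: {sorted(overrides)}")
    return config


print(enhancer_config(magnitude=1.5, ref_strength=0.0))
```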

Prompting Guide¶

Structure¶

Subject -> Setting -> Details -> Lighting -> Atmosphere

Write as flowing prose, not keyword lists. Front-load important elements.
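
One way to keep prompts in this order is a small assembly helper. The sketch below is an illustration under assumptions: `build_prompt` is a hypothetical function, each part is expected to be a full phrase or sentence, and optional style annotations are appended at the end of the prompt.

```python
def build_prompt(subject: str, setting: str, details: str,
                 lighting: str, atmosphere: str,
                 style_notes: tuple[str, ...] = ()) -> str:
    """Join parts in Subject -> Setting -> Details -> Lighting -> Atmosphere order."""
    parts = [subject, setting, details, lighting, atmosphere, *style_notes]
    # Skip empty parts and make sure each one ends with exactly one period.
    sentences = [p.strip().rstrip(".") + "." for p in parts if p.strip()]
    return " ".join(sentences)


prompt = build_prompt(
    subject="A woman in a linen dress arranges wildflowers",
    setting="She stands in a farmhouse kitchen",
    details="",
    lighting="Soft, diffused natural light filters through sheer curtains",
    atmosphere="The room feels calm and unhurried",
    style_notes=("Mood: Serene, romantic, grounded",),
)
print(prompt)
```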

Lighting (Highest Impact)¶

Specify source type, quality, direction, temperature, and surface interaction. "Soft, diffused natural light filtering through sheer curtains" works far better than "good lighting".

Prompt Length¶

Length	Words	Use
Short	10-30	Concept exploration
Medium	30-80	Production work
Long	80-300+	Complex editorial/product shots
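
A quick word count can tell which bucket a draft prompt falls into. In this sketch, `prompt_length_class` is a hypothetical helper; because the table's ranges share the boundary values 30 and 80, it assigns a boundary count to the shorter class, and it labels anything under 10 words as below the table's range. Both choices are assumptions.

```python
def prompt_length_class(prompt: str) -> str:
    """Classify a prompt by whitespace-separated word count."""
    words = len(prompt.split())
    if words < 10:
        return "below range"
    if words <= 30:
        return "short"
    if words <= 80:
        return "medium"
    return "long"


print(prompt_length_class("A red fox crossing a frozen river at dawn, mist rising"))
```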

Style Annotations (End of Prompt)¶

"Style: Country chic meets luxury lifestyle editorial"
"Shot on 35mm film with shallow depth of field"
"Mood: Serene, romantic, grounded"

Native Resolutions¶

Both 4B and 9B support 11 aspect ratios, from 1:1 to 21:9, at up to 4MP (2048x2048 for square output).
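
For a rough idea of what fits in the 4MP budget at a given ratio, the sketch below computes the largest dimensions under that ceiling. It does not reproduce the official list of 11 resolutions; `fit_resolution` is a hypothetical helper, and rounding down to a multiple of 16 is an assumption about dimension alignment.

```python
import math

MAX_PIXELS = 2048 * 2048  # 4MP ceiling stated above


def fit_resolution(ratio_w: int, ratio_h: int,
                   max_pixels: int = MAX_PIXELS,
                   multiple: int = 16) -> tuple[int, int]:
    """Largest (width, height) near ratio_w:ratio_h within the pixel budget.

    Both sides are rounded down to `multiple` (an assumed alignment),
    so the result never exceeds max_pixels.
    """
    # width^2 = max_pixels * ratio_w / ratio_h, computed with integer math.
    width = math.isqrt(max_pixels * ratio_w // ratio_h)
    width = width // multiple * multiple
    height = width * ratio_h // ratio_w
    height = height // multiple * multiple
    return width, height


for ratio in [(1, 1), (16, 9), (21, 9)]:
    print(ratio, fit_resolution(*ratio))
```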

Gotchas¶

4B LoRA incompatible with 9B: different text encoder sizes (Qwen 3-4B vs Qwen 3-8B). LoRAs are not interchangeable between Klein 4B and Klein 9B.
FP8 specifically benefits from 8 steps: the FP8 quantized distilled model needs more steps than bf16 to match quality. Budget 8 steps minimum for FP8 inference.
Qwen-Image VAE artifacts: the VAE decoder can introduce washed-out details and checkerboard noise. A dedicated fix LoRA exists (strength 1.0, trigger: "Remove compression artifacts. Restore the fine details of the photo."). Only works on Qwen-VAE artifacts - will degrade images from other VAEs.
