Color Theory for ML Image Generation¶

★★★★★ Intermediate

Applied color theory for diffusion model training, color correction, and palette control. Covers perceptual relationships, spectral foundations, and psychological effects.

Itten's 12-Part Color Wheel¶

Johannes Itten's system underpins most color harmony logic in art and design:

Primary:    Red (0°), Yellow (120°), Blue (240°)
Secondary:  Orange (60°), Green (180°), Violet (300°)
Tertiary:   Red-Orange, Yellow-Orange, Yellow-Green,
            Blue-Green, Blue-Violet, Red-Violet

Complementary Pairs (180° separation)¶

Color	Complement	ML Use
Red	Green	Skin vs foliage; high contrast backgrounds
Blue	Orange	Sky vs warm metals; winter vs autumn
Yellow	Violet	High contrast for text/logo placement
Red-Orange	Blue-Green	Warm portraits on cool backgrounds
Yellow-Green	Red-Violet	Spring tones vs rose accents

For diffusion prompts: complementary pairs create visual tension and high perceived contrast. Analogous colors (adjacent on wheel) create harmony.

Harmony Types¶

Harmony	Definition	Effect
Complementary	2 colors, 180° apart	Bold, energetic
Analogous	3 colors, 30° apart	Calm, cohesive
Triadic	3 colors, 120° apart	Vibrant, balanced
Split-complementary	1 + 2 near-complement	Harmony with contrast
Tetradic	4 colors, 90° apart	Rich, complex
Monochromatic	1 hue, varied saturation/value	Elegant, unified

Itten's Color Ball (3D Model)¶

The color ball extends the wheel into 3D: - Equator: fully saturated hues (12-part wheel) - Top pole: white - Bottom pole: black - Axis: grey scale (desaturation toward center) - Any point on ball = unique HSL coordinate

This maps directly to HSL/HSV color spaces used in image processing.

Volkov's Dual-Color Nature¶

Russian color theorist Nikolai Volkov distinguished two types of color perception:

Local Color vs Radiation Color¶

Type	Definition	Example
Local color	Inherent color of object's surface	A red apple in neutral light
Radiation color	Color as modified by illumination	Same apple under blue moonlight

For auto-correction pipelines: - Color correction must separate surface reflectance from illumination - Correcting "the red apple" means preserving local color while adjusting radiation - White balance correction = adjusting radiation component - Object-specific corrections = adjusting local color

Implementation Relevance¶

# Simplified separation: linear light model
# observed_color = local_color * illumination_color

# To correct illumination (white balance):
corrected = observed / illumination_estimate

# To correct local color without changing light:
corrected = swap_local_color(observed, illumination_estimate)

In ML: training data that confuses local vs radiation color creates models that "correct" well-lit images or fail to correct poorly-lit ones. Dataset should balance lighting conditions.

Zaytsev Spectral Reference¶

Wavelength table for precise color-to-spectrum mapping:

Color Name	Wavelength Range (nm)	Peak (nm)
Violet	380-450	415
Blue	450-495	470
Cyan	495-520	505
Green	520-565	540
Yellow-Green	565-580	572
Yellow	580-590	585
Orange	590-625	607
Red	625-700	660

Full visible spectrum: 380-700 nm. Camera sensors and display gamuts are defined relative to these ranges.

ML Relevance¶

Camera spectral response: cameras don't sample uniformly; sensor bias toward certain wavelengths affects training data color distribution
Display gamut: sRGB covers ~35% of CIE Lab; P3 ~45%; ProPhoto ~90% — models trained on sRGB data will have compressed color range
Color shift in training data: batch normalization at pixel level can shift spectral balance if not applied channel-wise with appropriate constants

Golubeva: Psychological Color Effects¶

Physiological and psychological responses to color — relevant for image quality scoring and dataset curation:

Color	Effect	Application
Red	Stimulating, excitatory, raises heart rate	High-energy product images, alerts
Orange	Warm, appetite-stimulating	Food photography baseline
Yellow	Cheerful, attention-grabbing, slightly straining	Highlights, callouts
Green	Calming, restful, reduces eye strain	Nature, wellness, "safe" content
Blue	Cool, trustworthy, slightly sedating	Tech, finance, medical
Violet	Mysterious, spiritual, sometimes oppressive	Luxury, creative
Black/Dark	Heavy, formal, sophisticated	Luxury, editorial
White/Light	Airy, clean, spacious	Product on white, medical

For Dataset Scoring¶

When building training data for aesthetic scoring models, include color temperature and dominant hue as features — psychological valence of color is measurable and correlates with human preference ratings.

# Extract dominant hue for scoring
import cv2
import numpy as np

def dominant_hue(image_bgr):
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    hue_channel = hsv[:, :, 0]
    # Weighted by saturation (ignore grey pixels)
    sat = hsv[:, :, 1]
    mask = sat > 30
    if mask.sum() == 0:
        return None
    return np.average(hue_channel[mask], weights=sat[mask])

Color in Diffusion Prompting¶

Effective Color Description Hierarchy¶

1. Global temperature: "warm golden light", "cool overcast"
2. Dominant palette: "earthy oranges and browns"
3. Key object color: "red velvet dress"
4. Accent colors: "silver jewelry details"
5. Background/environment: "deep blue gradient background"

Harmony-Based Prompts¶

Complementary:   "red dress, teal background, high contrast"
Analogous:       "warm amber, golden yellow, pale cream palette"
Monochromatic:   "desaturated blue tones, monochrome"
Split-comp:      "orange subject, blue-violet and blue-green background"

Color Grading Terms (Recognized by Diffusion Models)¶

warm grade, cool grade, teal and orange grade, bleach bypass
golden hour, blue hour, overcast soft light
high saturation, muted palette, desaturated, vivid
Kodachrome film look, Fuji Velvia, Portra 400

Gotchas¶

Complementary color rendering inconsistency: diffusion models trained on internet data under-represent high-saturation complementary combinations (these are rare in photography). Force with stronger prompts or LoRA-trained color style.
sRGB training data gamut limitation: most diffusion models are trained on sRGB data and cannot generate colors outside that gamut even when prompted for wide-gamut colors (neon greens, deep saturated purples). This is a dataset artifact, not a model failure.
Local vs radiation confusion in correction: auto color correction applied naively shifts both local and radiation components together. Always segment or estimate illumination map before correcting for consistent results.
Hue vocabulary mismatch: "cerulean" and "cobalt" mean specific wavelengths to artists but may be interpreted inconsistently by diffusion models trained on non-specialist captions. Test empirically; hex-to-name mappings from CSS/web vocabulary are more reliably rendered.

Chromatic Adaptation and Color Appearance¶

Chromatic Adaptation Transform (CAT)¶

Humans adapt to different illuminants so a white surface appears white under 2700K tungsten and 6500K daylight. Models must account for this when transferring color between images shot under different lighting.

Key adaptation models (ascending accuracy): - Von Kries: per-channel gain applied in LMS cone space. Fast, used in many ICC workflows. - Bradford: modified LMS matrix with improved blue channel behavior. ICC standard. - CIECAM02 / CAM16: full Color Appearance Model including lightness, chroma, hue, contrast, and adaptation. Used for cross-media reproduction.

# Von Kries chromatic adaptation (D50 → D65)
import numpy as np
# Convert sRGB to XYZ (simplified):
# Apply Bradford matrix to XYZ
# Scale LMS channels by ratio of white points
# Invert to get adapted XYZ
# Formula: LMS_adapted = (LMS_D65 / LMS_D50) * LMS_source

Color Appearance Model (CAM) Dimensions¶

Attribute	Symbol	Description
Lightness	J	Perceived luminance relative to white
Chroma	C	Colorfulness relative to reference white
Hue	h	Hue angle in perceptual space
Saturation	s	Colorfulness relative to own luminance
Brightness	Q	Absolute perceived luminance
Colorfulness	M	Absolute colorfulness

For ML: when building perceptual loss functions or scoring models, CIECAM02 attributes correlate better with human preference than CIEDE2000 or simple MSE in RGB.

ICC Color Management Pipeline¶

ICC Profile Chain¶

Input device (camera) → [ICC input profile]
  → Profile Connection Space (PCS = D50 XYZ or LAB)
  → [ICC output profile]
  → Output device (display/printer)

PCS is device-independent. Profiles encode characterization of each device. This two-step architecture allows any input → any output without per-pair conversion.

Rendering Intents¶

Intent	Use Case	Behavior
Perceptual	Photography, general	Compresses entire gamut, preserves relationships
Relative Colorimetric	Proofing	Clips out-of-gamut, maps white points
Saturation	Presentation graphics	Maximizes saturation, not accuracy
Absolute Colorimetric	Reference proofing	No white point mapping

Perceptual rendering intent is most relevant for image transfer tasks — it compresses wide-gamut source into narrow-gamut destination while maintaining gradient continuity.

Display Gamut Implications for ML¶

Models trained on sRGB data (≈ Rec.709) cannot generalize to wide-gamut inputs without remapping
P3-wide monitors on macOS display sRGB content with inflated saturation unless ICC management is active
Training data from mixed-gamut sources requires normalization to a common PCS before use

Gotchas¶

Complementary color rendering inconsistency: diffusion models trained on internet data under-represent high-saturation complementary combinations (these are rare in photography). Force with stronger prompts or LoRA-trained color style.
sRGB training data gamut limitation: most diffusion models are trained on sRGB data and cannot generate colors outside that gamut even when prompted for wide-gamut colors (neon greens, deep saturated purples). This is a dataset artifact, not a model failure.
Local vs radiation confusion in correction: auto color correction applied naively shifts both local and radiation components together. Always segment or estimate illumination map before correcting for consistent results.
Hue vocabulary mismatch: "cerulean" and "cobalt" mean specific wavelengths to artists but may be interpreted inconsistently by diffusion models trained on non-specialist captions. Test empirically; hex-to-name mappings from CSS/web vocabulary are more reliably rendered.
Von Kries fails with large adaptation shifts: Von Kries adaptation works well for small illuminant shifts (e.g., 5500K → 6500K) but degrades for large shifts (e.g., tungsten 2700K → daylight). Use Bradford or CIECAM02 for cross-condition color transfer pipelines.

Color Theory for ML Image Generation¶

Itten's 12-Part Color Wheel¶

Complementary Pairs (180° separation)¶

Harmony Types¶

Itten's Color Ball (3D Model)¶

Volkov's Dual-Color Nature¶

Local Color vs Radiation Color¶

Implementation Relevance¶

Zaytsev Spectral Reference¶

ML Relevance¶

Golubeva: Psychological Color Effects¶

For Dataset Scoring¶

Color in Diffusion Prompting¶

Effective Color Description Hierarchy¶

Harmony-Based Prompts¶

Color Grading Terms (Recognized by Diffusion Models)¶

Gotchas¶

Chromatic Adaptation and Color Appearance¶

Chromatic Adaptation Transform (CAT)¶

Color Appearance Model (CAM) Dimensions¶

ICC Color Management Pipeline¶

ICC Profile Chain¶

Rendering Intents¶

Display Gamut Implications for ML¶

Gotchas¶

See Also¶

Stay updated