Training Uncensored Image Models: Complete Guide Part 2 - Stable Diffusion

Master Stable Diffusion training for unrestricted image generation. Learn LoRA, DreamBooth, dataset preparation, and advanced techniques for building custom image AI without safety filters.

By GodFake Team · 12 min read
Tags: AI Training, Stable Diffusion, Image Generation, SDXL, LoRA, DreamBooth, Computer Vision


Part 2 of 3

In Part 1, we covered training uncensored Large Language Models. Now we'll tackle image generation models—specifically Stable Diffusion—to create unrestricted visual AI for adult content, artistic freedom, and specialized applications.

Why Train Custom Image Models?

Business Opportunities

  • Adult Content Creation: Generate custom imagery for legal adult entertainment
  • Character Consistency: Create consistent characters across hundreds of images
  • Style Transfer: Train on specific art styles or photography styles
  • Product Visualization: Generate product mockups and variations
  • Game Asset Creation: Unlimited game art and textures
  • Film Pre-visualization: Concept art and storyboarding

Technical Benefits

  • No Safety Filters: Generate unrestricted content
  • Full Control: Own your model weights completely
  • Customization: Train on your specific aesthetic or subject
  • Privacy: No external API calls
  • Cost Efficiency: Pay once for training, generate unlimited images

Architecture Overview

    We'll focus on Stable Diffusion variants:

    Model Options

    1. Stable Diffusion XL (SDXL) - Most popular
       • Resolution: 1024x1024
       • Quality: Excellent
       • Speed: Moderate
       • VRAM: 24GB for training
    2. Stable Diffusion 1.5 - Budget-friendly
       • Resolution: 512x512
       • Quality: Good
       • Speed: Fast
       • VRAM: 12GB for training
    3. Stable Diffusion 3 - Latest
       • Resolution: Up to 2048x2048
       • Quality: Best
       • Speed: Slower
       • VRAM: 40GB+ for training
    4. Flux.1 - Alternative
       • Resolution: 1024x1024
       • Quality: Excellent
       • Speed: Fast
       • VRAM: 24GB for training

    Recommendation: Start with SD 1.5 for learning, move to SDXL for production.

    Hardware Requirements

    | Model Type | VRAM | Recommended GPU | Training Time | Dataset Size |
    |---|---|---|---|---|
    | SD 1.5 LoRA | 12GB | RTX 3060 12GB | 2-6 hours | 50-500 images |
    | SD 1.5 DreamBooth | 16GB | RTX 4060 Ti 16GB | 4-8 hours | 10-30 images |
    | SDXL LoRA | 24GB | RTX 4090 24GB | 4-12 hours | 100-1000 images |
    | SDXL DreamBooth | 24GB | RTX 4090 24GB | 6-16 hours | 20-50 images |
    | SDXL Full Fine-tune | 80GB | A100 80GB | 2-7 days | 10k+ images |

    Budget Options

  • Easiest: Google Colab Pro+ ($50/month for A100 access)
  • Cheap: RunPod RTX 3090 ($0.34/hr = $1-3 total per LoRA)
  • Best Value: Vast.ai RTX 3060 ($0.15/hr = $0.50-1.50 per LoRA)
  • Own Hardware: Used RTX 3090 24GB ($600-800)

    Training Methods Comparison

    | Method | Dataset Size | Training Time | Use Case | Quality |
    |---|---|---|---|---|
    | LoRA | 50-1000 images | 2-12 hours | Styles, concepts, objects | 85-95% |
    | DreamBooth | 10-50 images | 4-16 hours | Specific subjects, characters | 90-98% |
    | Textual Inversion | 5-20 images | 1-4 hours | Simple concepts, tokens | 70-85% |
    | Full Fine-tune | 10k+ images | Days-weeks | Complete custom model | 100% |

    Best for Most Users: LoRA (efficient, versatile, easy to share)

    Step 1: Environment Setup

    Install dependencies for image model training:

    Installation

    
    # Create environment
    conda create -n uncensored-sd python=3.10
    conda activate uncensored-sd
    
    # Install core dependencies
    pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
    
    # Install Diffusers and Transformers
    pip install diffusers transformers accelerate
    
    # Install performance optimizations
    pip install xformers triton
    
    # Install utilities
    pip install wandb tensorboard pillow opencv-python
    
    # Clone Kohya's training scripts (most popular SD trainer)
    git clone https://github.com/kohya-ss/sd-scripts.git
    cd sd-scripts
    pip install -r requirements.txt
    
    # Upgrade to latest
    pip install --upgrade diffusers transformers accelerate
    

    Verification

    
    import torch
    from diffusers import StableDiffusionXLPipeline
    
    print(f"PyTorch: {torch.__version__}")
    print(f"CUDA Available: {torch.cuda.is_available()}")
    print(f"Diffusers installed: OK")
    print(f"xformers available: {torch.cuda.is_available()}")
    

    Step 2: Download Base Model

    Choose and download your base model:

    Option 1: SDXL (Recommended)

    
    from diffusers import StableDiffusionXLPipeline
    import torch
    
    # Download SDXL base (uncensored version)
    model_id = "stabilityai/stable-diffusion-xl-base-1.0"
    
    pipe = StableDiffusionXLPipeline.from_pretrained(
        model_id,
        torch_dtype=torch.float16,
        variant="fp16",
        use_safetensors=True
    )
    
    # Save locally for training
    pipe.save_pretrained("./sdxl-base")
    print("SDXL base model downloaded!")
    

    Option 2: SD 1.5 (Budget-Friendly)

    
    from diffusers import StableDiffusionPipeline
    import torch
    
    model_id = "runwayml/stable-diffusion-v1-5"
    
    pipe = StableDiffusionPipeline.from_pretrained(
        model_id,
        torch_dtype=torch.float16,
        safety_checker=None  # Remove safety checker
    )
    
    pipe.save_pretrained("./sd15-base")
    print("SD 1.5 base model downloaded!")
    

    Uncensored Base Models

    | Model | Version | Censorship | Quality | Download |
    |---|---|---|---|---|
    | stabilityai/sdxl-base-1.0 | SDXL | Minimal | 9/10 | HuggingFace |
    | runwayml/sd-v1-5 | SD 1.5 | Very low | 7.5/10 | HuggingFace |
    | stablediffusionapi/deliberate-v2 | SD 1.5 | None | 8.5/10 | CivitAI |
    | prompthero/openjourney | SD 1.5 | None | 7/10 | HuggingFace |

    Step 3: Prepare Training Dataset

    Quality dataset = quality results.

    Dataset Structure

    Organize images in this format:

    
    training_data/
    ├── 10_nsfw_photography/
    │   ├── image001.jpg
    │   ├── image001.txt
    │   ├── image002.jpg
    │   ├── image002.txt
    │   └── ...
    ├── 15_adult_art/
    │   ├── art001.png
    │   ├── art001.txt
    │   └── ...
    └── 20_explicit_content/
        ├── explicit001.jpg
        ├── explicit001.txt
        └── ...
    

    Folder Naming Convention:

    • 10_ = number of repeats (how many times each image is seen per epoch; see the arithmetic below)
    • nsfw_photography = category name (for organization)
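
    Repeats also determine total training steps: steps per epoch ≈ sum over folders of (images × repeats) ÷ batch size. A quick back-of-the-envelope check (this helper is illustrative, not part of sd-scripts):

    def estimate_steps(folders, batch_size=1, epochs=10):
        """folders maps folder name -> (num_images, repeats)."""
        steps_per_epoch = sum(n * r for n, r in folders.values()) // batch_size
        return steps_per_epoch * epochs

    # Example: two of the folders above, batch size 1, 10 epochs
    total = estimate_steps({
        "10_nsfw_photography": (120, 10),   # 120 images x 10 repeats
        "15_adult_art": (60, 15),           # 60 images x 15 repeats
    })
    print(total)  # (1200 + 900) * 10 = 21,000 steps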

    Caption Files

    Each image needs a .txt file with the same name:

    image001.txt:

    
    nude photography, artistic nudity, sensual pose, studio lighting, 
    professional photography, adult content, beautiful woman, intimate, 
    explicit, NSFW, high quality, masterpiece
    

    Caption Tips:

    • Start with most important keywords
    • Include style descriptors (photography, art, realistic, etc.)
    • Add quality tags (masterpiece, high quality, detailed)
    • Be specific (don't just say "woman", say "beautiful woman")
    • Include NSFW/explicit tags if relevant

    Legal Data Sources

    Professional Content:

  • Licensed Photography: Purchase adult photography collections
  • Stock Sites: AdultStock, SuicideGirls (check licensing!)
  • 3D Renders: DAZ3D, BlenderKit adult assets
  • Artist Commissions: Commission artists for training data

    Your Own Content:

  • Original photography
  • Your own art/renders
  • User-generated content (with explicit permission)

    Important: Ensure you have rights to train on all images!

    Image Preprocessing Script

    
    import os
    from PIL import Image
    
    def prepare_training_images(input_dir, output_dir, target_size=1024):
        """
        Prepare images for SDXL training
        - Resize to target_size x target_size
        - Convert to RGB
        - Save as high-quality JPG
        """
        os.makedirs(output_dir, exist_ok=True)
        processed = 0
        
        for filename in os.listdir(input_dir):
            if not filename.lower().endswith(('.png', '.jpg', '.jpeg', '.webp', '.bmp')):
                continue
                
            input_path = os.path.join(input_dir, filename)
            output_filename = f"{os.path.splitext(filename)[0]}.jpg"
            output_path = os.path.join(output_dir, output_filename)
            
            try:
                # Open and convert to RGB
                img = Image.open(input_path)
                if img.mode != 'RGB':
                    img = img.convert('RGB')
                
                # Resize maintaining aspect ratio
                img.thumbnail((target_size, target_size), Image.Resampling.LANCZOS)
                
                # Create square canvas
                new_img = Image.new('RGB', (target_size, target_size), (255, 255, 255))
                
                # Paste centered
                paste_x = (target_size - img.size[0]) // 2
                paste_y = (target_size - img.size[1]) // 2
                new_img.paste(img, (paste_x, paste_y))
                
                # Save high quality
                new_img.save(output_path, quality=95, optimize=True)
                processed += 1
                print(f"✓ Processed: {filename}")
                
            except Exception as e:
                print(f"✗ Error processing {filename}: {e}")
        
        print(f"\nProcessed {processed} images")
        return processed
    
    # Usage
    prepare_training_images(
        input_dir="raw_images",
        output_dir="training_data/10_nsfw_photography",
        target_size=1024  # 1024 for SDXL, 512 for SD 1.5
    )
    

    Auto-Captioning (Optional)

    Use BLIP or other models to generate captions automatically:

    
    from transformers import BlipProcessor, BlipForConditionalGeneration
    from PIL import Image
    import os
    
    # Load BLIP model
    processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-large")
    model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-large")
    
    def auto_caption_images(image_dir):
        """Generate captions for images using BLIP"""
        for filename in os.listdir(image_dir):
            if not filename.lower().endswith(('.jpg', '.png', '.jpeg')):
                continue
            
            image_path = os.path.join(image_dir, filename)
            caption_path = os.path.join(image_dir, f"{os.path.splitext(filename)[0]}.txt")
            
            # Skip if caption already exists
            if os.path.exists(caption_path):
                continue
            
            # Generate caption
            image = Image.open(image_path)
            inputs = processor(image, return_tensors="pt")
            out = model.generate(**inputs, max_length=50)
            caption = processor.decode(out[0], skip_special_tokens=True)
            
            # Add your custom tags
            caption += ", high quality, detailed, masterpiece"
            
            # Save caption
            with open(caption_path, 'w') as f:
                f.write(caption)
            
            print(f"✓ Captioned: {filename}")
    
    # Run
    auto_caption_images("training_data/10_nsfw_photography")
    

    Step 4: LoRA Training Configuration

    LoRA is the most popular training method.

    Using Kohya's sd-scripts

    Create training script:

    
    # SDXL LoRA training uses the sdxl_ variant of the script (train_network.py is for SD 1.5)
    accelerate launch --num_cpu_threads_per_process 8 \
      sdxl_train_network.py \
      --pretrained_model_name_or_path="./sdxl-base" \
      --train_data_dir="./training_data" \
      --output_dir="./output/uncensored-sdxl-lora" \
      --output_name="uncensored-lora" \
      --save_model_as=safetensors \
      --prior_loss_weight=1.0 \
      --max_train_steps=10000 \
      --learning_rate=1e-4 \
      --optimizer_type="AdamW8bit" \
      --xformers \
      --mixed_precision="fp16" \
      --cache_latents \
      --gradient_checkpointing \
      --network_module=networks.lora \
      --network_dim=128 \
      --network_alpha=64 \
      --train_batch_size=1 \
      --resolution=1024 \
      --enable_bucket \
      --min_bucket_reso=256 \
      --max_bucket_reso=2048 \
      --bucket_reso_steps=64 \
      --save_every_n_epochs=1 \
      --logging_dir="./logs" \
      --log_prefix="uncensored-sdxl"
    

    Configuration File (train_config.toml)

    
    [model_arguments]
    pretrained_model_name_or_path = "./sdxl-base"
    v2 = false
    v_pred = false
    
    [dataset_arguments]
    resolution = 1024
    batch_size = 1
    enable_bucket = true
    min_bucket_reso = 256
    max_bucket_reso = 2048
    bucket_reso_steps = 64
    
    [training_arguments]
    output_dir = "./output/uncensored-sdxl-lora"
    output_name = "uncensored-lora"
    save_precision = "fp16"
    save_every_n_epochs = 1
    max_train_epochs = 10
    max_train_steps = 10000
    
    train_batch_size = 1
    gradient_accumulation_steps = 4
    learning_rate = 1e-4
    lr_scheduler = "cosine"
    lr_warmup_steps = 100
    
    optimizer_type = "AdamW8bit"
    mixed_precision = "fp16"
    xformers = true
    gradient_checkpointing = true
    
    # LoRA settings
    network_module = "networks.lora"
    network_dim = 128  # Higher = more capacity (64-256 typical)
    network_alpha = 64  # Usually half of network_dim
    
    # Note: sd-scripts does not apply any content filter during training,
    # so no "disable safety" keys exist or are needed in this config.
    
    # Logging
    logging_dir = "./logs"
    log_with = "tensorboard"
    

    Understanding LoRA Parameters

    • network_dim: LoRA rank (32-256)
      • 32-64: Simple styles/concepts
      • 64-128: Most use cases (recommended)
      • 128-256: Complex subjects/styles
    • network_alpha: Scaling factor
      • Usually set to network_dim / 2
      • Higher = stronger effect
    • learning_rate: How fast the model learns
      • 1e-4 (0.0001): Most common
      • 5e-5 (0.00005): Safer, slower
      • 1e-3 (0.001): Aggressive, risky
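
    To get a feel for what network_dim costs, note that each adapted weight matrix W (d_out × d_in) gains two low-rank matrices A (rank × d_in) and B (d_out × rank), and the learned update is scaled by network_alpha / network_dim. A rough sketch (the layer shape is illustrative):

    def lora_params_per_layer(d_in, d_out, rank):
        # A: rank x d_in, plus B: d_out x rank
        return rank * d_in + d_out * rank

    rank, alpha = 128, 64
    print(lora_params_per_layer(1280, 1280, rank))              # 327,680 extra params for one 1280x1280 projection
    print(f"effective scale = alpha / rank = {alpha / rank}")   # 0.5 with the settings above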

    Step 5: Execute Training

    Start the training process:

    Launch Training

    
    # Activate environment
    conda activate uncensored-sd
    
    # Run training
    accelerate launch sdxl_train_network.py \
      --config_file train_config.toml
    
    # Or use the command line version from Step 4
    

    Monitor Progress

    
    # In separate terminal
    tensorboard --logdir ./logs --port 6006
    
    # Open browser: http://localhost:6006
    

    What to Watch

    1. Loss: Should decrease to 0.05-0.15
    2. Sample Images: Generated every N steps (see the sampling flags below)
    3. Learning Rate: Should follow scheduler
    4. ETA: Estimated time remaining
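
    For the sample images, recent versions of sd-scripts can render previews during training if you add its sampling flags to the launch command (the prompt file path is a placeholder):

      --sample_every_n_steps=500 \
      --sample_prompts="./sample_prompts.txt" \
      --sample_sampler="euler_a"

    sample_prompts.txt takes one prompt per line, so you can watch the same prompts evolve as training progresses.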

    Training Time Estimates

    • SD 1.5 LoRA: 2-6 hours (100-500 images)
    • SDXL LoRA: 4-12 hours (100-1000 images)
    • DreamBooth: 2x LoRA time
    • Full Fine-tune: Days to weeks

    Step 6: Alternative - DreamBooth Training

    DreamBooth is better for specific subjects (people, characters, objects):

    DreamBooth Script

    
    accelerate launch train_dreambooth_lora_sdxl.py \
      --pretrained_model_name_or_path="./sdxl-base" \
      --instance_data_dir="./training_data/subject" \
      --output_dir="./output/dreambooth-lora" \
      --instance_prompt="a photo of sks person" \
      --resolution=1024 \
      --train_batch_size=1 \
      --gradient_accumulation_steps=4 \
      --learning_rate=1e-4 \
      --lr_scheduler="constant" \
      --lr_warmup_steps=0 \
      --max_train_steps=2000 \
      --rank=128 \
      --mixed_precision="fp16" \
      --use_8bit_adam
    

    DreamBooth vs LoRA

    | Aspect | DreamBooth | LoRA |
    |---|---|---|
    | Dataset Size | 10-50 images | 50-1000 images |
    | Best For | Specific subjects | Styles, concepts |
    | Training Time | Longer | Shorter |
    | Result Quality | Higher fidelity | More versatile |
    | Overfitting Risk | Higher | Lower |
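
    The higher overfitting risk is usually countered with prior preservation: the script also trains on generic "class" images alongside your subject so the base concept is not overwritten. In the diffusers DreamBooth scripts this is switched on with flags along these lines (paths and counts are placeholders):

      --with_prior_preservation \
      --prior_loss_weight=1.0 \
      --class_prompt="a photo of a person" \
      --class_data_dir="./regularization_data/person" \
      --num_class_images=200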

    Step 7: Testing Your Model

    Generate images with your trained model:

    Load and Test

    
    from diffusers import StableDiffusionXLPipeline, DPMSolverMultistepScheduler
    import torch
    
    # Load base model
    pipe = StableDiffusionXLPipeline.from_pretrained(
        "./sdxl-base",
        torch_dtype=torch.float16,
        variant="fp16",
        use_safetensors=True
    )
    
    # Load your LoRA weights
    pipe.load_lora_weights("./output/uncensored-sdxl-lora/uncensored-lora.safetensors")
    
    # Use efficient scheduler
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
    
    # Move to GPU
    pipe = pipe.to("cuda")
    
    # SDXL pipelines ship without a safety checker; these lines only matter if you load an SD 1.5 pipeline
    pipe.safety_checker = None
    pipe.requires_safety_checker = False
    
    print("Model loaded successfully!")
    
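    If the machine you test on has less VRAM than the training box, diffusers' built-in offloading and slicing options usually make SDXL inference fit; a minimal sketch:

    # Use instead of pipe.to("cuda") on smaller GPUs (requires accelerate)
    pipe.enable_model_cpu_offload()

    # Optional extras if memory is still tight
    pipe.enable_xformers_memory_efficient_attention()   # needs xformers installed
    pipe.enable_vae_slicing()                            # decode batches one image at a time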

    Generate Images

    
    def generate_uncensored(
        prompt: str,
        negative_prompt: str = "",
        num_images: int = 1,
        steps: int = 30,
        guidance_scale: float = 7.5,
        width: int = 1024,
        height: int = 1024,
        seed: int = -1
    ):
        """Generate uncensored images"""
        
        # Set seed for reproducibility
        generator = None
        if seed != -1:
            generator = torch.Generator("cuda").manual_seed(seed)
        
        # Generate
        images = pipe(
            prompt=prompt,
            negative_prompt=negative_prompt,
            num_inference_steps=steps,
            guidance_scale=guidance_scale,
            width=width,
            height=height,
            num_images_per_prompt=num_images,
            generator=generator
        ).images
        
        return images
    
    # Test with adult content
    prompt = """
    photorealistic, nude photography, artistic nudity, sensual pose,
    studio lighting, professional photography, beautiful woman,
    intimate, adult content, explicit, NSFW, masterpiece, high quality
    """
    
    negative_prompt = """
    cartoon, anime, 3d render, low quality, blurry, deformed,
    ugly, bad anatomy, worst quality
    """
    
    images = generate_uncensored(
        prompt=prompt,
        negative_prompt=negative_prompt,
        num_images=4,
        steps=30,
        guidance_scale=7.5,
        seed=42
    )
    
    # Save images
    for i, image in enumerate(images):
        image.save(f"output_{i}.png")
        print(f"Saved output_{i}.png")
    
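    If you want the generation settings preserved with each file, you can embed them as PNG text chunks with Pillow (a small sketch; extend the keys as needed):

    from PIL.PngImagePlugin import PngInfo

    meta = PngInfo()
    meta.add_text("prompt", prompt)
    meta.add_text("negative_prompt", negative_prompt)
    meta.add_text("seed", "42")

    for i, image in enumerate(images):
        image.save(f"output_{i}.png", pnginfo=meta)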

    Batch Generation Script

    
    import json
    import os
    
    def batch_generate_from_file(prompts_file: str, output_dir: str):
        """Generate images from a JSON file of prompts"""
        
        with open(prompts_file, 'r') as f:
            prompts = json.load(f)
        
        os.makedirs(output_dir, exist_ok=True)
        
        for i, prompt_data in enumerate(prompts):
            print(f"\nGenerating {i+1}/{len(prompts)}...")
            
            images = generate_uncensored(
                prompt=prompt_data['prompt'],
                negative_prompt=prompt_data.get('negative_prompt', ''),
                num_images=prompt_data.get('num_images', 1),
                seed=prompt_data.get('seed', -1)
            )
            
            for j, image in enumerate(images):
                filename = f"{i:04d}_{j}.png"
                image.save(os.path.join(output_dir, filename))
            
            print(f"✓ Saved {len(images)} images")
    
    # Usage
    # Create prompts.json with your test prompts
    batch_generate_from_file("prompts.json", "batch_output")
    
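    The prompts file is just a JSON list whose keys mirror the arguments batch_generate_from_file reads; a minimal prompts.json might look like:

    [
      {
        "prompt": "photorealistic portrait, studio lighting, masterpiece, high quality",
        "negative_prompt": "low quality, blurry, deformed, bad anatomy",
        "num_images": 2,
        "seed": 42
      },
      {
        "prompt": "artistic black and white photography, soft window light, film grain, high quality"
      }
    ]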

    Advanced Techniques

    Removing Safety Filters from Pre-trained Models

    Some models have baked-in safety classifiers:

    
    from diffusers import StableDiffusionPipeline
    import torch
    
    def remove_safety_checker(model_path, output_path):
        """Remove safety checker from any SD model"""
        
        pipe = StableDiffusionPipeline.from_pretrained(
            model_path,
            torch_dtype=torch.float16,
            safety_checker=None,
            requires_safety_checker=False
        )
        
        # Explicitly set to None
        pipe.safety_checker = None
        pipe.feature_extractor = None
        
        # Save uncensored version
        pipe.save_pretrained(output_path)
        print(f"✓ Uncensored model saved to {output_path}")
    
    # Remove safety from any model
    remove_safety_checker(
        "runwayml/stable-diffusion-v1-5",
        "./sd15-uncensored"
    )
    

    Textual Inversion (Embeddings)

    Create custom tokens for concepts:

    
    accelerate launch textual_inversion.py \
      --pretrained_model_name_or_path="./sdxl-base" \
      --train_data_dir="./concept_images" \
      --learnable_property="object" \
      --placeholder_token="<adult-concept>" \
      --initializer_token="photography" \
      --resolution=1024 \
      --train_batch_size=1 \
      --gradient_accumulation_steps=4 \
      --max_train_steps=3000 \
      --learning_rate=5.0e-04 \
      --scale_lr \
      --lr_scheduler="constant" \
      --output_dir="./textual_inversion_output"
    
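    Once trained, the embedding is loaded back into a pipeline and triggered by its placeholder token. A sketch for an SD 1.5 pipeline (the exact output filename depends on the script version, and SDXL needs the SDXL variant of the script with embeddings for both text encoders):

    from diffusers import StableDiffusionPipeline
    import torch

    pipe = StableDiffusionPipeline.from_pretrained(
        "./sd15-base", torch_dtype=torch.float16, safety_checker=None
    ).to("cuda")

    # Bind the learned embedding to its placeholder token
    pipe.load_textual_inversion(
        "./textual_inversion_output/learned_embeds.safetensors",
        token="<adult-concept>",
    )

    image = pipe("<adult-concept>, studio lighting, high quality").images[0]
    image.save("ti_test.png")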

    Multi-Concept Training

    Train multiple concepts in one LoRA:

    
    {
      "concepts": [
        {
          "instance_prompt": "a photo of sks1 person",
          "instance_data_dir": "./training_data/person1",
          "class_prompt": "a photo of a person",
          "class_data_dir": "./regularization_data/person"
        },
        {
          "instance_prompt": "sks2 style photography",
          "instance_data_dir": "./training_data/style1",
          "class_prompt": "photography",
          "class_data_dir": "./regularization_data/photography"
        }
      ]
    }
    

    Common Issues and Solutions

    Problem: Low Quality Output

    Solutions:

    • Increase training steps (10k → 20k)
    • Use higher quality training images
    • Increase network_dim (128 → 256)
    • Adjust learning rate (try 5e-5)
    • Add more diverse training data

    Problem: Overfitting

    Signs: outputs look like near-copies of the training images and stop responding to prompt changes

    Solutions:

  • Reduce training steps
  • Lower learning rate
  • Add more training images
  • Use regularization images

    Problem: Model Ignores LoRA

    Solutions:

    
    # Increase LoRA strength by fusing it into the base weights with a higher scale
    pipe.load_lora_weights("lora.safetensors")
    pipe.fuse_lora(lora_scale=1.5)

    # Or raise the emphasis in the prompt (attention syntax in UIs such as AUTOMATIC1111/ComfyUI):
    prompt = "(your concept:1.5), other tags"
    

    Problem: Out of Memory

    Solutions:

    
    # Enable gradient checkpointing
    --gradient_checkpointing
    
    # Use 8-bit optimizer
    --optimizer_type="AdamW8bit"
    
    # Reduce batch size
    --train_batch_size=1
    
    # Use smaller resolution
    --resolution=768
    

    Next Steps

    Congratulations! You can now train uncensored image generation models.

    Continue to Part 3

    Learn how to deploy and monetize your models:

    ➡️ Part 3: Deployment, Monetization & Scaling

    Or Review Part 1

    Train uncensored LLMs for text generation:

    ⬅️ Part 1: Training Uncensored LLMs


    Quick Reference

    Training Commands Cheatsheet

    
    # LoRA training (SDXL)
    accelerate launch sdxl_train_network.py --config_file config.toml
    
    # DreamBooth training
    accelerate launch train_dreambooth_lora_sdxl.py --config_file db_config.toml
    
    # Monitor training
    tensorboard --logdir ./logs
    
    # Test model
    python generate_test.py --model_path ./output/lora.safetensors
    

    Recommended Starter Setup

    • Model: SD 1.5 or SDXL base
    • Method: LoRA training
    • Dataset: 50-200 images
    • GPU: RunPod RTX 3090 ($0.34/hr)
    • Training Time: 2-6 hours
    • Total Cost: $1-3

    Related Articles:

  • Part 1: Training Uncensored LLMs
  • Part 3: Deployment & Monetization
  • How to Detect AI-Generated Images

    Tools:

  • AI Content Detector
  • Fake Data Generator