
Fine-Tuning Your Own Flux Dev LoRA with Flux AI

Overview: Fine-Tuning Flux AI with LoRA

Want to create custom image models? You can, using Flux AI's LoRA fine-tuning. Flux is notably strong at precise text rendering, complex compositions, and realistic anatomy. Here's how to fine-tune it with your own images.

Steps to Fine-Tune Your Flux Dev LoRA

Step 1: Prepare Your Training Images

Gather your training images (5-6 are enough for a simple subject; complex subjects and styles need more).

  • Guidelines:
    • Images should focus on the subject.
    • JPEG or PNG is fine. Dimensions and filenames don’t matter.
    • Do not use images of others without their permission.
  • Zip your images:
    zip -r data.zip data
    
  • Upload your zip file where it can be accessed publicly, like S3 or GitHub Pages.
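If you'd rather build the archive from Python than the shell, here is a minimal sketch using the standard `zipfile` module (`zip_images` and its arguments are our own names, not part of any trainer API):

```python
import os
import zipfile

def zip_images(folder, archive_path):
    """Bundle every JPEG/PNG in `folder` into a flat zip for upload."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in sorted(os.listdir(folder)):
            # Only image files go into the archive; everything else is skipped
            if name.lower().endswith((".jpg", ".jpeg", ".png")):
                zf.write(os.path.join(folder, name), arcname=name)
    return archive_path
```

Filenames inside the zip don't matter to the trainer, so a flat layout keeps things simple.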

Step 2: Set Up Your Replicate API Token

Grab your API token from replicate.com/account and set it in your environment:

export REPLICATE_API_TOKEN=your_token

Step 3: Create a Model on Replicate

Visit replicate.com/create to set up your model. You can make it public or private.

Step 4: Start Training

Use Python to kick off the training process. Install the Replicate Python package:

pip install replicate

Then, create your training job:

import replicate

training = replicate.trainings.create(
    # Append the trainer's current version ID, shown on its Replicate page
    version="ostris/flux-dev-lora-trainer:<version-id>",
    input={
        "input_images": "https://your-upload-url/data.zip",
    },
    destination="your-username/your-model",
)
print(training)

Fine-Tuning Options

  • Faces: Add this line to focus on faces:
    "use_face_detection_instead": True,
    
  • Style: Adjust the learning rate and caption prefix for style training:
    "lora_lr": 2e-4,
    "caption_prefix": 'In the style of XYZ,',
    

Monitor Your Training

Check your training progress on replicate.com/trainings or programmatically:

training.reload()
print(training.status)
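If you'd rather block until training finishes instead of reloading by hand, a small polling helper can wrap the two calls above. A sketch (`wait_for_training` and its parameters are our own names, not part of the Replicate client; pass it something like `lambda: (training.reload(), training.status)[1]`):

```python
import time

TERMINAL_STATES = {"succeeded", "failed", "canceled"}

def wait_for_training(get_status, poll_interval=30, max_polls=None):
    """Poll `get_status` (a zero-argument callable returning a status
    string) until it reports a terminal state, then return that state."""
    polls = 0
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        polls += 1
        if max_polls is not None and polls >= max_polls:
            return status  # give up, returning the last seen status
        time.sleep(poll_interval)
```

A 30-second interval is a reasonable default; training runs take minutes to hours, so there is no benefit to polling faster.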

Running Your Trained Model

After training, you can run the model via the Replicate website or API:

output = replicate.run(
    "your-username/your-model:version",
    input={"prompt": "a photo of XYZ riding a rainbow unicorn"},
)
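Depending on your client version, `output` is a list of image URLs (older clients) or file-like objects. For the URL case, a hypothetical download helper (`save_outputs` is our own name, not a Replicate function):

```python
import urllib.request

def save_outputs(output, prefix="output"):
    """Download each image URL in `output` and save it locally,
    returning the list of saved file paths."""
    paths = []
    for i, url in enumerate(output):
        path = f"{prefix}-{i}.png"
        urllib.request.urlretrieve(url, path)
        paths.append(path)
    return paths
```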

How Fine-Tuning Works

Your images go through preprocessing:

  • SwinIR: Upscales images.
  • BLIP: Creates text captions.
  • CLIPSeg: Removes unimportant regions.

You can read more in the SDXL model README.

Advanced Usage: Diffusers Integration

Load the trained weights into Diffusers:

import torch
from diffusers import DiffusionPipeline

# Load the base pipeline, then swap in your fine-tuned UNet weights
pipe = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0")
pipe.unet.load_state_dict(torch.load("path-to-unet-weights.pth"))

# Now you can generate images
pipe(prompt="A photo of <s0>").images[0].save("output.png")

FAQ

Can I use LoRA for multiple concepts?

Yes, LoRA can handle multiple concepts, making it versatile.

Is LoRA better at styles or faces?

LoRA excels at styles but may struggle with faces.

How many images do I need?

Around 10 images is a good target for robust results, though simple subjects can work with as few as 5-6 (see Step 1).

Where can I upload my trained LoRA?

You can upload it to a Hugging Face repository.