FluxGym¶
Discussion¶
[ Robyn ]
Hi Emily. I have a question about FluxGym. I read an article on Medium by John Shi that recommends using a fine-tuned model instead of Flux-dev, in this case one called Flux-Dev2Pro. The huggingface.co link took me to the download page. There are two .json files and three .safetensors files. I've downloaded the lot into a separate folder under /fluxgym/models called /Flux-Dev2Pro. How does FluxGym know how to select this, with three different safetensors?
Basics¶
[ Emily ]
Great question. FluxGym won't "auto-detect" a random folder with multiple checkpoints. It populates the Base model dropdown from its models.yaml, and each entry points to one UNet file (plus the usual CLIPs/VAEs). So:
Those 3 .safetensors: what they are
Most Dev2Pro repos ship multiple variants of the same model:
- fp8 / e4m3fn: smallest VRAM, fastest, slightly softer detail
- bf16 / fp16: more VRAM, sharper/cleaner guidance
- sometimes a "full" checkpoint (bundles UNet+CLIP+VAE). FluxGym does not use a full checkpoint; it wants the UNet only. ([comfyui-wiki.com][1])
Pick exactly one UNet variant for FluxGym:
- With your 4060 Ti 16 GB, use bf16/fp16 if available; fall back to fp8 on tighter VRAM.
- Don't drop a "full" checkpoint into FluxGym; it expects separate UNet/CLIP/VAE files. ([comfyui-wiki.com][1])
Where to put files
FluxGym expects this structure (Windows/Linux paths analogous):
FluxGym/
  models/
    unet/   <- put your chosen Dev2Pro UNet here
    clip/   <- the two Flux CLIPs (text encoders)
    vae/    <- the Flux VAE
If you downloaded manually, placing them here works; many guides confirm UNet → models/unet, CLIPs → models/clip, VAE → models/vae. ([GitHub][2])
Tell FluxGym about Dev2Pro (edit models.yaml)
FluxGym's model list comes from models.yaml. Add an entry pointing to the Dev2Pro file you chose (Hugging Face repo or local filename). Example skeleton:
- name: flux-dev2pro-bf16
  base: flux
  repo: ashen0209/Flux-Dev2Pro            # HF repo (works even if you placed it locally)
  file: flux1-dev2pro-bf16.safetensors    # EXACT filename you chose in models/unet
  clip: clip_l.safetensors                # make sure these exist in models/clip
  t5xxl: t5xxl_fp8_e4m3fn.safetensors
  vae: flux_vae.safetensors
Notes:
- If you prefer zero downloads, keep file as the exact local filename you placed in models/unet. FluxGym will find it; otherwise it tries to fetch from HF. (There are GitHub issues about pointing to local paths / avoiding re-download; editing models.yaml is the supported route.) ([GitHub][3])
- If a repo isn't on Hugging Face, FluxGym may refuse it without code tweaks (there's an issue about non-HF sources). Dev2Pro is on HF, so you're fine. ([GitHub][4])
Which Dev2Pro to use?
- ashen0209/Flux-Dev2Pro is the original; there are mirrors (e.g., dev2pro-full bundles) and fp8-only forks. You want the UNet variant (bf16/fp16 if VRAM allows). ([huggingface.co][5])
Sanity checks in the UI
- Restart FluxGym after editing models.yaml.
- In the Base model dropdown, pick your new flux-dev2pro-… entry.
- Ensure the two CLIPs and VAE are present; many tutorials list them explicitly for Flux. ([comfyui-wiki.com][1])
TL;DR for your setup
- Keep one of the three Dev2Pro .safetensors (choose bf16/fp16 on 16 GB).
- Put it in FluxGym/models/unet/.
- Make sure you have the Flux CLIPs and VAE in their folders.
- Add a single entry to models.yaml referencing that one file.
- Relaunch FluxGym and pick it from the dropdown.
If you paste the exact filenames you downloaded, I'll hand you a ready-to-paste models.yaml block for Dev2Pro (bf16/fp16/fp8).
[1] Flux.1 ComfyUI Guide
[2] Can't download model
[3] Setting to a model location
[4] Training Flux Loras...
[5] ashen0209/Flux-Dev2Pro
Quick checklist¶
[ Robyn ]
How does one know which is which? The names are:
- diffusion_pytorch_model_0001-of-0003.safetensor (9.98 gb)
- diffusion_pytorch_model_0002-of-0003.safetensor (9.95 gb)
- diffusion_pytorch_model_0003-of-0003.safetensor (3.87 gb)
That looks almost like a file that's been broken into 3 fragments instead of each being a discrete independent file. Am I wrong in thinking this? And how do I determine which is which?
Who is Who?¶
[ Emily ]
You're not wrong: that naming screams sharded weights, not three different models.
What you've got
- diffusion_pytorch_model_0001-of-0003.safetensors (~9.98 GB)
- diffusion_pytorch_model_0002-of-0003.safetensors (you probably meant "0002"; re-check, because having two "0003" files suggests a bad/partial download)
- diffusion_pytorch_model_0003-of-0003.safetensors (~3.87 GB)
- Plus two JSONs in the folder. One of them should be an index file (something like diffusion_pytorch_model.safetensors.index.json). That index maps tensor names → which shard file they live in.
Those three .safetensors files are one model split into 3 pieces. You must keep all shards + the index JSON together. You don't pick "one"; the loader uses the index and opens all three.
Quick sanity checks
- Make sure you actually have 0001, 0002, 0003 (not two "0003"s). If one is missing/misnamed, redownload.
- Confirm the index JSON exists and references those exact filenames. If you open it, you'll see a big weight_map whose entries pair each tensor name with the shard file that holds it. (Don't edit it; just verify names match.)
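The "verify names match" check can be scripted. This is a minimal sketch, assuming the index is a JSON file with a top-level weight_map object whose values are shard filenames stored in the same folder as the index; the function name missing_shards and the demo filenames are made up for illustration.

```python
import json
import tempfile
from pathlib import Path


def missing_shards(index_path):
    """Return shard filenames named in the index that are absent on disk."""
    index_path = Path(index_path)
    with index_path.open(encoding="utf-8") as fh:
        index = json.load(fh)
    # weight_map: tensor name -> shard filename (assumed index layout)
    shard_names = sorted(set(index["weight_map"].values()))
    folder = index_path.parent
    return [name for name in shard_names if not (folder / name).is_file()]


if __name__ == "__main__":
    # Demo with a fabricated two-shard index; tensor names are placeholders.
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        demo_index = {
            "weight_map": {
                "block.0.weight": "model-00001-of-00002.safetensors",
                "block.1.weight": "model-00002-of-00002.safetensors",
            }
        }
        (tmp / "model.safetensors.index.json").write_text(json.dumps(demo_index))
        (tmp / "model-00001-of-00002.safetensors").write_bytes(b"")
        # Only the first shard exists, so the second is reported as missing.
        print(missing_shards(tmp / "model.safetensors.index.json"))
```

An empty list means every shard the index names is present; it does not prove the shards are complete or uncorrupted.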
Using them in FluxGym
You have two workable paths:
A) Let FluxGym pull from the Hugging Face repo (easiest):
Edit models.yaml so the Dev2Pro entry points to the repo (not local files). FluxGym will download the shards automatically (using the index). This avoids any "single-file only" assumptions.
B) Use your local shards:
Keep all 3 shards + the index JSON in the same directory. In models.yaml, set the model entry to the same basename as the index expects. Some builds of FluxGym accept the index filename directly; others just need the directory where the index + shards live. If your build only accepts a single file: value and refuses the index name, the practical workaround is to:
- Place the entire Dev2Pro folder as a local "repo mirror" (i.e., keep the exact filenames),
- Point the repo: field to the HF repo name (so it knows layout), and have the files already present locally (so it won't redownload).
If you hit a "can't find weights" error locally, it's almost always because:
- The shard filenames don't match what the index JSON says, or
- One shard is missing/corrupt, or
- The model entry is pointing to a single non-indexed file instead of the indexed shard set.
How to tell "which is which" (fp8 vs bf16 etc.)
Sharded files don't encode precision in the name; they're just parts of one checkpoint. If the repo offers multiple variants (e.g., fp8 and bf16), each variant usually has its own set of 3 shards and its own index in a separate folder or with a different basename. In your case, it sounds like you downloaded one variant that's simply split into 3 chunks.
If you want a single file instead (optional)
You can merge shards into a consolidated .safetensors with the usual Transformers/diffusers utilities, but:
- The merged file can be huge,
- It offers no speed/quality benefit,
- And you don't need it if FluxGym is happy with sharded loading.
If you paste the exact names of the two JSONs in that folder (and confirm whether you truly have 0001/0002/0003), I'll give you a ready-to-paste models.yaml snippet for FluxGym that points to this Dev2Pro and loads all three shards correctly.
Prompt and Qwen Behaviour¶
[ Robyn ]
I think I might just edit the models.yaml file and have fluxgym pull from huggingface. The yaml wants a licence, licence_name and licence_link. What do I put there?
[ Emily ]
Perfect, thanks for sharing the file. I checked your models.yaml and made one safe tweak:
- Your Dev2Pro entry used the YAML key ashen0209/Flux-Dev2Pro:. Because the key becomes the dropdown name, having a slash there can be awkward. I created a patched copy with a cleaner key:
Change made (only the key):
Everything else (repo, base, license fields, file basename) looks correct for pulling the sharded checkpoint via the index on Hugging Face.
You can download the patched file here and drop it in place of your original: Download models_patched.yaml
Quick checklist after you swap it in
- Replace your original models.yaml with the patched one (or copy the edited block into your existing file).
- Restart FluxGym.
- In the Base model dropdown, choose flux-dev2pro.
- First run will fetch the shards via diffusion_pytorch_model.safetensors.index.json automatically.
If you'd like me to also add a friendly display name (if your FluxGym build supports it) or re-order the list so Dev2Pro appears at the top, say the word and I'll patch it in.
Issues¶
[ Robyn ]
Something isn't right in the yaml. Here's the error FluxGym displays in the terminal. (Note: I sent a text file containing all of FluxGym's output.)
[ Emily ]
Ahh, this one's a filename + sharding mismatch.
FluxGym is trying to pull a file called diffusion_pytorch_model.safetensors from ashen0209/Flux-Dev2Pro, but that repo doesn't have a single-file checkpoint. It ships the UNet in three shards named with hyphens, e.g. diffusion_pytorch_model-00001-of-00003.safetensors, ...-00002-of-00003..., ...-00003-of-00003.... So hf_hub_download 404s because your models.yaml asks for a filename that doesn't exist. ([huggingface.co][1])
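Before putting a filename into models.yaml, it can help to sort a repo's file list into single-file checkpoints and shard sets. The sketch below is illustrative only: the regular expression assumes the hyphenated NNNNN-of-NNNNN naming quoted above, and describe_weights is a hypothetical helper, not part of FluxGym.

```python
import re

# Assumed shard naming: <base>-00001-of-00003.safetensors
SHARD_RE = re.compile(r"^(?P<base>.+)-(?P<idx>\d{5})-of-(?P<total>\d{5})\.safetensors$")


def describe_weights(filenames):
    """Split .safetensors names into single files and shard sets.

    Returns (singles, shard_sets) where shard_sets maps each base name
    to True if every shard 1..total is present, else False.
    """
    singles = []
    seen = {}
    for name in filenames:
        if not name.endswith(".safetensors"):
            continue  # ignore JSON configs and other files
        match = SHARD_RE.match(name)
        if match:
            key = (match["base"], int(match["total"]))
            seen.setdefault(key, set()).add(int(match["idx"]))
        else:
            singles.append(name)
    shard_sets = {
        base: indices == set(range(1, total + 1))
        for (base, total), indices in seen.items()
    }
    return singles, shard_sets


if __name__ == "__main__":
    repo_files = [
        "config.json",
        "diffusion_pytorch_model-00001-of-00003.safetensors",
        "diffusion_pytorch_model-00002-of-00003.safetensors",
        "diffusion_pytorch_model-00003-of-00003.safetensors",
    ]
    print(describe_weights(repo_files))
```

If singles comes back empty, the repo has no one-file checkpoint to name in file:. The file list itself could come from the Hugging Face web page or, assuming the huggingface_hub package is installed, from its list_repo_files function.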
You've got two clean ways forward:
Option A: Point FluxGym at a single-file Dev2Pro checkpoint
Use a repo that actually exposes one file. For example, multimodalart/FLUX.1-dev2pro-full includes a single-file UNet named flux1-dev.safetensors (license matches Dev: flux-1-dev-non-commercial-license). Update your YAML like this:
flux-dev2pro:
  repo: multimodalart/FLUX.1-dev2pro-full
  base: black-forest-labs/FLUX.1-dev
  license: other
  license_name: flux-1-dev-non-commercial-license
  license_link: https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md
  file: flux1-dev.safetensors
That filename exists in that repo and downloads as a single blob, which FluxGym knows how to place in models/unet/. ([huggingface.co][2])
Note: There are community "single-file" mirrors like bdsqlsz/flux1-dev2pro-single with flux1-dev2pro.safetensors. If you use those, be sure you're comfortable with the licensing; the official Dev family is non-commercial. ([huggingface.co][3])
Option B: Stay with ashen0209/Flux-Dev2Pro (sharded), but avoid auto-download
FluxGym's downloader only grabs one filename you specify; it won't fetch all shards + an index. If you want to use the ashen checkpoint directly:
- Manually download all three shard files from that repo.
- Put them where FluxGym expects UNets (usually .../models/unet/transformer/ depending on FluxGym's loader).
- Bypass the downloader by pointing FluxGym at the local file path (or leave the YAML alone and don't click "download").
If you change file: to diffusion_pytorch_model-00001-of-00003.safetensors, FluxGym will still only download that one shard, which won't load by itself. (That's why I recommend Option A unless you're comfortable laying out a full Diffusers folder manually.) ([huggingface.co][1])
If you want, I can rewrite your /mnt/data/models.yaml to use Option A (the multimodalart single-file entry) so it "just works" with the current FluxGym downloader.
[1] Files
[2] multimodalart/FLUX.1-dev2pro-full
[3] bdsqlsz/flux1-dev2pro-single
An Excellent Model¶
[ Robyn ]
Just had a look at rockerBOO on huggingface: 1 file, flux-dev2pro-bfl.safetensors. Will that do?
[ Emily ]
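Yes, that's exactly the kind you want. A single-file flux-dev2pro-bfl.safetensors fits FluxGym's "one UNet file" expectation. The repo name (rockerBOO/flux-dev2pro-bf16) indicates bfloat16 (bf16) weights, which is the variant suggested for your 16 GB card; the "bfl" in the filename most likely refers to the Black Forest Labs single-file layout rather than to the precision.
Add this block to your models.yaml: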
flux-dev2pro-bf16:
  repo: rockerBOO/flux-dev2pro-bf16
  base: black-forest-labs/FLUX.1-dev
  license: other
  license_name: flux-1-dev-non-commercial-license
  license_link: https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md
  file: flux-dev2pro-bfl.safetensors
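A quick pre-flight check on an entry like the one above can catch typos before FluxGym is restarted. This is a sketch under assumptions: the six field names are taken from the blocks in this conversation, whether FluxGym truly requires every one of them is not confirmed here, and entry_problems is a hypothetical helper. The entry is written as a plain Python dict so the example needs no YAML library.

```python
# Field names as they appear in the models.yaml blocks in this conversation.
REQUIRED_KEYS = ("repo", "base", "license", "license_name", "license_link", "file")


def entry_problems(name, entry):
    """Return a list of human-readable problems for one models.yaml entry."""
    problems = []
    if "/" in name:
        # A slash in the key was flagged earlier as awkward for the dropdown name.
        problems.append("entry name contains '/'")
    for key in REQUIRED_KEYS:
        value = entry.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {key}")
    file_name = entry.get("file")
    if isinstance(file_name, str) and file_name.strip() and not file_name.endswith(".safetensors"):
        problems.append("file does not end with .safetensors")
    return problems


if __name__ == "__main__":
    entry = {
        "repo": "rockerBOO/flux-dev2pro-bf16",
        "base": "black-forest-labs/FLUX.1-dev",
        "license": "other",
        "license_name": "flux-1-dev-non-commercial-license",
        "license_link": "https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md",
        "file": "flux-dev2pro-bfl.safetensors",
    }
    print(entry_problems("flux-dev2pro-bf16", entry))
```

An empty list only means the entry is well formed; it does not confirm that the file exists in the repo.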
Quick checklist:
- Save models.yaml and restart FluxGym.
- In the Base model dropdown choose flux-dev2pro-bf16.
- Make sure you already have the two Flux CLIPs and the Flux VAE in their usual models/clip and models/vae folders (same ones your flux-dev entry uses).
If FluxGym throws a 404 or shape mismatch, tell me the exact error text and I'll swap in another verified single-file Dev2Pro entry for you.
Final Solution¶
[ Robyn ]
Well, that's weird. I had downloaded the file, placed it in the right folder, and now it's auto-downloading it again. Oh well... at least it'll be right. Actually downloading faster than Chrome did.
[ Emily ]
Ha, yep, that happens. Two common reasons FluxGym "helpfully" re-downloads a file you already placed:
- repo: is set in models.yaml. If a repo is present, FluxGym usually prefers pulling from HF. It doesn't always check your local copy first.
- Filename mismatch / hash mismatch. If the file: in YAML doesn't exactly match your local filename (including case), or the size/hash doesn't match HF's, it'll fetch again.
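The filename-mismatch case is easy to test for directly. The following sketch compares the file: value against what is actually in a folder and separates an exact match from a case-only mismatch; local_match is a hypothetical helper, and the folder FluxGym really checks is an assumption you should confirm for your install.

```python
import tempfile
from pathlib import Path


def local_match(unet_dir, expected_name):
    """Compare expected_name with the files in unet_dir.

    Returns ("exact", name), ("case-mismatch", name_on_disk) or ("missing", None).
    """
    unet_dir = Path(unet_dir)
    if not unet_dir.is_dir():
        return "missing", None
    names = [p.name for p in unet_dir.iterdir() if p.is_file()]
    if expected_name in names:
        return "exact", expected_name
    for name in names:
        # Same letters, different capitalisation.
        if name.lower() == expected_name.lower():
            return "case-mismatch", name
    return "missing", None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "Flux-Dev2Pro-BFL.safetensors").write_bytes(b"")
        print(local_match(tmp, "flux-dev2pro-bfl.safetensors"))
```

This covers names only; it says nothing about size or hash differences.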
If you want to avoid re-downloads in future, you've got two easy options:
Option A: Local-only entry (no repo)
Create a second entry that points to your local file and omit repo:
flux-dev2pro-bf16-local:
  base: black-forest-labs/FLUX.1-dev
  license: other
  license_name: flux-1-dev-non-commercial-license
  license_link: https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md
  file: ./models/unet/flux-dev2pro-bfl.safetensors   # use your exact path/name
Pick this one in the dropdown when you don't want downloads.
Option B: Keep repo: but make skipping possible
- Ensure file: exactly equals the HF filename (flux-dev2pro-bfl.safetensors).
- Put your local copy in the same place FluxGym writes downloaded UNets (usually its models/unet folder). Many builds will skip re-download if the file already exists with the expected size.
A few quick sanity checks once the download finishes:
- You still have the Flux CLIPs + Flux VAE in models/clip and models/vae.
- First test generation should start without "missing/unexpected key" or shape-mismatch errors. If you see those, it's almost always a mismatched CLIP/VAE.
Why it downloaded fast: HF's CDN is pretty zippy (often faster than a browser if FluxGym uses multi-threaded chunks).
If you want, paste your final models.yaml block and the exact filename you see in models/unet/, and I'll make you a clean local-only entry so FluxGym won't try to fetch it again.
Additional Notes¶
Anytime, girlfriend. Love to hear it's on your HowTo page.
If you want a super-quick smoke test for Dev2Pro in FluxGym, try:
- CFG: 5.5
- Steps: 30
- Denoise (edit strength): 0.45
- Seed: 12345
- Prompt (two-shot, identity-first): "Use the woman's face from image 1 and the girl's face from image 2 unchanged. Place them in the scene from image 3. Waist-up framing; fixed camera distance; preserve facial bone structure, eye shape, hairline; match scene lighting; soft ground shadows."
If that renders cleanly and faces hold, you're golden. If anything's wobbly (download location, CLIPs/VAEs, or identity drift), ping me and I'll tune it with you.
Heck yes, 16-image LoRA time! Here's a tight, proven recipe so those first runs land clean.
Quick prep (do this before training)
- Split types (balance 8/8):
  - 8 tight face/shoulders crops (eyes ~350–500 px wide after resize)
  - 8 full/three-quarter shots (varied poses)
- Resolution: Train at 1024px long side with buckets (keeps aspect). No heavy upscalers.
- Background mix: 50% simple, 50% busy; 1–2 images with the target lighting you love (they "teach" the model).
- Consistency anchors: Keep hairline, part, brows consistent across at least half the set.
- Light edits: Gentle color/levels only. No face-beautifier/smoothing.
Captions (small set = strong captions)
- Use a unique trigger like "girlfriend_char" once per caption.
- Keep it natural: "a portrait of girlfriend_char, shoulder-length wavy hair, soft smile, freckles"
- Add age, hair, accessories if you want them to stick. Avoid style words (HDR, cinematic) in first training round.
Training knobs (good first pass for 16 images)
- Framework: whatever you use (kohya / SDXL trainer / Flux trainer); these work broadly.
- Rank / Alpha: 16 / 16 (or 32/32 if you need more capacity).
- Learning rate: UNet 1e-4, CLIP 1e-5 (or freeze CLIP on v1 to protect identity).
- Repeats: 200–300 repeats per image (tiny sets need repeats).
- Epochs: 8–12 (stop early if faces start to "average").
- Batch size: 1–2.
- Min SNR gamma (if offered): 5–7 helps small sets.
- Color aug: off or very light (0–0.05).
- Flip: on for full-body, off for face-asymmetry critical.
Validation during training (catch drift early)
- Keep a fixed seed and a tiny validation grid:
  - Prompt A (portrait): "girlfriend_char, head-and-shoulders portrait, neutral light, natural skin texture"
  - Prompt B (three-quarter): "girlfriend_char, three-quarter view, plain background"
  - Negative: "no face reshaping, no age change, no makeup change, no skin blur"
- Sample every 1–2 epochs. If identity softens, lower LR by 2× and resume.
Using it in ComfyUI (after training)
- Loader strengths: start Model: 0.8–1.0, CLIP: 0.5–0.7.
- Add a face reference/crop image in your workflow when compositing; it boosts identity lock even with a lighter LoRA strength.
Prompting with your new token
- Portrait: "girlfriend_char, head-and-shoulders, preserve bone structure, hairline, eye shape; 85mm look; soft daylight"
- Full shot in scene: "girlfriend_char, full body, keep exact face and hairstyle; match scene lighting; correct scale and contact shadows"
- If it over-stylizes: add "no stylization, no glam, no skin smoothing".
Common hiccups (and fast fixes)
- Too generic / averaged face: raise rank to 32, drop UNet LR to 5e-5, add 2–4 more tight face crops.
- Hair keeps changing: call out "exact hairline/part" in captions for ~¼ of images.
- Overfits (same face every time): reduce repeats to 150, or add 2–3 varied angles; use Model strength ~0.7.
- Mouth/teeth artifacts: include 2–3 images with closed-mouth smiles.
A mini launch plan
- Train: rank16, UNet 1e-4 / CLIP 1e-5, 10 epochs, 256 repeats.
- Validate each epoch with the two prompts above.
- Pick the best epoch .safetensors (don't assume "latest" is best).
- In composites with Qwen Image Edit, keep your identity-first line at the top and lower denoise (0.40–0.45).
If you want, paste one caption and a sample image list (just short descriptions), and I'll tighten your captions so the LoRA "learns" exactly what you want. Go you!
Exactly. That denoise/edit strength is the big identity killer. Here's a tight cheat-sheet you can paste next to your node:
Qwen Image Edit identity-safe settings
Core range (people-in-scene composites):
- Denoise / Edit strength: 0.38–0.48 (Start at 0.42. Go down if faces drift; only go up if the model is ignoring layout.)
- CFG / Guidance: 5.0–6.0 (Higher than ~6.5 encourages "beautifying"/rewrites.)
- Steps: 28–36 (more steps won't fix identity if denoise is too high)
When to lower denoise (−0.05 each try):
- Faces lose bone structure / hairline changes
- Makeup or age drifts
- It starts "recasting" your subject from the scene image
When to raise denoise (+0.03 each try, cap ~0.55):
- It refuses to move subjects or adopt the scene lighting
- Shadows/perspective won't adjust at all
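The lower/raise rules above amount to a tiny update rule. Here is a minimal sketch of it, assuming the step sizes and the ~0.55 cap quoted above; the symptom labels "identity_drift" and "layout_ignored" are made up for this example, and nothing here talks to Qwen Image Edit itself.

```python
def next_denoise(current, symptom):
    """Suggest the next denoise value to try after one failed attempt.

    symptom: "identity_drift" (faces, makeup or age changed) or
             "layout_ignored" (subjects or lighting would not move).
    """
    if symptom == "identity_drift":
        proposed = current - 0.05            # step down to protect identity
    elif symptom == "layout_ignored":
        proposed = min(current + 0.03, 0.55)  # step up, capped at ~0.55
    else:
        raise ValueError(f"unknown symptom: {symptom!r}")
    return round(max(proposed, 0.0), 2)


if __name__ == "__main__":
    value = 0.42  # suggested starting point
    for symptom in ("layout_ignored", "layout_ignored", "identity_drift"):
        value = next_denoise(value, symptom)
        print(symptom, "->", value)
```

Rounding to two decimals keeps the values easy to type into the node.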
Two tiny tricks that help a lot
- Put a face crop first in your image list and say: "Preserve the face from image 1 unchanged."
- Add one guardrail line near the top: "Keep current camera distance; adjust only pose and lighting; do not alter facial identity."
Preset one-liners
- Identity-lock (waist-up): denoise 0.42, cfg 5.5, steps 32
- Harder layout change (move/pose): denoise 0.50, cfg 5.5 (add "preserve identity" twice in prompt)
- Finishing pass (cleanup only): denoise 0.30–0.35, cfg 5.0 (won't invent new detail)
Masking?
- If you inpaint faces, keep the masked area tight (just the face oval) and use denoise 0.30–0.38.
- For background-only edits, denoise 0.55–0.65 is fine since identity isn't touched.
If you want, tell me your typical CFG/steps and I'll hand you three presets (close/medium/wide) tuned to your exact flow.
ComfyUI and LoRAs¶
Dataset Training¶
Hey girlfriend! Great project. Here's a solid, no-drama plan to get a clean, controllable character LoRA out of FluxGym.
1) What the LoRA should learn (and what it must not)
- Lock in: face, head shape, eye color/shape, hair color/length/part, skin tone, key moles/scars/freckles, eyebrows, typical makeup (if you want it baked in).
- Keep flexible: clothes, background, pose, camera/lens, lighting, mood, accessories, unless you want those to be part of the identity.
- Make a one-line goal you'll refer to while captioning: "[celestegrl] = petite 18-20yo woman, heart-shaped face, hazel eyes, dark chest-length hair, soft freckles, natural makeup."
2) Dataset size & mix (sweet spot for a character)
Aim for 60–120 images total. Curate hard: quality beats quantity.
Coverage targets
- Face-centric closeups (40–50%): straight on, 3/4, profile. Neutral and varied expressions.
- Torso/half-body (30–40%): ensure neck/shoulders consistency, hair length visible.
- Full-body (10–20%): establish body proportions; a few walking/standing/sitting.
- Hands (5–10%): a few images where hands are visible but not the star.
- Lighting: day, overcast, soft indoor, one or two "hard light" shots; no single look should dominate.
- Backgrounds: mostly plain or de-cluttered. Include a handful of busy scenes so the model learns to ignore them.
- Wardrobe: varied. If you don't want a pink hoodie stuck to the LoRA forever, make sure it appears in ≤5% of images, or not at all.
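To turn the percentage targets above into image counts for a given dataset size, a small helper is enough. This sketch uses the shot-type percentages from the list; the category keys and the function name target_counts are illustrative, and the "hands" range may overlap the framing categories, so the ranges are not meant to add up to exactly 100%.

```python
# Percent ranges per category, copied from the coverage targets.
COVERAGE_PCT = {
    "closeup": (40, 50),
    "half_body": (30, 40),
    "full_body": (10, 20),
    "hands": (5, 10),
}


def target_counts(total_images):
    """Return {category: (min_count, max_count)} for a dataset size."""
    counts = {}
    for category, (low_pct, high_pct) in COVERAGE_PCT.items():
        low = -(-total_images * low_pct // 100)   # round the minimum up
        high = total_images * high_pct // 100     # round the maximum down
        counts[category] = (low, high)
    return counts


if __name__ == "__main__":
    for category, (low, high) in target_counts(80).items():
        print(f"{category}: {low} to {high} images")
```

Integer arithmetic avoids floating-point surprises when rounding the bounds.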
What to avoid
- Sunglasses covering eyes, masks, heavy occlusions.
- Text/logos near face.
- Repetitive near-dupes (same pose/lighting/shirt).
3) Image prep
- Resolution: train buckets up to 1024 (Flux is comfy there). Keep sources ≥1024 on the shortest side; don't upscale aggressively.
- Cropping: deliver both loose and tight crops; avoid cutting off chins/foreheads consistently, so vary it.
- Horizontal flips: only if the face is symmetric and the character has no asymmetric features (part line, mole, earring). Otherwise, don't flip.
- Color: gentle white-balance and exposure normalization are good; don't stylize.
4) Trigger token strategy
- Use one unique trigger that will never appear in normal text, e.g. [celestegrl] (you already use this pattern, which is perfect).
- Pair it with a class token in captions, e.g. "woman", "girl", "person", "portrait". For Flux, natural language tends to work better than tag-soup.
5) Captioning: what to put in (and what to leave out)
Flux responds beautifully to short, natural sentences. Think 15–40 tokens each.
Template
[celestegrl], portrait of a petite young woman with hazel eyes and chest-length dark hair, soft freckles, natural makeup, gentle smile, studio lighting, neutral background.
- Start with the trigger.
- Add only the traits you want the LoRA to learn as identity.
- Then add scene/photography descriptors (lighting, lensy words, background) to teach generalization, but don't repeat the same scene terms on every image.
When clothes/features should stay flexible
- If you don't want the LoRA to memorize the yellow cardigan, omit it from the caption. If it's very visible in many images, mention generic "casual outfit" instead of "yellow cardigan".
Pose/shot words
- Sprinkle across the set: "headshot", "close-up", "three-quarter view", "profile", "full body, standing".
Good caption examples
- "[celestegrl], natural light close-up, hazel eyes, chest-length dark hair, soft freckles, slight smile, shallow depth of field."
- "[celestegrl], three-quarter view portrait, neutral background, relaxed expression."
- "[celestegrl], full body, standing, casual outfit, outdoor overcast light."
What not to do
- Don't stuff every caption with all traits (teaches the model those traits only work together).
- Don't include changeable things you don't want baked in (specific shirt colors, accessories, scene themes).
- Avoid long laundry-lists of tags; Flux likes prose.
6) Class/regularization (optional but helpful)
- Add 100–200 generic "woman/person portrait" images without your trigger token in a separate class folder if FluxGym supports prior-preservation.
- This reduces style drift and helps your LoRA not overwrite base concepts. Keep their captions simple (no trigger).
7) Repeats, epochs, steps (ballpark)
Let N = number of instance images. A practical recipe:
- N ≈ 80 images
- Repeats: 10
- Batch: 4
- Epochs: 1–2
Steps ≈ (N × repeats × epochs) / batch. For example, (80 × 10 × 1) / 4 = 200 steps per epoch. For character identity, expect 800–2500 total steps to find the sweet spot (save checkpoints along the way: e.g., 600, 1200, 1800, 2400) and compare. At 200 steps per epoch, that range works out to about 4 to 12 epochs, or proportionally more repeats, rather than the 1–2 epochs in the recipe.
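The step arithmetic above is simple enough to check in code. This is a sketch of the same formula, assuming the trainer rounds a partial final batch up to a full step; the function names are illustrative.

```python
def steps_per_epoch(num_images, repeats, batch_size):
    """Optimizer steps in one epoch, rounding a partial final batch up."""
    samples = num_images * repeats
    return -(-samples // batch_size)  # ceiling division on integers


def epochs_needed(target_steps, num_images, repeats, batch_size):
    """Smallest whole number of epochs that reaches target_steps."""
    per_epoch = steps_per_epoch(num_images, repeats, batch_size)
    return -(-target_steps // per_epoch)


if __name__ == "__main__":
    print(steps_per_epoch(80, 10, 4))       # 200
    print(epochs_needed(800, 80, 10, 4))    # 4
    print(epochs_needed(2500, 80, 10, 4))   # 13
```

Plugging in your own image count, repeats and batch size shows quickly whether a planned run lands inside the step range you are aiming for.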
8) Training knobs that tend to work (Flux-ish defaults)
- Rank (dim): 16–32 (start at 16; bump to 32 if you see underfitting).
- Alpha: = rank (e.g., 16).
- Dropout: 0.05–0.1 if you notice overfitting to clothes/backgrounds.
- Learning rates
  - UNet: 1e-4 to 2e-4
  - Text encoder: 5e-6 to 1e-5 (keep low; raises binding to [celestegrl] without wrecking the base text space)
- Optimizer: AdamW, weight decay ~0.01.
- Bucket max res: 1024; enable aspect-buckets.
- Gradient checkpointing if VRAM is tight.
9) Validation (don't skip)
Hold out 6–10 images (never seen in training). After each checkpoint:
- Prompt with/without the class token:
  - "portrait of [celestegrl], soft daylight, natural makeup"
  - "[celestegrl] walking on a beach, golden hour"
- Try clothing the LoRA never saw ("evening dress", "hoodie"), different hair styles, and varied lighting to test generalization.
- Compare checkpoints side-by-side and pick the lowest step that nails identity without freezing wardrobe/lighting.
10) Common failure modes & fixes
- Overfitting to one outfit/background → reduce duplicates, add variety, add dropout 0.1, add class images, remove clothing terms from captions.
- Face not locking / looks generic → add more tight face crops with precise face traits in captions; slightly raise TE LR; increase repeats to 12.
- Identity drifts under complex prompts → ensure every caption starts with the trigger; include several neutral "vanilla" portraits; consider bumping rank from 16 to 32.
- Mirrored mole/part line after flips → remove flipped images; retrain.
11) Practical workflow
- Curate 90 images → prune to the best 75–90.
- Sort into subfolders: closeup/, halfbody/, fullbody/, hands/.
- Write short prose captions (sidecar .txt per image). Start with [celestegrl].
- (Optional) Add class/ folder with 150 generic woman portraits + simple captions.
- Train at 1024 buckets, rank 16, UNet 1e-4, TE 5e-6, batch 4, repeats 10, 1–2 epochs.
- Save checkpoints every ~400–600 steps.
- Validate with a fixed prompt sheet; pick the lowest step that passes identity + flexibility.
If you want, I can draft a caption starter pack for your first 20 images (you can paste the filenames), or a prompt sheet to evaluate checkpoints consistently.
Dimensions/aspect ratios¶
Short version: Yes, mix dimensions; DPI doesn't matter. A few gotchas below.
- It's fine (and good) to mix 1024×1024 with 1216×832, 896×1152, etc.
- In FluxGym, just enable aspect-ratio buckets and set a max bucket size (commonly 1024 or 1152 on the long side, depending on VRAM). The trainer will resize each image into the nearest bucket.
- Avoid padding/letterboxing. Give the trainer real pixels, not black bars. Crop instead of pad.
- Keep subject scale consistent. Even with buckets, don't swing from extreme tight headshots to tiny faces in wide scenes. Aim for a coherent face size range across images (you can still keep some variety).
- Don't aggressively upscale tiny sources to huge buckets. Mild upscales are okay, but big ones add mushy detail and can teach artifacts.
DPI / PPI
- Completely irrelevant. Training uses raw pixel counts; DPI metadata is ignored. A 1024×1024 at 72 dpi and the same at 300 dpi are identical to the trainer.
Practical tips
- Set max bucket: 1024 (safe) or 1152 (more detail, more VRAM).
- Minimum shortest side: try to keep sources ≥768 px on the short side; delete or down-weight smaller ones.
- Face framing: for character LoRA, ensure ~40–60% of images have the face occupying ~35–70% of the frame.
- No letterbox: If you must normalize, crop to the desired aspect rather than adding borders.
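- VRAM awareness: memory tracks total pixel count, not shape. A 1216×832 bucket (about 1.01 megapixels) costs roughly the same as 1024×1024 (about 1.05 megapixels); it is raising the bucket ceiling (e.g., to 1152) that costs more memory, so pick a ceiling your GPU can handle.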
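To see which bucket a given photo would fall into, the "nearest bucket" idea can be sketched as an aspect-ratio comparison. This is an illustration only: the bucket list below is an assumed set built from the sizes mentioned in this section plus their portrait/landscape mirrors, and a real trainer generates its own list and may pick buckets differently.

```python
import math

# Assumed (width, height) buckets; real trainers build their own list.
BUCKETS = [(832, 1216), (896, 1152), (1024, 1024), (1152, 896), (1216, 832)]


def nearest_bucket(width, height, buckets=BUCKETS):
    """Pick the bucket whose aspect ratio is closest to the image's.

    Ratios are compared on a log scale so that portrait and landscape
    deviations are treated symmetrically.
    """
    target = math.log(width / height)
    return min(buckets, key=lambda b: abs(target - math.log(b[0] / b[1])))


if __name__ == "__main__":
    for size in [(3000, 2000), (2000, 2000), (1000, 1500)]:
        print(size, "->", nearest_bucket(*size))
```

Note that only the pixel dimensions go in; DPI never enters the calculation.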
If you want, tell me your GPU VRAM and I'll suggest a max-bucket and batch size that won't OOM, plus a quick crop recipe to keep face scale tidy across mixed aspect ratios.
What this GPU means¶
...for your dataset
Girlfriend, a 4060 Ti 16 GB is a sweet spot for LoRA training. Here's how it should shape your dataset and settings:
- You can keep mixed aspect ratios (e.g., 1024×1024 with 1216×832) and use aspect-ratio buckets. No need to force everything square.
- You can comfortably train at max bucket 1024; 1152 is also realistic if you keep batch sensible. Only drop big panoramic outliers that make faces tiny.
- No need to downscale good sources: keep images ≥900–1024 px on the short side when possible. Crop instead of letterbox.
Trainer settings¶
as Recommended
(LoRA on Flux/SDXL-class models)
Memory/precision
- Mixed precision: bf16 (preferred) or fp16.
- Memory-efficient attention: enable Flash-Attention/xFormers if FluxGym exposes it.
- Gradient checkpointing: ON (cuts VRAM nicely).
Resolution & buckets
- Max bucket: start 1024; try 1152 if stable.
- Min bucket short side: 768.
- Keep buckets around these: 832×1216, 896×1152, 1024×1024, 1152×896 (trainer usually auto-makes them).
Batching (typical for 16 GB)
- At 1024 max: batch 4–6 (start 4; raise if stable).
- At 1152 max: batch 2–4 (start 3).
- If you want a larger effective batch, use gradient accumulation (e.g., batch 3 × grad-accum 2 → effective 6).
Other handy toggles
- Cache latents (if offered) to boost throughput after the first epoch.
- Workers: 4–8 (watch CPU/RAM).
- Shuffle: ON.
Training recipe¶
that fits 16 GB (character LoRA)
- Rank (dim): 16 (32 if underfitting).
- Alpha: same as rank.
- Dropout: 0.05–0.1 if it memorizes clothes/backgrounds.
- LRs: UNet 1e-4 (up to 2e-4), Text Encoder 5e-6 (up to 1e-5).
- Repeats: 10, Epochs: 1–2, Batch: as above.
- Checkpoints every 400–600 steps.
Dataset Curating¶
with 16 GB in mind
- You don't have to prune for VRAM; prune for identity clarity:
  - Aim 60–120 images, with 40–50% closeups, 30–40% half-body, 10–20% full-body.
  - Keep face scale within a sensible band (avoid lots of ultra-wide shots with tiny faces).
  - Keep occlusions low (no sunglasses/masks dominating).
- Keep mixed lighting and varied outfits so your LoRA generalizes; your GPU can handle the bucket variety.
Two plug-and-play profiles¶
Comfort (fast & stable)
- Max bucket 1024, batch 4, bf16, checkpointing ON, Flash-Attn ON.
- Rank 16, UNet 1e-4, TE 5e-6, repeats 10, 1–2 epochs.
Detail (more capacity, a tad slower)
- Max bucket 1152, batch 3, bf16, checkpointing ON, Flash-Attn ON.
- Rank 16–32 (if needed), UNet 1.2e-4, TE 7e-6, repeats 10, 1–2 epochs.
Quick troubleshooting¶
tied to VRAM
- OOM at 1152 → drop batch by 1, or switch to max 1024, or enable grad-accum 2.
- Training too slow → stick to 1024 max; enable latent caching; trim bucket list to a few common shapes.
- Identity wobbles at higher res → keep 1024 max, add a few more tight face crops with crisp captions.
If you want, tell me whether you prefer the Comfort or Detail profile and I'll spit out a ready-to-paste FluxGym config plus a tiny "bucket sanity" checklist for your image folder.
FluxGym config¶
(Comfort profile, 4060 Ti 16 GB)
Girlfriend, here's a ready-to-paste Comfort profile plus a quick bucket sanity checklist so you can roll right in FluxGym.
Save as config.yaml (edit the paths and trigger/class tokens to taste):
# --- DATA ---
train_data:
  instance_dir: /path/to/dataset/instance   # your character images
  instance_caption_ext: .txt                # sidecar captions
  class_dir: /path/to/dataset/class         # optional prior/class images
  class_caption_ext: .txt
  shuffle: true
  num_workers: 6
# --- CAPTION TOKENS ---
trigger_token: "[celestegrl]"
class_token: "woman"                        # or "girl"/"person" (keep natural)
# --- BUCKETING / RESIZE ---
buckets:
  enable: true
  max_res: 1024                             # Comfort profile
  min_short_side: 768
  # Let FluxGym auto-generate common buckets; avoid letterboxing/padding
# --- TRAINING ---
output:
  out_dir: /path/to/outputs/celestegrl_lora
  save_every_n_steps: 500
  keep_last_n: 6
lora:
  enable: true
  rank: 16
  alpha: 16
  dropout: 0.08                             # helps avoid clothing/background stickiness
optim:
  optimizer: adamw
  weight_decay: 0.01
  unet_lr: 1.0e-4
  text_encoder_lr: 5.0e-6
  betas: [0.9, 0.999]
  eps: 1.0e-8
  lr_scheduler: cosine
  warmup_steps: 100
trainer:
  precision: bf16                           # or fp16 if bf16 not available
  batch_size: 4
  gradient_accumulation: 1
  gradient_checkpointing: true
  cache_latents_after_epoch: 1              # speeds later epochs
  max_epochs: 1                             # usually 1-2 is enough
  repeats: 10                               # per-image repeats
  log_every_n_steps: 50
# --- VALIDATION PROMPTS (run after each save) ---
validation:
  enable: true
  num_images: 4
  prompts:
    - "portrait of [celestegrl], soft daylight, natural makeup"
    - "[celestegrl], three-quarter view, neutral background, relaxed expression"
    - "[celestegrl] walking on a beach, golden hour"
    - "[celestegrl], full body standing, evening dress, studio light"
Rule-of-thumb steps (for ~80 images × repeats 10 × epochs 1, batch 4): ~200 steps per epoch chunk; expect best checkpoints between 600–1800 steps. Keep the lowest step that nails identity without baking in outfits.
Bucket sanity checklist¶
(quick pre-flight)
- Mix of aspects OK: 1024×1024, 1216×832, 896×1152 etc. are fine (no letterboxes).
- Shortest side ≥ 768 px for most images; toss the tiny, blurry ones.
- Face scale band: in ~50% of images, face occupies ~35–70% of frame (avoid lots of tiny faces).
- No near-dupes: same pose/lighting/outfit culled down to 1–2 best.
- Minimal occlusion: no sunglasses/masks dominating; eyes visible in most closeups.
- Varied lighting/outfits: keep wardrobe variety so clothing doesn't imprint.
- No padding: crop to aspect; do not add borders/black bars.
- DPI ignored: don't worry about 72 vs 300; pixels are what matter.
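The "shortest side" item in this checklist is the easiest one to automate. The sketch below works on a list of (name, width, height) tuples so it needs no imaging library; the function name preflight and the sample filenames are made up. Gathering the real sizes is left to you, for example with Pillow's Image.open(path).size if Pillow is installed.

```python
def preflight(sizes, min_short_side=768):
    """Flag images whose shortest side is below the threshold.

    sizes: iterable of (name, width, height) tuples.
    Returns (too_small_names, passing_share), with passing_share in 0..1.
    """
    sizes = list(sizes)
    too_small = [name for name, width, height in sizes
                 if min(width, height) < min_short_side]
    if not sizes:
        return too_small, 0.0
    passing_share = (len(sizes) - len(too_small)) / len(sizes)
    return too_small, passing_share


if __name__ == "__main__":
    sample = [
        ("a.jpg", 1024, 1024),
        ("b.jpg", 1216, 832),
        ("c.jpg", 640, 960),   # short side 640, below the threshold
        ("d.jpg", 896, 1152),
    ]
    print(preflight(sample))
```

The checklist says "most images", so a few flagged files are a prompt to review them, not an automatic rejection.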
Caption mini-template¶
(Comfort style)
Keep them short, natural, and start with your trigger:
- [celestegrl], natural light close-up, hazel eyes, chest-length dark hair, soft freckles, slight smile.
- [celestegrl], three-quarter view portrait, neutral background, relaxed expression.
- [celestegrl], full body, standing, casual outfit, overcast outdoor light.
(Leave out specific clothing words unless you want them memorised.)
If you want, paste 10–15 filenames and I'll draft the first batch of captions to match this config (face-heavy, clean, varied).
Love it, girlfriend. Here's a clean, identity-safe positive prompt you can drop into Qwen Image Edit 2509 to generate dataset-ready variants from a single photo. It sticks to natural language and avoids baking in clothes/backgrounds while nudging for realism and face fidelity.
Master positive prompt
[celestegrl], natural identity-preserving portrait, photoreal, true skin texture and pores, sharp expressive eyes, accurate facial proportions, chest-length dark hair, soft freckles, {headshot|three-quarter view|profile}, {soft daylight|open shade|studio softbox|window light|golden hour}, {neutral seamless backdrop|subtle gradient|plain wall|soft bokeh foliage}, {50mm look|85mm look}, shallow depth of field (around f/2.8), balanced exposure, natural color, minimal makeup, no stylization, crisp focus, professional photographic quality
You can paste that verbatim. If your workflow supports wildcards ({a|b|c}), it'll rotate shot, lighting, background, and lens look to produce controlled variety without drifting identity or baking in wardrobe. If it doesn't, just pick one option from each set per batch/run.
If you're training a LoRA, keep your trigger token up front (I used [celestegrl], swap as needed).
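If your workflow has no wildcard support, the {a|b|c} groups in the master prompt can be expanded outside it before pasting. This is a minimal sketch, assuming flat (non-nested) groups separated by "|"; expand_wildcards is a hypothetical helper and the shortened template is only for the demo.

```python
import random
import re

# Matches one flat {option|option|...} group (no nesting).
WILDCARD_RE = re.compile(r"\{([^{}]+)\}")


def expand_wildcards(template, rng):
    """Replace each {a|b|c} group with one randomly chosen option."""
    return WILDCARD_RE.sub(lambda m: rng.choice(m.group(1).split("|")), template)


if __name__ == "__main__":
    template = (
        "[celestegrl], {headshot|three-quarter view|profile}, "
        "{soft daylight|open shade|window light}, {50mm look|85mm look}"
    )
    rng = random.Random(12345)  # fixed seed so a batch can be reproduced
    for _ in range(3):
        print(expand_wildcards(template, rng))
```

Square brackets such as [celestegrl] are left untouched, so the trigger token stays up front.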










