FLUX.1, the newest artificial intelligence (AI) image generator, is sparking buzz across social media after images it generated showed a degree of realism that made netizens look twice.
Created by Germany-based startup Black Forest Labs, FLUX.1 is an open-source text-to-image AI model. Its most widely shared outputs are close-up photos of men and women speaking at conferences, rendered with naturalistic hair, wrinkles, limbs, and skin.
The new technology rivals Adobe Firefly, Midjourney, DALL-E, and Stable Diffusion in AI image generation. However, unlike these systems, FLUX.1 is billing itself as the leader in photorealism.
All About FLUX.1
FLUX.1 is the flagship product of Black Forest Labs, an up-and-coming AI company established by former researchers and engineers at Stability AI. The company recently raised over $31 million in seed funding from high-profile venture capitalists.
The company built the model on an architecture that combines transformer and diffusion techniques and scales to 12 billion parameters. That count is significantly higher than that of the Stable Diffusion 3 models, which top out at 8 billion.
FLUX.1 was also trained using flow matching, among other techniques. However, Black Forest Labs has not disclosed the source of its training data.
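For readers curious about what flow matching involves, the Python sketch below shows one common formulation: a straight-line path between a noise sample and a data sample, with the model trained to predict the velocity along that path. This is an illustrative assumption, not Black Forest Labs' training code, which has not been described here; the function names and the one-dimensional setup are hypothetical.

```python
import random

def interpolate(x0, x1, t):
    """Point on the straight line from noise x0 (t=0) to data x1 (t=1)."""
    return (1 - t) * x0 + t * x1

def target_velocity(x0, x1):
    """Velocity of the straight-line path; it does not depend on t."""
    return x1 - x0

def flow_matching_loss(predict_velocity, data_samples, rng):
    """Mean squared error between predicted and target velocities.

    predict_velocity is a hypothetical stand-in for the neural network:
    any callable taking (x_t, t) and returning a number.
    """
    total = 0.0
    for x1 in data_samples:
        x0 = rng.gauss(0.0, 1.0)   # noise sample
        t = rng.random()           # time in [0, 1)
        x_t = interpolate(x0, x1, t)
        error = predict_velocity(x_t, t) - target_velocity(x0, x1)
        total += error * error
    return total / len(data_samples)

# Toy check: if every data sample equals 3.0, the ideal velocity at
# (x_t, t) is (3.0 - x_t) / (1 - t), so the loss should be close to zero.
def ideal(x_t, t):
    return (3.0 - x_t) / (1 - t)

loss = flow_matching_loss(ideal, [3.0] * 100, random.Random(0))
```

In a real system the predictor would be a large network operating on image latents and text embeddings, and the loss would be minimized by gradient descent rather than evaluated on a hand-written function.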
It was rolled out on August 1 as an open-source AI image generator, just like Stable Diffusion, meaning users can run the model locally or access it through online platforms like Poe, Nightcafe, Hugging Face, Replicate, and Fal.
Moreover, FLUX.1 can be used through Freepik, a stock image website, as part of an AI toolbox.
Three versions are currently available: Pro, Dev, and Schnell. Clients interested in a commercial license can subscribe to the Pro version, while those eyeing the model for non-commercial use can opt for the Dev version. Schnell, named after the German word for "fast," is a faster version of FLUX.1.
Photorealism at its finest, so far
AI image generators are known to render human hands, legs, and other body parts inaccurately because of insufficient training data. This weakness has made it easier for people to tell whether an image was made by AI.
FLUX.1 has now made that harder.
“All FLUX.1 model variants support a diverse range of aspect ratios and resolutions in 0.1 and 2.0 megapixels,” Black Forest Labs explained.
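To make the quoted range concrete, the following Python sketch converts a megapixel budget and an aspect ratio into pixel dimensions. The helper name `dimensions_for` and the rounding of each side to a multiple of 16 are assumptions made for illustration; the quoted statement does not say how image sizes are chosen.

```python
import math

def dimensions_for(megapixels, aspect_w, aspect_h, multiple=16):
    """Return (width, height) near the pixel budget for a given aspect ratio.

    `multiple` is an assumed constraint: each side is rounded to the
    nearest multiple of this value.
    """
    total_pixels = megapixels * 1_000_000
    # Solve width / height = aspect_w / aspect_h with width * height = total_pixels.
    height = math.sqrt(total_pixels * aspect_h / aspect_w)
    width = height * aspect_w / aspect_h

    def snap(value):
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)

widescreen = dimensions_for(2.0, 16, 9)   # top of the quoted range
thumbnail = dimensions_for(0.1, 1, 1)     # bottom of the quoted range
```

Under these assumptions, a 16:9 image at 2.0 megapixels comes out to 1888 by 1056 pixels, and a square image at 0.1 megapixels to 320 by 320.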
Low-Rank Adaptation (LoRA), a fine-tuning technique, has also enhanced the detail and photorealism of the resulting AI images. Many of the viral photos credited to FLUX.1 were refined by pairing the model with a realism LoRA released by XLabs.
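The idea behind low-rank adaptation can be sketched in a few lines of Python. Instead of retraining a full weight matrix, the technique trains two thin matrices whose product is added to the frozen weights. The layer size and rank below are hypothetical, chosen only to show the scale of the savings, and are not taken from FLUX.1 or from any released fine-tuning script.

```python
def full_finetune_params(d_out, d_in):
    """Trainable values when the whole d_out x d_in matrix is updated."""
    return d_out * d_in

def lora_params(d_out, d_in, rank):
    """Trainable values for LoRA: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)

def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def apply_lora(w, b, a, scale=1.0):
    """Return w + scale * (b @ a) without modifying the frozen weights w."""
    delta = matmul(b, a)
    return [[w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
            for i in range(len(w))]

# Hypothetical layer: 3072 x 3072 weights adapted at rank 16.
full = full_finetune_params(3072, 3072)   # 9,437,184 values
small = lora_params(3072, 3072, 16)       # 98,304 values
fraction = small / full                   # roughly 1 percent

# Tiny worked example: a 2 x 2 identity matrix with a rank-1 update.
adapted = apply_lora([[1, 0], [0, 1]], [[1], [2]], [[3, 4]])
```

Because only the two thin matrices are trained and shared, an adapter of this kind is far smaller than the base model, which helps explain how community fine-tunes can appear soon after a model's release.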
“These TEDx speakers aren’t real. They’re made with the new Flux realism LoRA. Flux is founded by the core engineers of Stable Diffusion. It’s crazy good,” an X user wrote in a post.
While this breakthrough offers a game-changing opportunity for stock photography and advertising, many users were still quick to point out the imperfections that expose the AI side of photorealistic images.
On closer inspection, text appearing in the photos is often a big giveaway, as are some patterns, textures, and proportions. Nevertheless, these flaws are harder to spot than those in images from earlier AI generators.
Moving forward, the company plans to release a text-to-video generator that can compete with OpenAI’s Sora, Runway’s Gen-3 Alpha, and Kuaishou’s Kling. “Our video models will unlock precise creation and editing at high definition and unprecedented speed,” the company said.