A widely shared r/ChatGPT post rounded up lifelike portraits and lifestyle shots as a reminder that near-photorealistic AI has been possible “since at least last summer.” The comment section became a crowd-sourced forensics class: people flagged subtle giveaways in hands and nails, janky keyboards, garbled UI on screens, odd background faces, blurry eyeglass lenses, and off-kilter teeth. The lesson wasn’t that every image is obviously fake; it’s that tiny local glitches can betray the whole picture.
Midjourney vs. DALL·E 3. Many redditors say Midjourney still wins on raw photorealism, while DALL·E 3 often feels more stylized—sometimes by design to reduce deepfake risk. DALL·E tends to excel at prompt following and conversational iteration. Quality can ebb and flow as models update, which fuels recurring “did they change something?” debates.
Local / open models (Flux, Stable Diffusion). A newer wave of posts highlights rapid gains from local or open models like Flux and modern Stable Diffusion checkpoints: better anatomy, stronger spatial reasoning, and improved prompt adherence. Multi-person scenes and complex overlaps still trip them up.
Specialists. Some users single out Ideogram for producing more “ordinary-looking” people and for strong text rendering, though this heavily depends on the use case.
Back on the detection side, none of those giveaways is conclusive on its own; they’re weak signals that add up, especially in crowded, complex scenes.
Skip the vague “photorealistic.” Instead, describe a photo. Talk like a photographer: focal length, aperture, ISO, lens and body (“50 mm f/1.8, ISO 200, soft window light”), candid atmosphere, natural skin texture, specular highlights, depth of field. And avoid negative prompts like “not CGI / not cartoonish,” which can backfire.
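To make that concrete, here is a minimal sketch of what such a prompt might look like when run against a local Stable Diffusion checkpoint through the Hugging Face diffusers library; the model ID, camera details, and sampler settings are illustrative placeholders, not recommendations pulled from the thread.

```python
# A minimal sketch, assuming the Hugging Face diffusers library and a
# locally available Stable Diffusion checkpoint (the model ID below is
# a placeholder, not one endorsed in the discussion).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # placeholder checkpoint
    torch_dtype=torch.float16,
).to("cuda")

# Describe a photo, not "photorealism": camera, lens, light, texture.
prompt = (
    "candid portrait of a middle-aged man at a kitchen table, "
    "50 mm f/1.8, ISO 200, soft window light, natural skin texture, "
    "shallow depth of field, slight film grain"
)

# Note: no negative prompt along the lines of "not CGI / not cartoonish";
# the advice is to say what you want, not what you don't.
image = pipe(prompt, num_inference_steps=30, guidance_scale=6.5).images[0]
image.save("portrait.png")
```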
Even then, iterate: regenerate, zoom, and inspect; accept a hit rate (e.g., one keeper out of several); and refine composition and anatomy by being explicit about pose, angle, and occlusions.
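Iteration can be as mechanical as sweeping seeds and keeping the best frame. The sketch below reuses the hypothetical pipeline and prompt from the previous example and only varies the random seed; the batch size and output paths are arbitrary.

```python
# Continuing the sketch above: same prompt, different seeds, then inspect
# each candidate and keep the rare one that survives close scrutiny.
from pathlib import Path

out_dir = Path("candidates")
out_dir.mkdir(exist_ok=True)

for seed in range(8):  # expect a hit rate, not eight keepers
    generator = torch.Generator(device="cuda").manual_seed(seed)
    image = pipe(
        prompt,
        generator=generator,
        num_inference_steps=30,
        guidance_scale=6.5,
    ).images[0]
    # Zoom into hands, eyes, on-screen text, and background faces before keeping any.
    image.save(out_dir / f"seed_{seed:02d}.png")
```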
A recurring argument is that certain hosted models are intentionally constrained on photorealistic people to deter misuse (deepfakes, harassment). Whether or not that’s the full story, community consensus is that policy and product positioning meaningfully shape the “look” and limits users encounter.
Across late 2024 and into 2025, redditors note real progress: better skin micro-detail, fewer classic six-finger flubs, stronger spatial layout, and increasing text reliability in some systems. At the same time, single-subject portraits remain the easy case; complex social scenes, with hands touching, bodies overlapping, and lots of legible world detail, continue to reveal seams.
Photorealism is no longer a party trick, but it’s still uneven across tasks and models. If you need the cleanest “looks-like-a-photo” output right now, the community tends to reach for Midjourney or well-tuned local models; if you want frictionless prompting and editing, DALL·E inside a chat workflow is beloved—just expect stylistic guardrails. And whether you’re generating or scrutinizing, the best advice hasn’t changed: think like a photographer, inspect like a skeptic.