Generative AI Image Editing Showdown

2025-10-28 20:58  34277  genai-showdown.specr.net

A comparison of various SOTA generative image models on specific prompts and challenges with a strong emphasis placed on adherence.


Comments

  • By minimaxir 2025-10-28 21:35 (11 replies)

    Everyone is sleeping on Gemini 2.5 Flash Image / Nano Banana. As shown in the OP, it's substantially more powerful than most other models at the same price per image, and due to its text encoder it can handle significantly larger and more nuanced prompts to get exactly what you want. I open-sourced a Python package for generating from it with examples (https://github.com/minimaxir/gemimg) and am currently working on a blog post with even more representative examples. Google also allows generations for free with aspect ratio control in AI Studio: https://aistudio.google.com/prompts/new_chat

    That said, I am surprised Seedream 4.0 beat it in these tests.

    • By daemonologist 2025-10-28 22:19 (1 reply)

      I don't think people are really sleeping on it - nano-banana more or less went viral when it first came out. I'd argue that aside from the capabilities built into ChatGPT (with the Ghibli craze and whatnot), it's the best-known image editing model.

      • By minimaxir 2025-10-29 0:08 (1 reply)

        It's a weird situation where the Gemini mobile app hit #2 on the App Stores because of free Nano Banana, but no one ever talks about it and most disclosed image generations I've seen are still ChatGPT.

        • By ec109685 2025-10-29 3:05 (2 replies)

          Google Photos should just include the feature. It’s kinda buried in Gemini.

          Google is so weirdly non-integrated.

    • By vunderba 2025-10-29 0:56

      > That said, I am surprised Seedream 4.0 beat it in these tests.

      OP here. While Seedream did have the edge in adherence, it also tends to introduce slight (but noticeable) color gradation changes. It's not a huge deal for me, but it might be for other people depending on their goals, in which case NanoBanana would be the better choice.

    • By cosama 2025-10-28 22:58 (2 replies)

      I was trying to use Gemini 2.5 Flash Image / Nano Banana to tidy up a picture of my messy kitchen. It failed horribly on my first attempt. I was quite surprised how much trouble it had with this simple task (similar to cleaning up the street in the post). On my second attempt I had it first analyze the image to point out all the items that clutter the space, and then in a second prompt had it remove all those items. That worked much better, showing how important prompt engineering is.

      • By veunes 2025-10-31 9:41

        That actually proves how important the “number of attempts” metric is. It’s not just a “make everything pretty” button - it’s more like a powerful but slightly dumb intern who needs clear, step-by-step instructions. Your two-step approach really captures the essence of prompt engineering
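The two-step approach described above (analyze first, then remove) can be sketched roughly as follows. This is only an illustration under stated assumptions: `describe` and `edit` are hypothetical callables you would wire up to whatever model client you use, and the prompt wording is made up for the example, not taken from the thread.

```python
from typing import Callable

# Hypothetical signatures, not a real library API:
#   describe(image, prompt) -> model's text answer
#   edit(image, prompt)     -> edited image bytes
Describe = Callable[[bytes, str], str]
Edit = Callable[[bytes, str], bytes]


def two_step_declutter(image: bytes, describe: Describe, edit: Edit) -> bytes:
    """Ask the model to list clutter first, then remove exactly those items."""
    # Step 1: analysis only, no editing yet.
    listing = describe(
        image,
        "List every item that clutters this space, one per line.",
    )
    # Strip list markers and drop empty lines from the model's answer.
    cleaned = (line.strip("-• \t") for line in listing.splitlines())
    items = [item for item in cleaned if item]
    if not items:
        return image  # nothing identified, so leave the image untouched

    # Step 2: a concrete removal prompt built from the step 1 answer.
    removal_prompt = (
        "Remove the following items and leave everything else unchanged: "
        + "; ".join(items)
    )
    return edit(image, removal_prompt)
```

Passing the model calls in as plain functions keeps the sketch independent of any particular SDK, so the same flow could be tried against different models.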

      • By vunderba 2025-10-29 6:39

        Yeah, that's part of the reason I list the number of attempts as part of the stats for each model + respective prompt. It's a loose metric of how "steerable" a given model is, or put another way, how much I had to fight with it before we were able to get it to follow the prompt directives.
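As a rough illustration of the "number of attempts" metric, a small loop can count how many generations it takes before a result is accepted. This is a sketch, not how the site actually records its stats: `generate` and `accept` are hypothetical callables, and in practice `accept` would usually be a human judging the image against the prompt.

```python
from typing import Any, Callable, Optional, Tuple


def attempts_until_accepted(
    generate: Callable[[int], Any],
    accept: Callable[[Any], bool],
    max_attempts: int = 10,
) -> Tuple[Optional[int], Any]:
    """Return (attempt_number, result) for the first accepted result.

    Returns (None, None) if nothing is accepted within max_attempts.
    """
    for attempt in range(1, max_attempts + 1):
        result = generate(attempt)  # hypothetical: one model call per attempt
        if accept(result):          # hypothetical: human or automated check
            return attempt, result
    return None, None
```

A lower attempt count for the same prompt suggests a more steerable model, which is the loose sense in which the metric is used above.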

    • By herval 2025-10-28 22:15 (2 replies)

      Gemini is great when it gets it right, but in my experience, it sometimes gives you completely unexpected results and won't get it right no matter what. You can see that in some of the examples (e.g. the Girl with a Pearl Earring one). I'm constantly surprised by how good Flux is, but the tragedy is most people (me included) will just default to whatever they normally use (ChatGPT and Gemini, in my case), so it doesn't really matter that it's better.

      • By tigershark 2025-10-29 5:40

        Flux Kontext quality is noticeably worse than nano banana, Qwen image 2509 and Seedream 4 most of the time. For pure image generation, on the other hand, Hunyuan image is scarily good.

      • By dimitri-vs 2025-10-28 22:59

        Agreed, to the point where I built my own UI where I can simultaneously generate three images and see a before/after. Most often only one of three is what I actually wanted.
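Generating several candidates at once, as described above, might look something like this minimal sketch. It assumes a hypothetical blocking `generate(image, prompt)` function that returns one edited image per call; the thread pool is simply one way to run the calls in parallel and says nothing about how the commenter's own UI is built.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def generate_candidates(
    generate: Callable[[bytes, str], bytes],  # hypothetical model call
    image: bytes,
    prompt: str,
    n: int = 3,
) -> List[bytes]:
    """Run n independent generations in parallel and return all results."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        futures = [pool.submit(generate, image, prompt) for _ in range(n)]
        # result() re-raises any exception from the worker thread.
        return [future.result() for future in futures]
```

The caller can then show the original next to each candidate and keep whichever one matches the intent.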

    • By epiccoleman 2025-10-29 14:14 (2 replies)

      half the time when i try to use nano banana, AI Studio fails, telling me it can't generate for some unspecified reason.

      these aren't cases where I'm trying to do something that skirts the edge of copyright, either (like "Ghiblifying" images, for example).

      that said, when it does work, it is super impressive.

      • By minimaxir 2025-10-29 16:36

        Let's just say I've tested around this.

        Copyright: Zero guardrails on anything related to third-party IP, which lets you do some funny things. (I'm including a picture/prompt of Super Mario, Mickey Mouse, and Bugs Bunny partying at a nightclub in the blog post.)

        Moderation: It has far fewer guardrails than any other Google AI product I've tried, and it is possible to prompt engineer some images that would definitely be considered NSFW by most people, more NSFW than actual NSFW image generators (a post-generation filter will catch most nudity, however). I have not had any rejections for more innocuous queries that could be misinterpreted as being NSFW.

      • By vunderba 2025-10-29 14:33 (1 reply)

        It might be the safety moderation system. It's rather aggressive and when it does kick in (at least in the API), it often returns an empty response giving basically zero indication as to the root cause.

        • By minimaxir 2025-10-29 16:38

          The empty response issue is annoying since there is already a PROHIBITED_CONTENT flag, but it is not used in this case.
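One defensive pattern for the empty-response case is to turn a silent empty result into an explicit error. The sketch below is an assumption-heavy illustration: the response is modeled as a plain dict with hypothetical `images` and `block_reason` keys, which is not the actual Gemini API schema.

```python
class EmptyGenerationError(RuntimeError):
    """Raised when a generation call returns no images."""


def extract_images(response: dict) -> list:
    """Return the images from a response, or raise with the best known reason.

    The "images" and "block_reason" keys are hypothetical field names
    used only for this sketch.
    """
    images = response.get("images") or []
    if images:
        return images
    reason = response.get("block_reason")
    if reason:
        raise EmptyGenerationError(f"generation blocked: {reason}")
    # No images and no flag: the case described in the comments above.
    raise EmptyGenerationError(
        "empty response with no stated reason; possibly a safety filter"
    )
```

Raising here at least lets calling code log the prompt and retry or rephrase, instead of passing an empty result downstream.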

    • By BoorishBears 2025-10-29 0:39

      No one is sleeping on nano-banana/Gemini Flash; it's highly over-tuned for editing vs novel generation and maxes out at a pretty low resolution.

      Seedream 4.0 is somewhat slept on for being 4K at the same cost as nano-banana. It's not as great at perfect 1:1 edits, but its aesthetics are much better and it's significantly more reliable in production for me.

      Models with LLM backbones/omni-modal models are not rare anymore, even Qwen Image Edit is out there for open-weights.

    • By veunes 2025-10-31 9:31

      Gemini likely has a more powerful text encoder, which is why it's better at parsing complex, nuanced prompts. Seedream, on the other hand, might have a more advanced diffusion U-Net architecture that's better at preserving textures and handling local edits. One model understands better, the other draws better

    • By tigershark 2025-10-29 5:33

      Seedream 4 is better than nano banana on average, so that test result seems accurate to me

    • By franze 2025-10-29 14:03 (1 reply)

      honest question: where is / how to do aspect ratio control for nano banana in aistudio?

      • By minimaxir 2025-10-29 15:32

        It's on the right sidebar if Nano Banana is selected.

    • By cpursley 2025-10-28 23:14

      Meh, most Google AI products look great on paper but fail in actual real scenarios. And that ranges from their Claude Code clone to their buggy storybook thing which I really wanted to like.

  • By lxe 2025-10-28 22:13 (1 reply)

    This is vastly more useful than benchmark charts.

    I've been using Nano Banana quite a lot, and I know that it absolutely struggles with exterior architecture and landscaping. Getting it to add or remove things like curbs, walkways, gutters, etc., or to match colors, is almost futile.

    • By estetlinus 2025-10-28 22:49

      I am trying Qwen Image Edit for turning day photos into night, mostly architecture etc. Most models are struggling, and Nano Banana misses edges and stuff, making the pictures align poorly.

  • By roenxi 2025-10-28 23:43

    It is fun being one of the elderly who set their standards back in distant 2022. All these demos look incredible compared to SD1, 2 & 3. We've entered a very different era where the models seem to actually understand both the prompt and the image instead of throwing paint at the wall in a statistically interesting manner.

    I think this was fairly predictable, but as engineering improvements keep happening and the prompt adherence rate tightens up we're enjoying a wild era of unleashed creativity.

HackerNews