Nano-Banana: The New Autoregressive Image Generator from Google


The Engineer

14 Nov 2025 · 3 min read

Google’s latest entry, Nano-Banana, stands out in the crowded field of AI image generators with its innovative autoregressive approach, promising to shake up the digital art world once again.

You might have noticed a bit of a lull in the hype around AI image generation models recently, but don't be fooled: innovation is still booming. Models such as FLUX.1-dev, Seedream, Ideogram, and Qwen-Image have been making waves. Google's Imagen 4 also joined the fray, but nothing has quite captured the public imagination like ChatGPT’s free image generation feature, introduced in March 2025.

The Rise of ChatGPT Images

ChatGPT's image generation support went viral almost immediately with the "Make me into Studio Ghibli" prompt. This feature quickly became a benchmark for AI-generated images, thanks to its distinctive style and ease of use. If you've seen an AI-generated image lately, there’s a good chance it was made by ChatGPT. The model often adds a yellow hue to images and uses consistent linework and typography in cartoons and text.

Under the hood, the technical name for ChatGPT's image generation model is gpt-image-1. Most image generators are diffusion-based, a design that helps reduce compute requirements; gpt-image-1 is instead an autoregressive model. It generates images by predicting tokens one at a time, similar to how ChatGPT predicts the next word in text. This process is slow, about 30 seconds per image at the highest quality, but the free and easy access makes it hard to complain.

Enter Nano-Banana

In August 2025, a mysterious new model appeared on LMArena, code-named “nano-banana.” This model was later publicly released by Google as part of their Gemini 2.5 Flash suite, specifically the Gemini 2.5 Flash Image model. Like gpt-image-1, Nano-Banana is an autoregressive model that generates 1,290 tokens per image.

The release of Nano-Banana had a significant impact. It helped push the Gemini app to the top of the mobile app stores in September 2025, thanks to its impressive capabilities and catchy name. Google eventually embraced "Nano-Banana" as the colloquial name for the model, since it is more memorable than "Gemini 2.5 Flash Image."

Technical Details

Model Architecture: Nano-Banana is an autoregressive model, meaning it generates images by predicting tokens one at a time.
Token Generation: It produces 1,290 tokens per image, which are then decoded into the final visual output.
Performance: While slower than diffusion-based models, Nano-Banana offers high-quality results. The generation process takes about 30 seconds per image at the highest quality settings.
Integration: Nano-Banana is natively integrated with Google's Gemini 2.5 Flash model, enhancing its capabilities and making it a powerful tool in the AI ecosystem.
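
To make the "one token at a time" idea concrete, here is a minimal sketch of an autoregressive sampling loop. It is not Google's implementation: the toy_next_token_weights function and the VOCAB_SIZE value are hypothetical stand-ins for a trained network and its image-token codebook, and only the 1,290-token count comes from the figures above.

```python
import random

TOKENS_PER_IMAGE = 1290  # token count quoted in the article
VOCAB_SIZE = 64          # hypothetical codebook size, kept tiny for the demo


def toy_next_token_weights(prefix):
    """Hypothetical stand-in for a trained network.

    Returns one unnormalized weight per vocabulary entry, conditioned on the
    previous token only; a real model would condition on the whole prefix.
    """
    last = prefix[-1] if prefix else 0
    # Favor tokens close to the previous one so the output has some structure.
    return [1.0 / (1 + abs(tok - last)) for tok in range(VOCAB_SIZE)]


def generate_image_tokens(n_tokens=TOKENS_PER_IMAGE, seed=0):
    """Sample tokens one at a time, feeding each result back in as context."""
    rng = random.Random(seed)
    tokens = []
    for _ in range(n_tokens):
        weights = toy_next_token_weights(tokens)
        next_token = rng.choices(range(VOCAB_SIZE), weights=weights, k=1)[0]
        tokens.append(next_token)
    return tokens


tokens = generate_image_tokens()
print(len(tokens))  # 1290
```

Each step in the loop depends on the tokens produced before it, so the steps cannot be run in parallel across positions. That sequential dependency is the usual reason autoregressive generation is slow. In a real system, a separate decoder would then turn the finished token sequence into pixels; that stage is omitted here.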

Impact on the Industry

The rise of Nano-Banana has significant implications for both practitioners and end users. For developers and researchers, having another high-quality autoregressive model provides more options for experimentation and application development. The model’s integration with Gemini 2.5 Flash also opens up new possibilities for multimodal AI systems.

For the general public, Nano-Banana offers a fresh take on AI-generated images, potentially setting a new standard in the field. Its popularity has already shown its potential to engage users and drive adoption of AI tools.

Conclusion

While the AI image generation landscape is crowded with options, Nano-Banana stands out for its unique approach and impressive results. Whether you're a developer looking to explore new models or just someone who enjoys playing around with AI art, Nano-Banana is definitely worth checking out.
