Show HN: Ideogram 4.0 – open-weight 9.3B text-to-image model

pigcat 41 points 8 comments June 03, 2026
github.com

It's our new text-to-image model: a 9.3B single-stream diffusion transformer trained entirely from scratch. We focused heavily on controllability through structured JSON prompts, with strong text rendering, spatial awareness through bounding box guidance, and color palette control. It has the best text rendering of any open-weight model we've tested so far, and the NF4 quantized checkpoint runs on a single 24GB GPU.

For more technical details and examples, see our blog post: https://ideogram.ai/blog/ideogram-4.0/

We will be happy to answer any questions :)
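As a rough sanity check on the 24GB figure, the weight storage for a 9.3B-parameter model can be estimated from the bits per weight. This is only a back-of-the-envelope sketch: it assumes NF4 stores 4 bits per weight, uses 16-bit weights purely as a comparison point (the post does not state the unquantized precision), and ignores quantization constants, activations, and any text encoder or VAE that would also need memory.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

PARAMS = 9.3e9  # parameter count quoted in the post

nf4_gb = weight_memory_gb(PARAMS, 4)    # assumption: 4 bits per weight
bf16_gb = weight_memory_gb(PARAMS, 16)  # assumption: 16-bit baseline for comparison

print(f"4-bit weights:  {nf4_gb:.2f} GB")
print(f"16-bit weights: {bf16_gb:.2f} GB")
```

Under these assumptions the 4-bit weights alone take a small fraction of a 24GB card, leaving the rest for everything the estimate leaves out.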

Discussion Highlights (5 comments)

elpocko

It has a non-commercial license, so you should not call it "open-weight". Words have meaning. And people are having a laugh at how censored the model is.
https://old.reddit.com/r/StableDiffusion/comments/1tvtu2u/id...
https://old.reddit.com/r/StableDiffusion/comments/1tvxhzv/id...

nuancebydefault

The gallery of generated images looks amazing; it's hard (but often possible) to spot inconsistencies in detailed images.

vunderba

Nice to see another locally hostable model! It’s going to take me a bit longer to add this model to the GenAI Showdown benchmark [1], since I’ll need to add a bit of customization so it produces highly optimized JSON-structured prompts. It might be worth noting that fal.ai [2] (a fairly popular router in the generative AI space) doesn’t really mention or emphasize the JSON-structured prompt format, and seems to suggest it works just as well with natural language. It might be worth reaching out to them, at least to clarify this point.

[1] - https://genai-showdown.specr.net
[2] - https://fal.ai/ideogram-4
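To make the idea of a JSON-structured prompt concrete, here is a minimal sketch of what a prompt builder could look like. The thread does not show the real schema, so every key name here (`scene`, `text_elements`, `bbox`, `color_palette`) and the normalized `[x0, y0, x1, y1]` box convention are hypothetical placeholders; the actual format is documented in the model's blog post and repository.

```python
import json

# NOTE: all keys below are hypothetical; they are NOT the model's real schema.
def build_structured_prompt(scene, text_boxes, palette):
    """Assemble a JSON prompt string from a scene description, text elements
    with normalized bounding boxes, and a list of hex colors."""
    for item in text_boxes:
        x0, y0, x1, y1 = item["bbox"]
        # Assumed convention: coordinates normalized to [0, 1], top-left origin.
        if not (0.0 <= x0 < x1 <= 1.0 and 0.0 <= y0 < y1 <= 1.0):
            raise ValueError(f"bbox out of range: {item['bbox']}")
    prompt = {
        "scene": scene,
        "text_elements": text_boxes,
        "color_palette": palette,
    }
    return json.dumps(prompt, indent=2)

prompt_json = build_structured_prompt(
    scene="A vintage travel poster of a mountain lake at dawn",
    text_boxes=[{"text": "LAKE TAHOE", "bbox": [0.1, 0.05, 0.9, 0.2]}],
    palette=["#1B3A4B", "#F4A261", "#E9C46A"],
)
print(prompt_json)
```

A benchmark harness could call a helper like this once per test case and pass the resulting string wherever a natural-language prompt would otherwise go.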

b3ing

Will it work on Apple silicon machines? Maybe in the Draw Things application? Or is it all command line?

Frannky

What are the best alternatives for template + text rendering? I need Canva-level templating ability, with the system matching font dimensions and positions, and the image without a background. I ended up creating an algo using color variance and auto-sizing, plus an LLM to select the fonts and text, plus nano banana. What I need is nano-banana-level images with Canva+human-level abilities, automated. It would be awesome to have a free LLM that can do that. I'm running on a 16GB RAM Mac. I have a lot of images and templates I can train on too. Not sure if I overengineered it; I would love to have just one LLM that can do everything for free.
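For readers wondering what a "color variance and auto-sizing" approach might involve, here is one possible sketch. It is not the commenter's actual code: it assumes a grayscale image given as a 2D list, picks the window with the lowest intensity variance as the calmest spot for text, and then chooses the largest font size that fits using a crude fixed glyph-width ratio. The function names and the `char_aspect` value are illustrative assumptions.

```python
from statistics import pvariance

def calmest_region(pixels, win_h, win_w, stride=1):
    """Return (row, col) of the top-left corner of the window with the
    lowest intensity variance. `pixels` is a 2D list of grayscale values."""
    rows, cols = len(pixels), len(pixels[0])
    best_var, best_pos = None, None
    for r in range(0, rows - win_h + 1, stride):
        for c in range(0, cols - win_w + 1, stride):
            window = [pixels[i][j]
                      for i in range(r, r + win_h)
                      for j in range(c, c + win_w)]
            var = pvariance(window)
            if best_var is None or var < best_var:
                best_var, best_pos = var, (r, c)
    return best_pos

def fit_font_size(text, box_w, box_h, char_aspect=0.6, max_size=200):
    """Largest integer font size whose estimated text width fits the box.
    Assumes each glyph is roughly char_aspect * size wide (a crude guess)."""
    for size in range(min(max_size, box_h), 0, -1):
        if len(text) * char_aspect * size <= box_w:
            return size
    return 0

# Toy image: noisy on the left, flat on the right.
pixels = [
    [0, 255, 0, 100, 100, 100],
    [255, 0, 255, 100, 100, 100],
    [0, 255, 0, 100, 100, 100],
    [255, 0, 255, 100, 100, 100],
]
print(calmest_region(pixels, 2, 3))     # flat right-hand block wins
print(fit_font_size("SALE", 120, 40))   # limited by box height here
```

A real pipeline would work on color channels and measure glyphs with an actual font library rather than a fixed ratio.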
