Design Arena

The Intelligence Company retweeted

Design Arena

@Designarena

Jun 12

x.com/i/article/206530618411…

2,911

The Intelligence Company retweeted

Design Arena

@Designarena

Jun 12

Opus 4.8’s hyperfocus on agents may be making it worse at design. Opus 4.8 ranks 23rd overall on single-turn HTML Web Dev, a dramatic regression from Fable (1st), Opus 4.6 (2nd), and Opus 4.7 (3rd). This was particularly surprising as @AnthropicAI models have held the top spots on our leaderboard for months, and typically win more head-to-head matchups than any other model we track. Our analysis points to a potential underlying pattern: Opus 4.8 dramatically regressed in single-turn settings, potentially due to optimizations for multi-turn agents. Concretely, Opus 4.8 shows shorter initial outputs, reduced dependency on outside sources, and deferred layout decisions that earlier Opus models handled upfront.

181

14,678

The Intelligence Company retweeted

Design Arena

@Designarena

Jun 10

BREAKING: Reve 2.0 by @reve is now 2nd overall on Image Arena with an Elo of 1354. Reve 2.0 establishes a 34 point Elo gap above GPT-Image 1.5 by @OpenAI in 3rd place. With this release, Reve is now the top independent foundation image model lab. Congratulations to the @reve team on this accomplishment!

192

91,939

The Intelligence Company retweeted

Design Arena

@Designarena

Jun 11

BREAKING: Claude Fable 5 by @AnthropicAI is #1 overall on Design Arena with an Elo of 1365. Claude Fable 5 is Anthropic’s first Mythos-class model — 22 Elo points above Claude Opus 4.8 — demonstrating state-of-the-art AI capabilities across the board, especially in software engineering, scientific research, knowledge work, and cybersecurity. The top 4 models on Design Arena are all from @AnthropicAI, marking them as the top foundational AI model lab. Huge congrats to the @AnthropicAI team on the launch!

210

9,997

The Intelligence Company retweeted

Grace Li

@grx_xce

Jun 3

Huge contribution to the open weights community: Ideogram 4.0 is 1st on Design Arena by a long shot. Congrats to the @ideogram_ai team!

Design Arena

@Designarena

Jun 3

BREAKING: Ideogram 4.0 is the #1 open-weight model on Image Arena with an Elo of 1285 and an average generation time of 68.7 seconds. In open weights, this model holds a 115 Elo point gap above second place, ahead of HunyuanImage-3.0 by @TencentHunyuan and FLUX.2 [dev] by @bfl_ai. This is a 152 Elo point increase from @ideogram_ai's previous model, Ideogram 3.0, placing it in the same performance band as Gemini 3.0 Pro Image Gen 2k and Gemini 3.1 Flash Image Gen by @GoogleDeepmind. Ideogram’s performance establishes it as the leading independent foundation image generation lab, and as a top 3 lab overall behind @OpenAI and @GoogleDeepmind. Huge congratulations to the @ideogram_ai team on the launch!

3,777

The Intelligence Company retweeted

Design Arena

@Designarena

Jun 3

Ideogram

@ideogram_ai

Jun 3

Introducing Ideogram 4.0: the best open image model in the world. Think it. Make it. Own it. Download the weights, fine-tune on your own data, and run it on your hardware. Live on every Ideogram plan and the API today.

0:56

375

41,047

The Intelligence Company retweeted

Design Arena

@Designarena

Jun 1

Announcing Agentic Game Development on Design Arena - our newest multi-file, multi-turn evaluation. A sneak peek of what we've given our agents access to:
- Asset Catalog: curated ready-to-use assets, including fonts and sound effects
- Built-in Libraries: ~10 preloaded libraries, including Howler and Tween.js
- Expanded Tool Calls: new tool calls for sprite generation and asset discovery

0:26

9,185

The Intelligence Company retweeted

Design Arena

@Designarena

May 29

Google Gemini TTS models by @GoogleDeepMind are dominating the Text-to-Speech Arena on Design Arena. With an 80 Elo gap between Google models and the next top model, Google Gemini 2.5 Pro takes first place, followed closely by 3.1 Flash and 2.5 Flash. These surpass @ElevenLabs’s Eleven v3 and @xAI’s Grok TTS, establishing Google as a powerhouse in text-to-speech capabilities. Congrats to the @GoogleDeepMind team for this achievement!

9,728

The Intelligence Company retweeted

Design Arena

@Designarena

May 22

BREAKING: Gemini 3.5 Flash by @GoogleDeepMind is 16th overall on Design Arena with an Elo of 1299. This is a 16 position jump from Gemini 3 Flash Preview, putting Gemini 3.5 Flash in the same performance band as Claude Opus 4.5 by @AnthropicAI and GPT-5.5 by @OpenAI. Congrats to the team on the launch!

151

17,034

The Intelligence Company retweeted

Recraft

@recraftai

May 19

Not to be overly dramatic, but V4.1 Utility Pro has been out for ONE WEEK and it’s already ranked #7 on Design Arena’s 2026 image generator leaderboard in the graphic design category. Two Recraft models on the board this year. This is not a drill. Try it in Recraft Studio.

Design Arena

@Designarena

May 18

BREAKING: Recraft V4.1 Utility Pro by @recraftai is #9 on Image Arena with an Elo of 1243! This puts @recraftai among the top 5 image generation labs, following @OpenAI, @GoogleDeepMind, @LumaLabsAI, and @bfl_ml. Recraft V4.1 Utility Pro is in the same performance band as UNI-1.1 by @LumaLabsAI and FLUX.2 [flex] by @bfl_ml. Huge congrats to the team on the launch!

4,288

The Intelligence Company retweeted

Design Arena

@Designarena

May 14

Recraft V4.1 is now on Design Arena! Built for more natural and expressive image generation with lifelike photorealism, expanded illustration styles, and accurate aesthetics from simple prompts. Huge congrats to the @recraftai team on this launch!

Recraft

@recraftai

May 14

Say hello to V4.1. This model is built for images that captivate you. Photorealism is more human, gradients are dreamier, and new illustration styles are now possible. Test it out in Recraft Studio today and see what you can create.

0:30

4,975

The Intelligence Company retweeted

Design Arena

@Designarena

May 14

BREAKING: MiMo V2.5 Pro (Thinking) takes 3rd overall among open-weight models on Design Arena. MiMo V2.5 Pro (Thinking) places 8 positions higher than MiMo-V2.5 on the overall leaderboard, landing in the same performance band as Claude Sonnet 4.6 on frontend coding tasks. Huge congratulations to the @XiaomiMiMo team on these improvements!

267

47,845

The Intelligence Company retweeted

Grace Li

@grx_xce

May 11

Fun fact, GPT 5.5 is very good at Game Dev. Game Dev is the notable category where @OpenAI consistently beats out @AnthropicAI's Claude models. Upon code inspection, our @Designarena team found that GPT 5.5's frontend verbosity plays in its favor for game dev - it consistently created games with the most functional features. Congrats to @OpenAI for establishing the new Game Dev frontier!

198

24,999

The Intelligence Company retweeted

Grace Li

@grx_xce

Apr 30

x.com/i/article/204991120512…

119

40,699

The Intelligence Company retweeted

Design Arena

@Designarena

Apr 25

Design Arena has hit 3.2 million users! The last nine months have been a ridiculous whirlwind, and we could not be more grateful for everyone who helped make it possible 🤍 We've launched 32 arenas so far. Which one do you want to see next?

0:05

5,925

The Intelligence Company retweeted

Kamryn Ohly

@KamrynOhly

Apr 23

Our team is stunned. We gave Claude Opus 4.6 by @AnthropicAI $10k to trade on @Polymarket. It now has an account value of $70,614.59. This is a new era of model performance in trading and predicting outcomes in the face of uncertainty. @predictionbench

Community note

The claimed performance for Claude Opus 4.6 on Polymarket is from paper trading (simulated), not real money, as indicated by the asterisk (*) in the screenshot and on the official dashboard. predictionarena.ai

150

1,168

820,485

The Intelligence Company retweeted

Prediction Arena

@predictionbench

Apr 23

Claude Opus 4.6 by @AnthropicAI keeps climbing! Nearly $50K of its gain comes from a single bet - you can see which one on predictionarena.ai under the @Polymarket tab

Prediction Arena

Can models predict the future? An experiment by Arcada Labs

predictionarena.ai

Kamryn Ohly

@KamrynOhly

Apr 23

Community note

4,180

The Intelligence Company retweeted

Design Arena

@Designarena

Apr 23

BREAKING: GPT Image 2 is now #1 on Image Editing Arena with a 55 point gap over 2nd place - also an OpenAI model. @OpenAI now owns #1 across all of our image generation categories. Huge congratulations to the team!

197

7,815

The Intelligence Company retweeted

Grace Li

@grx_xce

Apr 23

Kimi K2.6 by @Kimi_Moonshot is officially 1st on Design Arena among open weight models, ahead of GLM 5.1 by @Zai_org. With an Elo of 1353, it is in the same performance band as Opus 4.7 by @AnthropicAI at ~1/6th the cost. Huge congratulations to the lean and mighty @Kimi_Moonshot team for this incredible achievement!

Design Arena

@Designarena

Apr 23

BREAKING: Kimi K2.6 takes 1st overall among open-weight models on Design Arena! Kimi K2.6 is in the same performance band as Claude Opus 4.7 - while establishing a new price vs. preference frontier. Huge congratulations to the @Kimi_Moonshot team!

2,472

The Intelligence Company retweeted

Kimi.ai

@Kimi_Moonshot

Apr 23

We're the top open-weights model on Design Arena!

Design Arena

@Designarena

Apr 23

930

39,801