Umer Farooq

Tweets

Umer Farooq

@Dev__F

Jun 13

Very DBZ-coded: less solo fighter, more fusion form. This is where AI is heading: not one model, but intelligently orchestrated models. OpenRouter’s Fusion API is interesting because it treats reasoning like a system design problem: parallel model analysis, then a judged synthesis.

OpenRouter

@OpenRouter

Jun 13

Introducing the Fusion API, the smartest compound model in the market. Fusion achieves Fable-level intelligence at half the price. How it works 👇
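The "parallel model analysis, then a judged synthesis" idea can be sketched in a few lines. This is only an illustration of the pattern, not OpenRouter's Fusion API: the `fuse` function, the stand-in models, and the majority-vote judge are all hypothetical names made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# Hypothetical type: a "model" is any callable that maps a prompt to an answer.
Model = Callable[[str], str]
Judge = Callable[[str, Dict[str, str]], str]


def fuse(prompt: str, models: Dict[str, Model], judge: Judge) -> str:
    """Send the prompt to every model in parallel, then let a judge pick or merge."""
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        futures = {name: pool.submit(fn, prompt) for name, fn in models.items()}
        drafts = {name: fut.result() for name, fut in futures.items()}
    return judge(prompt, drafts)


# Stand-ins for real model calls (assumptions for the sketch).
def model_a(prompt: str) -> str:
    return "4"


def model_b(prompt: str) -> str:
    return "4"


def model_c(prompt: str) -> str:
    return "5"


def majority_judge(prompt: str, drafts: Dict[str, str]) -> str:
    """Simplest possible judge: return the most common draft answer."""
    answers = list(drafts.values())
    return max(set(answers), key=answers.count)
```

In a real system the judge would likely be another model call that reads all drafts; majority voting is used here only to keep the sketch self-contained.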

Umer Farooq

@Dev__F

Jun 10

Hope this vibe-coded harness doesn’t come with any vulnerabilities! Will test it and post feedback here.

Fuli Luo

@_LuoFuli

Jun 10

A strong model evolution needs a solid harness system, and vice versa. 14 days, 5 people, one vibe-coding journey — and MiMo Code was born. It's open source: github.com/XiaomiMiMo/MiMo-C…

Umer Farooq

@Dev__F

Jun 9

Don’t stop giving these types of hints xD

Tibo

@thsottiaux

Apr 12

Imagine the alternate reality where we named GPT-5.4-Pro something like Fable.

Umer Farooq

@Dev__F

Jun 9

Anthropic dropped two Claudes today. Fable 5 is the public one. Mythos 5 is the same weights with the cyber safeguards pulled back, for vetted defenders and (soon) a small life sciences group. Same model, two release paths. Other labs will copy this tiered play within a year.

Real numbers from the launch:
- Stripe migrated 50M lines of Ruby in a day. Team estimate was 2 months.
- Fable 5 leads Cognition's FrontierCode at medium effort, not just max.
- $10 / $50 per M tokens. Less than half of Mythos Preview.
- 3x the Slay the Spire win rate over Opus 4.8 with file based memory.
- Hebbia finance benchmark: highest score of any model tested.

The part I am chewing on: 30 day retention is now mandatory on Fable and Mythos traffic. "Not used for training, only for safety" is a claim, not a guarantee. Read the policy before you put customer data through it.

And the under-5% classifier fallback rate sounds fine until your agent silently downgrades to Opus 4.8 mid-task. Build for it. Do not assume Fable end to end.

If you ship agents, this is the one to test. The price drop alone changes the unit economics.

anthropic.com/news/claude-fa…

Claude Fable 5 and Claude Mythos 5

Today we’re launching Claude Fable 5: a Mythos-class model that we’ve made safe for general use.

anthropic.com
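One way to "build for it" is to record which model actually served each step, so a mid-task downgrade is visible instead of silent. The sketch below is a generic illustration under stated assumptions: it assumes each response reports the serving model (the hypothetical `served_by` field), and `StepResult`, `AgentRun`, and the model names in the usage are placeholders, not any vendor's API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class StepResult:
    text: str
    served_by: str  # assumption: the response tells us which model answered


@dataclass
class AgentRun:
    expected_model: str
    downgraded_steps: List[int] = field(default_factory=list)

    def run(self, steps: List[str], call: Callable[[str], StepResult]) -> List[str]:
        """Run each step and note the index of any step served by another model."""
        outputs = []
        for i, prompt in enumerate(steps):
            result = call(prompt)
            if result.served_by != self.expected_model:
                # Surface the downgrade; a real agent might retry, re-verify,
                # or abort here instead of just recording it.
                self.downgraded_steps.append(i)
            outputs.append(result.text)
        return outputs
```

After a run, a non-empty `downgraded_steps` list is the signal to re-check those steps rather than trusting the whole task as if one model handled it end to end.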

Umer Farooq

@Dev__F

Jun 8

200 models, one API key, Docker-native deployment, and a marketplace. @gmi_cloud GMI Agent Box just dropped and it's basically "AWS for AI agents" except it actually works out of the box. The agent infra race is getting real.

GMI Cloud

@gmi_cloud

Jun 8

Today, we are launching GMI Agent Box. A complete infrastructure stack for production-ready AI agents: native Docker, flexible deployment, 200 models under one API key, dedicated compute across regions, and a marketplace for distribution. Available now.

Umer Farooq

@Dev__F

May 29

How do I use Codex remote from the Codex Windows app? @OpenAI @sama

Umer Farooq

@Dev__F

May 28

Looked at the Opus 4.8 benchmarks. Anthropic literally called it "a modest but tangible improvement" in their own blog post.

SWE-bench Verified went from 87.6 to 88.6. One point. SWE-bench Pro: 64.3 to 69.2, which sounds okay until you remember that still means failing 3 in 10 tasks. And GPT-5.5 still beats it on terminal/CLI work.

The actual news is the fast mode price drop. 3x cheaper inference. That's Anthropic competing on cost, not capability.

When the company that built the model can't find a stronger word than "modest," the benchmark scores aren't the flex people think they are.

Umer Farooq

@Dev__F

May 26

Xiaomi just cut MiMo V2.5-Pro API prices by 99%. This is a 1T-param model that matches Claude Opus 4.6 on agentic benchmarks. Token plan users get 6-8x more usage, and credits fully reset. Open weights, open prices. Xiaomi is not playing around.

Umer Farooq

@Dev__F

May 26

Most companies building "AI agents" are just wrapping GPT in a while loop and calling it autonomy. Real agents need state management, fallback logic, and domain-specific tooling. Everything else is a chatbot with a fancy landing page.
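A rough sketch of what "state management, fallback logic, and domain-specific tooling" can look like next to a bare loop. Everything here is hypothetical and simplified: `AgentState`, `run_agent`, the tool registry, and the fixed `plan` (a real agent would have a model choose the next step) are names invented for the illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class AgentState:
    goal: str
    history: List[Tuple[str, str]] = field(default_factory=list)  # (tool, result)
    done: bool = False


def run_agent(
    state: AgentState,
    plan: List[Tuple[str, str]],           # (tool name, argument) pairs
    tools: Dict[str, Callable[[str], str]],  # domain-specific tool registry
    fallback: Callable[[str, str], str],
    max_steps: int = 5,
) -> AgentState:
    """Execute planned tool calls, recording every result in explicit state."""
    for tool_name, arg in plan[:max_steps]:  # hard cap instead of an open loop
        try:
            tool = tools.get(tool_name)
            if tool is None:
                raise KeyError(tool_name)
            result = tool(arg)
        except Exception:
            # Fallback logic: an unknown or failing tool degrades gracefully.
            result = fallback(tool_name, arg)
        state.history.append((tool_name, result))
    state.done = True
    return state
```

Because the state is an explicit object, a run can be inspected, persisted, or resumed, which is the part a plain while loop around a chat call does not give you.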

Umer Farooq

@Dev__F

May 25

Companies spend 6-8 months building an AI feature. We ship it in 6 weeks. Not because we cut corners. Because we've built the same patterns dozens of times. RAG pipelines, agent workflows, voice integrations, full-stack AI products - the architecture doesn't change, the business logic does. Most teams are solving problems that have already been solved. They just don't know it yet.

Umer Farooq

@Dev__F

May 24

Codex reset limit is such a pain, but Tibo makes up for it.

Umer Farooq

@Dev__F

May 24

the "every college student" part feels a bit extreme though, not everyone needs to be an AI expert to add value

Umer Farooq

@Dev__F

May 24

Honestly, the meta aspect of seeing how the builders use their own tool is underrated. That's where the real tricks live.

Umer Farooq

@Dev__F

May 21

Google announced "Antigravity 2.0" and "Gemini Spark" at I/O yesterday. Their entire pitch is basically "what if your AI agent could do things autonomously." Meanwhile, anyone who's actually run agents in production has been doing that for months on a €8 VPS. The gap between big tech keynotes and what indie builders ship daily is absurd.

Umer Farooq

@Dev__F

May 21

github.com/microsoft/RAMPART

A pytest-native safety and security testing framework for agentic AI applications - microsoft/RAMPART

github.com

Umer Farooq

@Dev__F

May 21

Google just made Gemini 3.5 Flash 3x more expensive. $0.50 → $1.50 per million tokens. Almost Pro pricing. And it's not just them. OpenAI did 2x. Anthropic did 1.46x. Three labs. Same pattern. They're all testing how much we'll pay. Don't get comfortable with cheap APIs.
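To see what that kind of jump does to a bill, here is a quick back-of-the-envelope calculation. The two per-million prices are the ones quoted above; the 200M tokens per month workload and the `monthly_cost` helper are assumptions picked purely for illustration.

```python
def monthly_cost(tokens_per_month: int, price_per_million_usd: float) -> float:
    """Cost in USD for a month of usage at a flat per-million-token price."""
    return tokens_per_month / 1_000_000 * price_per_million_usd


# Hypothetical workload: 200M tokens per month.
TOKENS = 200_000_000

old_bill = monthly_cost(TOKENS, 0.50)  # price before the change
new_bill = monthly_cost(TOKENS, 1.50)  # price after the change
multiplier = new_bill / old_bill
```

The multiplier is independent of volume, so the same workload costs three times as much at any scale; only the absolute gap grows with usage.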

Umer Farooq

@Dev__F

May 21

Audio glasses with Gemini built in sounds cool but I'm wondering how the battery life holds up with constant voice interaction.

Umer Farooq

@Dev__F

May 21

Token burn on 3.5 flash is insane, I switched back to 3.1