Requesty

Tweets

Requesty

@RequestyAI

Jun 9

Mythos Live on @RequestyAI!

Requesty

@RequestyAI

Jun 1

Our friends at @MiniMax_AI are doing a tremendous job! Now available on requesty.ai

Requesty - AI Gateway for 400 Models

AI gateway and LLM router for 400 models. Intelligent routing, caching, failover, observability, and enterprise governance through a single OpenAI-compatible API.

requesty.ai

MiniMax (official)

@MiniMax_AI

Jun 1

Introducing MiniMax M3: The First Open-Weights Model to Combine Three Frontier Capabilities
- Coding & Agentic Frontier: 59.0% SWE-Bench Pro, 66.0% Terminal Bench 2.1, 34.8% SWE-fficiency, 28.8% KernelBench Hard, 74.2% MCP Atlas
- MiniMax Sparse Attention scales context to 1M
- Natively Multimodal from Step Zero
API: platform.minimax.io
Token Plan: platform.minimax.io/subscrib…
🚀New! MiniMax Code: code.minimax.io
Weights & Tech Report in ~10 Days

Requesty

@RequestyAI

May 18

The Coding Agent Economy.
• $92 avg cost per active user / month
• Claude powers 92% of all coding agent spend (up from 68%)
• Cache hit rates jumped 52% → 86%
requesty.ai/coding-agent-eco…

The Coding Agent Economy | Requesty

An empirical 12-month study of coding agent economics across 9 AI coding tools.

requesty.ai

Requesty

@RequestyAI

May 15

The throughput density data suggests something counterintuitive: the highest throughput providers are not necessarily serving the largest requests. They are serving a massive number of relatively small generations extremely efficiently. A lot of AI infrastructure performance right now looks less like “big intelligence” and more like high frequency inference systems. Congrats @GroqInc requesty.ai/data/provider-th…

Provider throughput density, April 2026

How many tokens per second can each LLM provider sustain? In April 2026 on the Requesty gateway Groq led at 320 output tok/sec, 2.5× the next-fastest provider, attributable to its custom inference...

requesty.ai

Requesty

@RequestyAI

May 14

The surprising thing in the latency data is how compressed the top providers have become. For a lot of workloads, the gap between “fast” and “slow” providers is now smaller than the variance introduced by tool calls, long context, and agentic execution itself. Model latency is starting to matter less than workflow latency. Congrats @xai requesty.ai/data/provider-la…

Latency leaderboard per provider, April 2026

Which AI provider has the lowest latency in April 2026? On the Requesty gateway xAI led p50 at 0.6 s, with Novita (0.8 s), Azure (1.0 s) and Mistral (1.4 s) close behind. Vertex (Claude) was the...

requesty.ai

Requesty

@RequestyAI

May 13

Most AI teams have zero control over which models employees and agents can actually use. Today we’re launching Approved Models Access Lists in Requesty. You can now:
• approve models org-wide
• restrict models by API key or group
• enforce regional/compliance policies
• standardize model usage across teams
AI governance is becoming critical infrastructure. youtu.be/L36O7ST0Hb4

Proper AI Access Controls for AI Teams

Most AI teams have zero control over which models employees and age...

youtube.com
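
The layering described in the announcement above (an org-wide approved list, optionally narrowed per API key or group) can be pictured with a minimal sketch. This is an illustration only, not Requesty's implementation: the function names, model identifiers, and key names below are all hypothetical.

```python
# Hypothetical illustration of a two-level model access list.
# None of these names come from Requesty's API.

def is_model_allowed(model, org_approved, key_allowed=None):
    """Allow a model only if it is approved org-wide and, when the
    API key (or group) has its own list, also appears on that list."""
    if model not in org_approved:
        return False
    return key_allowed is None or model in key_allowed


# Assumed example policy: two org-approved models, one restricted key.
ORG_APPROVED = {"provider-a/model-x", "provider-b/model-y"}
KEY_LISTS = {"key-eu-team": {"provider-a/model-x"}}


def check(api_key, model):
    """Look up the key's own list (if any) and apply both levels."""
    return is_model_allowed(model, ORG_APPROVED, KEY_LISTS.get(api_key))


print(check("key-eu-team", "provider-a/model-x"))
print(check("key-eu-team", "provider-b/model-y"))
```

In this sketch a key without its own list inherits the org-wide list unchanged, which is one plausible default; a real policy engine might instead deny by default.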

Requesty

@RequestyAI

May 13

The open source model market is consolidating much faster than expected. A handful of OSS families now dominate traffic share while most new releases barely register. The gap between “models people talk about” and “models people actually use in production” is getting very large. @deepseek_ai is still dominating! Jan → Apr 2026 data from Requesty ↓ requesty.ai/data/oss-family-…

Family share within OSS-routed traffic, Nov 2025 - Apr 2026

Which open-weight AI model is most popular in 2026? On the Requesty gateway, OSS-routed traffic went from Qwen-dominated in late 2025 (34-38% share in Nov-Dec) to DeepSeek-dominated in January 2026...

requesty.ai

Requesty

@RequestyAI

May 12

The interesting metric is not tool call request share. It’s tool call token share. Once workflows become agentic, token consumption shifts dramatically toward tool execution:
• retrieval
• code output
• tool responses
• intermediate reasoning
The number of requests can look normal while the token profile completely changes. requesty.ai/data/tool-call-t…

Token-weighted tool_calls share per provider, April 2026

What share of LLM output tokens is spent on tool calls vs chat? In April 2026 on the Requesty gateway, Anthropic emitted 38.8% of its output tokens on `tool_calls` vs 54.2% of requests, so agentic...

requesty.ai
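
The difference between request share and token share can be made concrete with a small calculation. The sketch below assumes each completion is logged with a `finish_reason` and an output token count; the `Completion` record and the sample numbers are hypothetical and are not taken from the Requesty data.

```python
from dataclasses import dataclass


@dataclass
class Completion:
    finish_reason: str   # e.g. "tool_calls" or "stop"
    output_tokens: int


def tool_call_shares(completions):
    """Return (request_share, token_share) for completions whose
    finish_reason is "tool_calls"."""
    total_requests = len(completions)
    total_tokens = sum(c.output_tokens for c in completions)
    if total_requests == 0 or total_tokens == 0:
        return 0.0, 0.0
    tool = [c for c in completions if c.finish_reason == "tool_calls"]
    request_share = len(tool) / total_requests
    token_share = sum(c.output_tokens for c in tool) / total_tokens
    return request_share, token_share


# Made-up sample: half the requests end in tool calls,
# but those requests emit only a fifth of the output tokens.
sample = [
    Completion("tool_calls", 50),
    Completion("tool_calls", 50),
    Completion("stop", 300),
    Completion("stop", 100),
]
request_share, token_share = tool_call_shares(sample)
print(f"request share: {request_share:.0%}, token share: {token_share:.0%}")
```

The two ratios share a numerator population but different denominators, which is why they can diverge in either direction depending on how long tool-call outputs are relative to chat outputs.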

Requesty

@RequestyAI

May 11

One of the clearest signals of how people actually use AI might be finish reasons. Anthropic direct traffic is now 52% tool calls. OpenAI direct is just 3%. You can literally see the difference between conversational usage and agentic workflows in the data. April 2026 data from Requesty ↓ requesty.ai/data/finish-reas…

finish_reason mix per provider, April 2026

Which AI providers serve the most agentic traffic? In April 2026 Anthropic-direct returned `finish_reason = tool_calls` on 52% of successful completions on the Requesty gateway, about 2× the next...

requesty.ai

Requesty

@RequestyAI

May 7

Now live on Requesty! requesty.ai/models/vertex/ge…

gemini-3.1-flash-lite-preview API – Pricing, Context Window & Benchmarks | Requesty

gemini-3.1-flash-lite-preview API pricing: $0.25/1M input, $1.50/1M output, 1.0M context. Gemini 3.1 Flash Lite Preview is the most cost-efficient model in the Gemini family,… Benchmarks, specs and...

requesty.ai

Requesty

@RequestyAI

May 2

Something big dropping next week 👀

Requesty

@RequestyAI

Apr 29

Claude Cowork now works with every model via Requesty Gateway. EU-only routing. ZDR. 300 models. requesty.ai/claude-cowork

Claude Cowork on Requesty

Every model. Your region. Your rules. 300 frontier models inside Claude Cowork through Requesty.

requesty.ai

Requesty retweeted

Thibault Jaigu

@ThibaultJaigu

Apr 23

Replying to @AnthropicAI

@AnthropicAI now allows Gateways to be connected to their Claude app! I've been using it since yesterday and it's awesome! docs.requesty.ai/integration…

Requesty retweeted

Thibault Jaigu

@ThibaultJaigu

Apr 13

100% on agent-ready docs @buildwithfern! @RequestyAI

Requesty retweeted

Z.ai

@Zai_org

Apr 7

Special thanks to our launch partners, AI gateways, and inference providers. Access GLM-5.1 now:
- OpenRouter: openrouter.ai/z-ai/glm-5.1
- Vercel: vercel.com/ai-gateway/models…
- Requesty: requesty.ai/models/zai/glm-5…

Requesty

@RequestyAI

Mar 25

Out of 200,000 European B2B software companies, only 100 made the Cloud Challengers 2026 list. @RequestyAI is one of them. We're building the gateway layer for enterprise AI.

Requesty

@RequestyAI

20 Oct 2025

We just shipped tool call analytics 📊
The problem: Your AI agent is slow and expensive, but you have no idea which tool is causing it.
Now you can see exactly:
• Which tools are killing your latency
• Where your money is going per tool
• Success rates and failure patterns

Requesty retweeted

Thibault Jaigu

@ThibaultJaigu

26 Sep 2025

Extremely excited to share that we've raised a $3m seed round from @20vcFund, @TapestryVC and Insiders! Thank you for the support @codorniou @HarryStebbings @alexandre_dewez @Kieranleehill businessinsider.com/pitch-de…

This AI startup helps developers safely and cheaply build on top of LLMs from OpenAI and Anthropic....

Requesty says its tech tightens security and brings down the costs of accessing large language models. It raised $3 million.

businessinsider.com

Requesty

@RequestyAI

27 Aug 2025

Group & User-Based Limits for our Enterprise customers! Now you can set spend caps and quotas directly at the user or group level, no more relying only on API keys. Why this matters:
✅ Enforce limits per individual user
✅ Apply group-enforced limits synced from Okta or Azure

Requesty

@RequestyAI

26 Aug 2025

New in Requesty: Latency-Based Routing
We just launched latency-based routing for all Requesty users! Now, instead of relying on fallback chains, you can route requests to the fastest available model in real time.
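
As a rough picture of what routing to the fastest available model could look like, here is a minimal sketch that keeps a rolling window of observed latencies per model and picks the lowest median. It is an assumption-laden illustration, not Requesty's actual routing logic: the `LatencyRouter` class, the window size, the use of the median, and the model names are all hypothetical.

```python
from collections import defaultdict, deque
from statistics import median


class LatencyRouter:
    """Pick the model with the lowest median latency over recent requests."""

    def __init__(self, models, window=20):
        self.models = list(models)
        # Keep only the most recent `window` latency samples per model.
        self.samples = defaultdict(lambda: deque(maxlen=window))

    def record(self, model, latency_s):
        """Store one observed latency (in seconds) for a model."""
        self.samples[model].append(latency_s)

    def pick(self, available=None):
        """Return the fastest model, optionally limited to `available`."""
        candidates = [
            m for m in self.models if available is None or m in available
        ]
        if not candidates:
            raise ValueError("no available models")
        # Models with no samples score 0.0 so they get tried at least once.
        return min(
            candidates,
            key=lambda m: median(self.samples[m]) if self.samples[m] else 0.0,
        )


router = LatencyRouter(["model-a", "model-b"])
for latency in (1.2, 1.0):
    router.record("model-a", latency)
for latency in (0.6, 0.8):
    router.record("model-b", latency)
print(router.pick())
```

Unlike a fixed fallback chain, the choice here changes as new latency samples arrive; passing `available` lets the caller exclude models that are currently failing.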
