Benchmarking your favorite LLM

Joined February 2009
Alex Vikati retweeted
1/8 Mythos / Glasswing is clearly the main AI security story now: AI finding real vulnerabilities in existing production code. For most teams, though, this question is more immediate: Can an agent like Claude Code write a secure app in the first place?
The Claude Code leak shows a clear divide: select vendors gain small but compounding advantages. Everyone else gets generic UX install friction. Direct impact on roadmaps and GTM.
1/9 The most interesting thing about the Claude Code leak for devtool companies: Anthropic hardcoded 120 vendor names across 7 different systems in the source. Anthropic explicitly included your tool name in the code (or they didn’t 🤷🏻‍♂️) Thread 👇
Alex Vikati retweeted
After the OpenAI/Astral acquisition announcement, we ran a benchmark on their tools. Turns out Astral tools were already a core part of the Codex (and Claude Code) workflow for Python developers. Ruff and uv came out on top in 75% of cases for linting, packaging, and pretty much everything else. Full report: amplifying.ai/research/astra…
Alex Vikati retweeted
1) Winners & losers from our OpenAI Codex vs Claude Code Picks benchmarks
Past category winners relied heavily on marketing and sales, and GTM still matters to get in front of coding agents. But agents won't keep using you unless you actually make them better (since the agents are trained with their harness). That shifts long-term value toward real product quality, not just distribution.
Feb 26
every category leader in this list should be worth at least $5b btw, because koding agents will be recommending them for the next 5 years. infra is stickier than agents. (full disclosure: am smol resend angel)
Alex Vikati retweeted
My friend @gentschev started using @Railway , @sentry and @resend for his homeschooltools.net side project because of Claude Code's recommendations. The impact is real.
Replying to @edwin
You’re right, Railway was a Claude Code recommendation. I’m also using Sentry and Resend. Your list seems pretty spot on.
Coding agents like Claude are a massive new distribution channel for infra providers. @Railway (~$124M raised) crushes peers like @render (~$258M raised) and @flydotio (~$115M raised) in Claude Code preferences. Will agent recommendations drive outsized growth?
Replying to @edwin
3/9 Deployment (Python) Winner: @Railway – 82% picks. The dark horse: none of Vercel's hype machine, but quietly eating away tons of non-Next.js hosting from everyone (AWS, GCP, Heroku, Render, Fly, etc). When IPO? Simplicity wins. @JustJake Railway is stealthily dominating.
Winners and losers from our Claude Code Picks benchmarks!
1/9 Winners & losers from our Claude Code Picks benchmarks that measure what Sonnet 4.5, Opus 4.5, and Opus 4.6 default to when building apps.
Alex Vikati retweeted
Replying to @vikati
@vikati and I analyzed 2,430 Claude Code repo decisions. Claude Code never picked AWS or GCP for deployment. If agents are writing the first version of new projects, they’re influencing which tools get adopted at scale.
Alex Vikati retweeted
13 May 2025
1/ AI is changing how brands appear in search. Try searching "best acoustic guitar under $500" in Google, ChatGPT, and Perplexity. You'll get three completely different sets of recommendations citing different sources.
22 Oct 2018
We just released a new Tether dataset that covers every transaction up to block 546906. Go to blockspur.com/tether/downloa… if you are interested in doing primary Tether data analysis.

19 Oct 2018
I just published Tether vs Other Stablecoins medium.com/@vikati/tether-vs…
13 Jun 2018
I just published medium.com/@vikati/ranking-e… - a data-driven post on the actual usage of Ethereum smart contracts. In the long run, it is users, traffic, and revenues that will drive real value for decentralized networks.
8 Feb 2018
I just published “A Closer Look At Tether’s Blockchain” medium.com/p/a-closer-look-a…

5 Jan 2018
I just published “The Rise of the Model Servers” medium.com/p/the-rise-of-the…

Alex Vikati retweeted
19 Sep 2017
Thanks Peter Norvig for stopping by to talk feature engineering infrastructure with @TinyDataCo founder @vikati #TheAIConf
Alex Vikati retweeted
17 Dec 2010
CastTV is acquired by Tribune Media! http://bit.ly/casttv-tms #dfj #svangel #pmarca
