Rishabh Srivastava

Tweets

Pinned Tweet

Rishabh Srivastava

@rishdotblog

4 Dec 2025

I genuinely think we built the best search engine for official economic data. Been working on this for 6 months. We spent ~$100k in tokens to structure economic data and make it easier to search. It answers economic data questions really well. From "What has been the actual impact of AI on software engineering jobs in the last 2 years?" to "Why did egg prices increase so much more than chicken prices in the last 5 years?" Would love feedback (the more blunt the better). We have a generous free tier for the next week!

FactIQ

@tryfactiq

4 Dec 2025

Excited to launch FactIQ today! 🚀 We just indexed 7.4M official US data series to build the ultimate economic research agent. Visualize trends instantly. Verify every source. Export charts for your reports. Free for the next week - try it out at factiq[dot]com!

Rishabh Srivastava

@rishdotblog

22h

We abandoned training our own models at Defog in 2024 (despite 2M Hugging Face downloads) after o1 was released. Our reasoning at the time - frontier models would get better and cheaper. It was just easier to use them via an API, and see gains automatically, rather than painfully handcraft our own. Fully realizing the risks of that decision today. Going to spend at least a third of my time on fine-tuning frontier open-source models moving forward. Can't build a business on shifting sands, gotta own your model weights.

Rishabh Srivastava

@rishdotblog

Jun 13

Welp. Time to start getting better at finetuning and relying on open weights again

Rishabh Srivastava

@rishdotblog

Jun 12

brain feels fried. trying to fablemax while it's still subsidized on the max plan. made more work progress in 2 days than i did all of last month. but constant context switching and "brain-must-go-brr" is exhausting

Rishabh Srivastava

@rishdotblog

Jun 11

friends don’t let friends use vector search without guardrails

NDTV

@ndtv

Jun 11

Blinkit Returns Chocolate Results For Typing Gibberish Like Toddlers, Customers Call It 'Terrifying' ndtv.com/offbeat/blinkit-ret…

Rishabh Srivastava retweeted

FactIQ

@tryfactiq

Jun 2

Today, we’re re-launching FactIQ. Six months ago, we launched a search and visualization engine for economic data. You could ask for a dataset, find the right series, and turn it into a chart. But over the last few months, the role of agents has changed. They are no longer just useful for saving a few minutes on repetitive work. They now act like a tireless second pair of eyes. Digging through data, testing competing explanations, and surfacing evidence you may not have looked out for. That opened up a much bigger opportunity for FactIQ. Macro analysts don’t just need another way to make charts. They need to answer the hardest and most important question in research: what’s actually happening? Answering that means looking beyond the obvious narrative. Beyond the single indicator everyone is watching. Beyond the chart that confirms what you already believed. The new FactIQ turns a macro question into an investigation. It breaks the question into the explanations that could be true, searches across official data, global institutions, government releases, news, and trusted industry sources, and tests which explanations are actually supported by the evidence. The goal is simple: give every macro analyst the capabilities of an institutional research desk. Use FactIQ to write macro notes, brief clients, support investment decisions, or pressure-test your view before publishing it. Try it today at factiq.com. We would love to know what you think.

Rishabh Srivastava

@rishdotblog

May 28

I like Opus 4.8 so far!
- way more token efficient than 4.7
- clearly better at financial analysis, dataviz, and writing
- far less hedging and handwavy explanations
- works extremely well in non-Anthropic harnesses, too
Fantastic for finance/econ agents

Rishabh Srivastava

@rishdotblog

May 28

Here's a report re AI and power in the US that it just produced with the @tryfactiq harness: factiq.com/share/a7eddc43203… Super sharp, and much much much better than what we were getting with 4.7

Rishabh Srivastava

@rishdotblog

May 28

Jfc Asian public markets are irrational when it comes to AI. Zhipu (GLM maker) is trading at a US$92B valuation, with ttm sales of ~US$100M (920x multiple). Wild to see a public co trading on Series A metrics. The bubble is already here. It's just not evenly distributed.

Rishabh Srivastava

@rishdotblog

May 20

Sadness. Gemini 3.5 Flash is as haunted as 3 Flash. I _really_ wanted it to work. But it's totally broken in non-Google harnesses. Way slower (and worse) than GPT-5.5/Opus at tool-chaining - despite the high output tok/s

Rishabh Srivastava

@rishdotblog

Jan 7

WTF did Google do to Gemini 3 Flash during post-training? It's a tortured model. If given tasks it can't do (because of insufficient tools), it just... keeps trying. Making failed tool calls 100 times. Even if explicitly told in the system prompt that it's okay to give up.

Rishabh Srivastava

@rishdotblog

Apr 27

didn't expect this, but codex with gpt-5.5 medium has become my daily driver
only situations where i use something else are:
- complex backend work (use gpt-5.5 xhigh for this)
- initial ideation with vague prompts (claude code w opus 4.6)
- UI work (opus 4.7)

Rishabh Srivastava

@rishdotblog

Apr 27

codex xhigh just unslopified a gnarly file that was a 3000 line mess. had tried everything (opus 4.6, 4.7, 5.4, 5.3-codex) to refactor this. none of those had worked without causing new regressions or race conditions. 5.5 xhigh one shotted it

Rishabh Srivastava

@rishdotblog

Apr 26

for frontend agents, don't go from specs to code directly. instead, go specs -> gpt-image-2 -> frontend code. beats any coding agent out there. phenomenal tip from @reach_vb!

Rishabh Srivastava

@rishdotblog

Apr 25

Early GPT-5.5 impressions - finally an OAI model that matches Claude at tool calls. Until yesterday, Opus/Sonnet were the only reliable options if you wanted to build a fast, long-running agent. GPT-5.4 was good, but thought too much, and was super slow. It also polluted the context with too many thinking tokens and so had degraded performance at long-running tasks. Gemini models are... just weird at tool calling - they often get stuck in infinite loops. GPT-5.5 costs slightly less than Opus across the tool calling loop, is just as fast, and just as good. I like its personality more - much less hedging (especially for things like financial analysis) and more to the point. It's also much more broadly useful. Codex with GPT-5.4 was pretty good at code, but Opus was just better for general tasks. GPT-5.5 feels super competent across the board. Really excited for this release. Makes the market for AI-agent LLMs competitive again!

Rishabh Srivastava

@rishdotblog

Apr 22

The last time I dealt with model regressions this bad was in the GPT-4 era. Sonnet is borderline unusable today. Anthropic will lose so much market share if they don't secure more compute, and fast.

Rishabh Srivastava

@rishdotblog

Apr 22

Something _very_ weird is up with Sonnet 4.6 rn. Tool calls are totally broken 🫠

Rishabh Srivastava

@rishdotblog

Apr 10

jfc this was less than 5 years ago. slowly at first, then all at once

Rishabh Srivastava

@rishdotblog

12 Aug 2021

Nearly 1AM in Singapore, but super excited to try the @OpenAI Codex challenge (and play around with Codex) Starts in about 10 minutes – join in if you're free! challenge.openai.com

Rishabh Srivastava

@rishdotblog

Apr 10

*this* used to be impressive LLM generated code back in the day x.com/rishdotblog/status/142…

Rishabh Srivastava

@rishdotblog

13 Aug 2021

Been playing around with Codex all day. It’s great at generating SQL queries from simple English. It got requests like this correct almost every single time! This brings us MUCH closer to sci-fi (Jarvis/LCARS) like computers that can give us data as we speak to them :D

Rishabh Srivastava

@rishdotblog

Apr 1

Welp, it's starting to happen. Under-discussed from the JOLTS report yesterday

Rishabh Srivastava

@rishdotblog

Mar 13

Sigh. Does everyone saying "it will be expensive to maintain AI generated code" not know about --dangerously-skip-permissions? Give Claude Code / Codex a sandbox, let it poll for updates/changes, auto push and review PRs, set up CI/CD. Automated, better-than-human maintenance 🤷🏽‍♂️

Rishabh Srivastava

@rishdotblog

Mar 11

sigh. i hope anthropic changes their auth provider soon (or better yet - use a homebrewed solution). their current one seems very broken
