MTS @firecrawl • OSS git.new/devdigest • YT • Husband & Father of 3 💙💜

Joined January 2023
296 Photos and videos
Pinned Tweet
Open Lovable is live and now has 6,000 GitHub stars 💜 💙 It's a Next.js app I built that instantly reimagines any website and generates full React apps in seconds. Powered by @firecrawl, @GroqInc, @e2b and more. Here's a complete breakdown of the project in 4 minutes 👇
19
47
447
49,909
Developers Digest retweeted
Prepare for takeoff. ✈️ Flight simulator is now available globally on web to all users. goo.gle/4fBYnWO We've recently added many of our most powerful professional desktop features to web. Elevation profiles, new import types, but there's always been one other feature you've been asking us to add to the web version of Google Earth, just for fun... Where will you fly? Share your best maneuvers, views, and flyovers with us!
412
3,689
29,169
7,824,883
🤔
JUST IN: Andrej Karpathy, a top AI scientist at Anthropic, is reportedly barred from accessing the company’s most advanced AI model because he is not a U.S. citizen.
196
Developers Digest retweeted
The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Claude models is not affected. We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible. Read our full statement: anthropic.com/news/fable-myt…
12,239
25,463
86,449
85,534,852
Developers Digest retweeted
🌘 Kimi-K2.7-Code, our latest coding model, is now released and open-sourced! 🔷 Improved coding & agent performance over K2.6: 21.8% on Kimi Code Bench v2, 11.0% on Program Bench, and 31.5% on MLS Bench Lite. 🔷 Reasoning efficiency: Less overthinking, with 30% lower reasoning-token usage compared to K2.6. 🔷 Long-horizon coding: Improved instruction following, higher end-to-end coding task success rates. ⚡️ 6x High-Speed Mode coming soon! 🔌 Available today via Kimi API and Kimi Code. 🔗 Kimi Code: kimi.com/code 🔗 API: platform.moonshot.ai
599
1,602
13,454
1,882,007
Developers Digest retweeted
We've updated the Artificial Analysis Coding Agent Index, replacing SWE-Bench Pro with Datacurve's DeepSWE benchmark - the swap lifts Codex with GPT-5.5 (xhigh) above Claude Code with Opus 4.8 (max), while the newly released Claude Fable 5 (max) in Claude Code debuts at the top. DeepSWE, built by @datacurve, writes its tasks from scratch rather than adapting them from public GitHub issues or pull requests, so no model has seen the solutions during training. That matters because SWE-Bench Pro, the benchmark it replaces in our Coding Agent Index, had grown gameable, with some models recovering the fix from the repository's commit history instead of solving the task. The swap reorders the index: Codex with GPT-5.5 (xhigh) rises from 65 to 76, overtaking Claude Code with Opus 4.8 (max) at 73. Claude Code with Fable 5 (max), which enters directly on the refreshed index, leads at 77. SWE-Bench Pro had been flattering some combinations and penalizing others. More below.
104
184
1,884
524,858
Developers Digest retweeted
Replying to @emilheap @OpenAI
Yes you will now be able to control when Tibo's resets apply
1
3
22
593
Great addition ➕
enable provider side tool search with this new middleware! this allows you to attach TONS of tools to your agent and use progressive disclosure to not overload the context window
1
2
176
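The progressive-disclosure idea in the tool-search post above can be sketched roughly as follows. This is a minimal illustration, not the middleware the post refers to: `Tool`, `ToolSearchIndex`, and the keyword-overlap scoring are hypothetical names and choices made for this example. The point is only that the agent starts with a single search entry point and receives full tool definitions on demand, so the context window never holds the whole catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    description: str


class ToolSearchIndex:
    """Hypothetical registry: tools are disclosed only when a query matches them."""

    def __init__(self, tools: list[Tool]) -> None:
        self._tools = list(tools)

    def search(self, query: str, limit: int = 3) -> list[Tool]:
        # Naive keyword overlap (an assumption for this sketch); a real
        # system would more likely use embeddings or a provider-side index.
        terms = set(query.lower().split())
        scored = []
        for tool in self._tools:
            text = f"{tool.name} {tool.description}".lower().replace("_", " ")
            score = len(terms & set(text.split()))
            if score > 0:
                scored.append((score, tool))
        # Highest score first; ties broken by name for stable output.
        scored.sort(key=lambda pair: (-pair[0], pair[1].name))
        return [tool for _, tool in scored[:limit]]


tools = [
    Tool("send_email", "send an email message to a recipient"),
    Tool("create_invoice", "create a billing invoice for a customer"),
    Tool("search_flights", "search airline flights between two cities"),
    Tool("refund_payment", "refund a customer payment"),
]
index = ToolSearchIndex(tools)

# Only the matching definitions would be placed into the agent's context.
matches = index.search("refund the customer payment", limit=2)
print([tool.name for tool in matches])
```

With a catalog of hundreds of tools, only the `limit` best matches per query would reach the model, which is the effect the post describes.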
Developers Digest retweeted
Introducing Prometheus, an experimental Forward Deployed Agent for web data. Describe the web data you need and it writes Firecrawl code to collect it. Run it yourself or let us host and automatically maintain it as pages change. Try it with Claude Fable 5 for free this week!
35
40
531
66,566
Developers Digest retweeted
Recently, we purchased one of each Anthropic/OpenAI subscription plan and randomly ran long horizon coding tasks until we exhausted the weekly limit. It's widely believed that a $200/month plan maxes out at ~$2000/month worth of tokens (assuming API pricing). However, we found that the subscriptions are actually far more generous. (2/4)
184
571
6,026
3,453,310
Developers Digest retweeted
Jun 10
We partnered with @trajectorylabs to post-train NVIDIA Nemotron 3 Ultra for legal. Here’s what we found: 1) Open-weight models can reach frontier legal performance. On our Legal Agent Benchmark (LAB), Nemotron 3 Ultra started at a 0% all-pass rate. After post-training, it reached 5.8%, placing it between Sonnet 4.6 at 4.2% and Opus 4.6 at 6.6%. 2) Post-training dramatically improves reliability. Before training, many held-out tasks missed enough rubric dimensions to land around ~70% pass rates. After training, those tasks shifted toward ~95% pass rates. 3) Open-weight performance comes at much lower cost. Post-trained Nemotron 3 Ultra reached a similar quality band to leading closed models while running at roughly 1/8th to 1/50th the per-token price of Sonnet 4.6 and Opus 4.6. Most importantly: we post-trained this model on the @trajectorylabs platform less than 24 hours after Nemotron 3 Ultra launched, using the same harness, data, and recipe we used for Nemotron 3 Super. More to come as we continue to experiment with open-weight legal agents. Read more on post-training with Trajectory below:
1/ We post-trained @nvidia Nemotron 3 Ultra on @harvey Legal Agent Bench in under 24 hours. The result: an open model reaching the same band as leading closed models on legal work, at a fraction of the cost. The correlating story: when a new open model ships, Trajectory can turn it into a specialized agent almost immediately.
11
30
271
46,170
Developers Digest retweeted
.@tfadell: "Every new product needs three generations to get right. You make the product, then you fix the product, then you fix the business. Even the iPod, it took three generations before it became successful."
Tony Fadell's resume:
Co-created the iPhone → $2.3 trillion in sales
Created the iPod → saved Apple from bankruptcy
Founded Nest → AI in your home 11 years before ChatGPT
I asked him about everything he's learned:
🔸 Why opinion-based decisions are essential for v1 products
🔸 Why marketing matters as much as the product itself
🔸 Why taste is the biggest moat in AI
🔸 His prediction for the next breakthrough consumer device
🔸 Why "cognitive surrender" to AI is the biggest risk for builders
Listen now 👇 youtu.be/RJjl1TwyfWM
20
17
194
62,887
Developers Digest retweeted
To celebrate the Fable 5 launch, we just reset 5-hour and weekly limits for all users across our products! Enjoy 🚀
58
29
916
297,113
Developers Digest retweeted
We're betting on the next 1B users being agents, so we're launching agent signups. Ask your agent to add Firecrawl, instantly claim your API key, then pull web data in seconds. Works with Codex, Claude Code & Grok Build, all powered by auth.md from @WorkOS 🔥
23
29
398
43,496
Agentception
Just landed nested subagent support in Claude Code. Starting to experiment more with agents kicking off agents as a way to better manage context. Capped at depth=5 to start, going out in today’s release. Lmk what you think!
1
228
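The depth cap mentioned in the nested-subagent post above can be illustrated with a small sketch. Nothing here reflects how Claude Code implements it: `run_agent`, `DepthLimitExceeded`, and the convention that the top-level agent sits at depth 0 are all assumptions made for this example. It only shows the general pattern of agents spawning agents while a hard limit stops runaway recursion.

```python
MAX_DEPTH = 5  # the cap quoted in the post; counting from 0 is an assumption


class DepthLimitExceeded(RuntimeError):
    """Raised when a subagent would be created below the allowed depth."""


def run_agent(task: str, depth: int = 0) -> dict:
    """Hypothetical runner: returns a tree describing which agents were spawned."""
    if depth > MAX_DEPTH:
        raise DepthLimitExceeded(f"depth {depth} exceeds cap {MAX_DEPTH}")
    node = {"task": task, "depth": depth, "children": []}
    # Toy delegation policy: every agent hands off one subtask until the
    # cap is reached. A real agent would decide this from the task itself.
    if depth < MAX_DEPTH:
        node["children"].append(run_agent(f"{task} > sub", depth + 1))
    return node


def deepest(node: dict) -> int:
    """Return the greatest depth reached anywhere in the agent tree."""
    return max([node["depth"]] + [deepest(child) for child in node["children"]])


tree = run_agent("refactor module")
print(deepest(tree))
```

Each nested agent starts with its own fresh context, which is the context-management benefit the post points to; the cap simply bounds how far that delegation can go.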
Developers Digest retweeted

75
399
2,989
940,949
Developers Digest retweeted
Introducing FrontierCode: a coding eval that raises the bar for difficulty & quality. Each task took 40 hrs of work by leading open-source maintainers. Models write sloppy code that works but isn’t maintainable. Our eval is first to measure: would you actually merge this code?
234
313
4,285
2,507,920
💯 “good tools are cached intelligence for agents”
Token costs are why there will be no saas apocalypse / good dev tools are cached intelligence for agents! The popular theory goes: agents can write code, so they'll just rebuild every tool from scratch and hit raw APIs. no more dev tools, no more CLIs, no more software layers. just agents and endpoints! We just tested this and the data says the opposite. We benchmarked Claude Code and Codex on real Hugging Face Hub tasks (~1,000 graded runs), with two setups: the agent-optimized hf CLI vs the agent hand-rolling curl or SDK calls from scratch. Hand-rolling burns up to 6x more tokens on multi-step tasks and fails more often (84% vs 94% task success). And that's just dropping one abstraction layer. It would obviously be orders of magnitude more tokens and a dramatically higher failure rate if the agent tried to bypass HF altogether and rebuild model hosting, versioning, and distribution from scratch. Every time an agent re-derives a workflow from raw API calls, you pay for that reasoning in tokens. every single run. a good CLI compresses that entire chain into a few high-level commands the agent can't get wrong. In a world where everyone is complaining tokens are too expensive, abstraction is leverage: thousands of hours of design decisions your agent doesn't have to re-reason about at inference time. Good tools are cached intelligence for agents! So no, agents won't rebuild everything from scratch. they'll gravitate to the most token-efficient tools, because that's what their owners pay for. The software that survives won't just be accessible to agents, it will be accurate and cheap for them to drive. We're seeing it happen with HF, which is becoming the platform for agents to use AI: ~49M requests in just two months, and growing fast! huggingface.co/blog/hf-cli-f…
1
1
318
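The two figures quoted in the post above (up to 6x more tokens, 84% vs 94% task success) compound once failed runs have to be retried. The sketch below works through that arithmetic under stated assumptions that the post does not make: attempts are independent, every attempt costs the same number of tokens, and the 6x upper bound is applied across the board. `tokens_per_success` is a hypothetical helper, and the baseline of 1.0 is a normalized unit rather than a measured value.

```python
def tokens_per_success(tokens_per_run: float, success_rate: float) -> float:
    """Expected tokens spent per successful task when failures are retried.

    With independent attempts, the expected number of attempts until the
    first success is 1 / success_rate.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return tokens_per_run / success_rate


CLI_TOKENS = 1.0                       # normalized baseline (assumption)
HAND_ROLLED_TOKENS = 6.0 * CLI_TOKENS  # "up to 6x" from the post, used as an upper bound

cli_cost = tokens_per_success(CLI_TOKENS, 0.94)
hand_rolled_cost = tokens_per_success(HAND_ROLLED_TOKENS, 0.84)

ratio = hand_rolled_cost / cli_cost
print(f"{ratio:.2f}x tokens per successful task")
```

Under these assumptions the gap per successful task is somewhat wider than the raw 6x per run, because the lower success rate adds retries on top of the larger per-run cost.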
Developers Digest retweeted
When I first became a dad I was genuinely worried my career would suffer. The opposite happened. 3 things changed that I wasn't expecting. First, a child cuts the filler from your life instantly. I used to sit at my desk for 14 hours and feel like I was crushing it when in reality maybe 4 of those hours were actual work and the rest was meetings that didn't need to happen, scroll sessions I told myself were research, and "quick calls" that turned into 90 minutes of nothing. A child deletes all of that overnight. Because you literally don't have the time anymore. Every hour matters in a way it didn't before. You could be with your kid, working on your startup, exercising, having dinner with your wife, sleeping. When your time is actually full of things you care about, the filler can't survive. I'm shipping more now than before my kid was born. Half the meetings. Faster decisions. I stopped saying yes to things out of politeness because my time has a very real cost now that I can feel in my bones. Second, your risk tolerance goes up, not down. Everyone assumes having a kid makes you play it safe. For me it created this urgency to build something real while my kid is young enough to not remember the hard parts. That urgency is more useful than any productivity system I've ever tried. Third, your thinking just gets clearer. I don't know how else to explain it. You stop deliberating for days and just make the call. You stop chasing every opportunity and only chase the ones that actually excite you. Something about being responsible for another human being gives you this filter that cuts through the noise instantly. Before my kid, I'd go back and forth on a decision for a week. Now I make it by lunch and move on. I used to think having a kid was the thing I'd do after I built the company. Turns out the kid made me better at building the company. Wish someone had told me that sooner. So I'm telling you. I know this sounds like something a new dad says to justify it. 
I thought the same thing when other dads told me. Then it happened to me and I understood. I think you will too.
197
87
1,464
160,257
Developers Digest retweeted
Jun 5
Software platforms are going to be rebuilt for agent-first.
910
793
9,986
617,384
Developers Digest retweeted
Huge milestone here. Firecrawl is making something agents want.
We've now fetched 8,000,000,000 pages at Firecrawl 🔥 A few other milestones in 2 short years:
- 1.25M developers
- 150K companies using us
- 125K GitHub stars (top 100 repo)
- 2.5M weekly downloads on npm and PyPI
Thanks for building with us & we're just getting started!
32
22
629
92,294