Dev

Tweets

Pinned Tweet

Dev

@debuggerdev

15 Aug 2025

“Meaning is not something you stumble across, like the answer to a riddle or the prize in a treasure hunt. Meaning is something you build into your life.”

Dev

@debuggerdev

May 5

You’re not "choosing to move" through spacetime
You are a path through spacetime

Dev retweeted

Andrej Karpathy

@karpathy

Apr 30

This is the quote I've been citing a lot recently.

kache

@yacineMTB

Feb 4

you can outsource your thinking but you cannot outsource your understanding

Dev

@debuggerdev

Apr 23

treat agents like a team, not a swarm
you don’t want 5 engineers editing the same file at once
you want:
one person writing
others reviewing, poking holes, suggesting directions
multi-agent works the same way
parallel insight
serialized decisions
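A minimal sketch of the "parallel insight, serialized decisions" idea, assuming a single writer that owns the draft and reviewers that only return notes. The names `Suggestion`, `review`, and `run_round` are hypothetical, and the reviewer body is a stub standing in for a real model call.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Suggestion:
    reviewer: str
    note: str


async def review(reviewer: str, draft: str) -> Suggestion:
    # Hypothetical reviewer: a real one would call a model here.
    await asyncio.sleep(0)  # yield control so reviewers can interleave
    return Suggestion(reviewer, f"{reviewer} read {len(draft)} chars")


async def run_round(draft: str, reviewers: list[str]) -> str:
    # Parallel insight: every reviewer reads the same draft concurrently,
    # and none of them is allowed to modify it.
    suggestions = await asyncio.gather(*(review(r, draft) for r in reviewers))
    # Serialized decisions: only the writer edits, one suggestion at a time.
    for s in suggestions:
        draft += f"\n# addressed: {s.note}"
    return draft


result = asyncio.run(run_round("def f(): pass", ["security", "perf"]))
```

Because reviewers never write to the draft, there is nothing to merge or lock; the only shared state is changed in one place, in order.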

Dev

@debuggerdev

Apr 15

exploring local agent setups (openclaw local models)
feels promising but unsure how it holds up beyond demos
anyone here running it for real workflows?

Dev

@debuggerdev

Apr 13

LMs are great at knowledge, but they struggle one layer above it
judgment
what will actually work
what’s just plausible but wrong
i’m noticing this a lot
they can generate many approaches
but don’t really know which one is worth pursuing
that part still falls on us

Dev

@debuggerdev

Apr 12

spent less time coding today
but way more time deciding
which AI output to keep
which to throw away
what actually makes sense
feels like the work didn’t go away
it just moved from typing to judgment

Dev

@debuggerdev

Apr 10

spent today on analytics bug fixes
nothing exciting to show
but these are the days where the product actually gets better

Dev

@debuggerdev

Apr 10

what looks like uneven AI progress isn’t randomness
it’s optimization pressure
clear reward, faster learning
high $$$, more focus
so some domains feel magical, others feel stuck

Andrej Karpathy

@karpathy

Apr 9

Judging by my tl there is a growing gap in understanding of AI capability. The first issue I think is around recency and tier of use. I think a lot of people tried the free tier of ChatGPT somewhere last year and allowed it to inform their views on AI a little too much. This is a group of reactions laughing at various quirks of the models, hallucinations, etc. Yes I also saw the viral videos of OpenAI's Advanced Voice mode fumbling simple queries like "should I drive or walk to the carwash". The thing is that these free and old/deprecated models don't reflect the capability in the latest round of state of the art agentic models of this year, especially OpenAI Codex and Claude Code. But that brings me to the second issue. Even if people paid $200/month to use the state of the art models, a lot of the capabilities are relatively "peaky" in highly technical areas. Typical queries around search, writing, advice, etc. are *not* the domain that has made the most noticeable and dramatic strides in capability. Partly, this is due to the technical details of reinforcement learning and its use of verifiable rewards. But partly, it's also because these use cases are not sufficiently prioritized by the companies in their hillclimbing because they don't lead to as much $$$ value. The goldmines are elsewhere, and the focus comes along. So that brings me to the second group of people, who *both* 1) pay for and use the state of the art frontier agentic models (OpenAI Codex / Claude Code) and 2) do so professionally in technical domains like programming, math and research. This group of people is subject to the highest amount of "AI Psychosis" because the recent improvements in these domains as of this year have been nothing short of staggering. When you hand a computer terminal to one of these models, you can now watch them melt programming problems that you'd normally expect to take days/weeks of work. 
It's this second group of people that assigns a much greater gravity to the capabilities, their slope, and various cyber-related repercussions. TLDR the people in these two groups are speaking past each other. It really is simultaneously the case that OpenAI's free and I think slightly orphaned (?) "Advanced Voice Mode" will fumble the dumbest questions in your Instagram's reels and *at the same time*, OpenAI's highest-tier and paid Codex model will go off for 1 hour to coherently restructure an entire code base, or find and exploit vulnerabilities in computer systems. This part really works and has made dramatic strides because 2 properties: 1) these domains offer explicit reward functions that are verifiable meaning they are easily amenable to reinforcement learning training (e.g. unit tests passed yes or no, in contrast to writing, which is much harder to explicitly judge), but also 2) they are a lot more valuable in b2b settings, meaning that the biggest fraction of the team is focused on improving them. So here we are.

Dev

@debuggerdev

Apr 9

i thought AI would reduce mistakes
but now i’m seeing a different problem
it makes everything look correct
nice structure
confident output
just enough to stop you from questioning it deeply

Dev

@debuggerdev

Apr 8

AI is making me accept things before i fully understand them
spent 1 hour chatting with codex
got a perfect looking solution
read it once
didn’t fully review it
but it worked
so I just accepted it
this would’ve scared me earlier
now it feels like i’ll be slow if i don’t

Dev

@debuggerdev

Apr 7

are LLMs starting to assume too much?
noticed this recently:
claude/codex used to ask for missing context
design decisions, edge cases, intent
now they just… assume and move forward
even when the repo clearly doesn’t have enough context
feels faster
but also more dangerous
should agents pause and ask more when context is incomplete?

Dev

@debuggerdev

Apr 6

how are you guys handling memory / context with tools like codex or claude over longer sessions?
i asked it to refactor some code
it removed a fallback we had added earlier (for a real bug we already hit)
reason? clean code
same model added it before
same model removed it later
feels like it optimizes locally without remembering why something exists
how are you dealing with this?

Dev

@debuggerdev

Apr 5

feels like we’re slowly realizing
bigger models ≠ smarter models
just better recall
the interesting part is when smaller models start making better decisions with less context
that’s when it starts to feel like intelligence

kitze the 🐐

@thekitze

Apr 5

chatgpt 5.4 vs gemma 4 😬

Dev

@debuggerdev

Apr 5

before AI, effort was a proxy for correctness
if it took time, you probably understood it
now effort is gone
so you can’t use time as a signal anymore
and that’s what feels uncomfortable

Dev

@debuggerdev

Apr 4

I asked Claude to:
1. write clean code with proper error handling
2. review it like a senior engineer
it missed things in step 1
and caught them in step 2
same model
same code
we didn’t build a system that sees everything
we built one that sees what you ask it to see

Dev

@debuggerdev

Apr 3

used AI to generate code
way faster now
but every time i ship something i didn’t fully understand
there’s this small doubt
like… did i actually build this
or just convince myself it works
feels like speed went up
but trust went down

Dev

@debuggerdev

Apr 2

my entire feed is people digging into claude’s code
trying to understand how it works
but just a while ago everyone was saying
code is commoditized
AGI is here
implementation doesn’t matter anymore
so which is it?
if code truly doesn’t matter
why does everyone still care so much about it

Dev retweeted

emon

@emonmeena

Apr 1

we're live now!! @loop2ai gives power to people to automate whatever the fuk they want to on WhatsApp. i know meta will sue me for this but it's worth a shot, what say? >>>>> link in bio <<<<<

Dev retweeted

loop2

@loop2ai

Apr 1

"power to people"

Dev

@debuggerdev

Apr 1

building has never been easier
i can go from idea → something working in hours now
but the moment a few real users touch it
everything starts breaking in ways i didn’t expect
edge cases i never thought of
flows that make sense in my head but not theirs
things that work… until they don’t
feels like building got easier
but keeping something alive is where it actually gets hard