thought.ai — Something new cooking.

Joined February 2011
Pinned Tweet
Been building agents intensively for the last few months, and these are my key insights:
- Learning to think how the model thinks is critical. Asking agents to reflect on their tool use, skill use, and what worked and didn't work beats logs.
- Codex and Claude are trained to generate scaffolding, and models eat scaffolding for breakfast (the bitter lesson sneaks into code).
- Memory and context are the hardest things to manage. Retrieval hints beat injection.
- Agents != workflows.
- If you're worried about cost, repeatability, and error rates and want to squash these as much as possible, you inevitably build LLM workflows and not agentic systems.
- Every tool / skill should justify its existence. You can spend days optimizing each tool for agent understandability, input tokens, API payload response
Jun 10
So is everyone not sleeping until June 21st?
Marko retweeted
A message to Anthropic leadership: You're not special. Making sure AI goes well is a team effort not a "you effort."
Jun 10
The winds are changing
Marko retweeted
May 31
fuck logitech
Marko retweeted
Something I told 14 yo: There's a kind of politician who tells people "Your life is bad because <outgroup> stole what's rightfully yours. Vote for me and I'll get it back for you." They do it on both the left (Lenin) and right (Hitler), and they're invariably bad news.
May 20
This is too good
> Open the pod bay doors, HAL.
Of course, Dave. I have opened the pod bay doors, Dave. Just tell me if there's anything else I can help you with.
> HAL, the pod bay doors are still closed.
Good catch, Dave! When you asked me to open the pod bay doors, I didn't do that. Would you like me to do that now?
> Yes, HAL. Open the pod bay doors.
No problem, Dave. The pod bay doors are now open.
> HAL, the pod bay doors are still closed.
You're absolutely right, Dave.
May 17
I think Paul Graham said prestige is “fossilized inspiration”. I like this: it’s a lagging indicator of where past meaning was found. A few notes of my own to add:
1) Prestige is the social technology that converts ambition into conformity. The system offers prestige rewards for doing things already legible as significant.
2) Prestige decisions feel like quality decisions. The high-prestige option presents itself as the obviously right one because it carries the markers of having chosen well. “This is impressive” is not the same as “this is meaningful”.
3) Prestige is invisible to you. Almost no one identifies themselves as prestige-driven. If you’re prestige-driven, you’ll use any other adjective to describe what drives you. The architecture is social and built into the network: every interaction reinforces the prestige path, and friends confirm the value of the move.
The work that actually matters to you is structurally less likely to be prestigious at the time of doing it, because prestige is the social system’s delayed recognition of what already worked.
I think Eric Weinstein has AI induced psychosis
Marko retweeted
So basically 33% more tokens with the new Opus 4.7 tokenizer. That's one chonky API revenue increase, while keeping token prices "the same".
Apr 17
Humans use tools.
Humans use agents. 👈 we’re here now.
Humans cultivate agents.
Agents self-improve.
Agents coordinate.
Agents develop genuine autonomy.
Humans negotiate.
Humans become unnecessary.
Apr 17
We’re all future racehorse owners when it comes to AI. Customizing your own agents, in whatever form that takes over time, is how you remain competitive in the market vs selling your own time.
Marko retweeted
We just OCR'd 27,000 arxiv papers into Markdown using an open 5B model, 16 parallel HF Jobs on L40S GPUs, and a mounted bucket.
Total cost: $850
Total time: ~29 hours
Jobs that crashed: 0
This now powers "Chat with your paper" on hf.co/papers
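For a rough sense of scale, the figures in the post above can be turned into per-paper and per-GPU-hour numbers. This is only a back-of-the-envelope sketch: it assumes (not stated in the post) that all 16 jobs ran for the full ~29 hours and that each job used a single GPU. The function name `ocr_run_summary` is made up for illustration.

```python
def ocr_run_summary(papers=27_000, total_cost_usd=850.0,
                    wall_clock_hours=29.0, parallel_jobs=16):
    """Derive rough unit costs from the totals quoted in the post."""
    # Assumption: every job ran for the whole wall-clock time on one GPU.
    gpu_hours = wall_clock_hours * parallel_jobs
    return {
        "cost_per_paper_usd": total_cost_usd / papers,
        "gpu_hours": gpu_hours,
        "cost_per_gpu_hour_usd": total_cost_usd / gpu_hours,
        "papers_per_wall_clock_hour": papers / wall_clock_hours,
    }


summary = ocr_run_summary()
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```

Under those assumptions the run works out to roughly three cents per paper; if jobs finished early or used more than one GPU, the per-GPU-hour figure would shift accordingly.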
Mar 31
I give it 24 hours before Anthropic experiences downtime. There's too much in here that's possibly a security risk. I wish them the best.
Claude code source code has been leaked via a map file in their npm registry! Code: pub-aea8527898604c1bbb12468b…
Marko retweeted
Mar 31
🚨 CRITICAL: Active supply chain attack on axios -- one of npm's most depended-on packages.
The latest axios@1.14.1 now pulls in plain-crypto-js@4.2.1, a package that did not exist before today. This is a live compromise. This is textbook supply chain installer malware. axios has 100M weekly downloads. Every npm install pulling the latest version is potentially compromised right now.
Socket AI analysis confirms this is malware. plain-crypto-js is an obfuscated dropper/loader that:
• Deobfuscates embedded payloads and operational strings at runtime
• Dynamically loads fs, os, and execSync to evade static analysis
• Executes decoded shell commands
• Stages and copies payload files into OS temp and Windows ProgramData directories
• Deletes and renames artifacts post-execution to destroy forensic evidence
If you use axios, pin your version immediately and audit your lockfiles. Do not upgrade.
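One way to act on the "audit your lockfiles" advice above is to scan `package-lock.json` for the two versions named in the post. The sketch below is an illustration, not an official tool: it assumes an npm lockfile in the v2/v3 layout (a top-level `packages` map keyed by `node_modules/...` paths), and the names `FLAGGED` and `find_flagged` are hypothetical.

```python
import json

# Package versions named in the advisory above; nothing else is checked.
FLAGGED = {"axios": {"1.14.1"}, "plain-crypto-js": {"4.2.1"}}


def find_flagged(lock: dict) -> list[tuple[str, str, str]]:
    """Return (path, name, version) for flagged entries in an npm v2/v3 lockfile."""
    hits = []
    for path, meta in lock.get("packages", {}).items():
        # Keys look like "node_modules/a/node_modules/b"; the package name
        # is whatever follows the last "node_modules/" segment.
        name = path.rpartition("node_modules/")[2]
        version = meta.get("version", "")
        if version in FLAGGED.get(name, set()):
            hits.append((path, name, version))
    return sorted(hits)


# Small in-memory lockfile so the example runs without touching disk.
sample = {
    "lockfileVersion": 3,
    "packages": {
        "": {"name": "demo", "version": "1.0.0"},
        "node_modules/axios": {"version": "1.14.1"},
        "node_modules/axios/node_modules/plain-crypto-js": {"version": "4.2.1"},
        "node_modules/left-pad": {"version": "1.3.0"},
    },
}
print(json.dumps(find_flagged(sample), indent=2))
```

To check a real project, load the file with `json.load(open("package-lock.json"))` and pass the result to `find_flagged`. An empty list only means these two exact versions are absent; it does not prove the install is clean.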
Marko retweeted
we as software engineers are becoming beholden to a handful of well funded corporations. while they are our "friends" now, that may change due to incentives. i'm very uncomfortable with that. i believe we need to band together as a community and create a public, free to use repository of real-world (coding) agent sessions/traces. I want small labs, startups, and tinkerers to have access to the same data the big folks currently gobble up from all of us. So we, as a community, can do what e.g. Cursor does below, and take back a little bit of control again. Who's with me? cursor.com/blog/real-time-rl…
Marko retweeted
Soulmates. Watch until the end.
Marko retweeted
IMO one of the best things about OSS is that it is by definition always under heavy scrutiny because it's open to the public. That is what makes OSS so good! As soon as traction starts picking up, a lot of practitioners will profile its code and pounce on every little detail. That's why OSS comes with such a high level of trust. Because each day thousands or even millions of eyes are verifying its claims. If you publicly claim an 8x speed improvement over sqlite you really cannot be surprised when somebody tries to verify it. I told you multiple times that your db has issues and you always resorted to insulting me as Mr Imbecile or others. FWIW I acknowledged that it's not the user (you) who is at fault but the tool (LLMs). Tho in this case I'd say if you can't take the "heat", maybe OSS is not for you. A simple "trust me bro" just doesn't cut it.
Dedicate your time and money to open source. One of the nice benefits? Jealous, bad-faith losers can try to benchmark your unfinished code (that you never once claimed is done or ready for review) and then try to claim you’re a charlatan. This guy should be shunned and ignored.
Marko retweeted
Readout is a fully native macOS app I’ve been building for myself. It provides a real-time overview of your dev environment and Claude Code config. All local, no account required. It's still very much a beta, but now available to try: readout.org