@Coldly

Pinned Tweet

@Coldly

@Just_Codly

May 18

x.com/i/article/205600481708…

370

457,361

@Coldly

@Just_Codly

20h

A client messaged a founder in Porto at midnight: full competitive analysis, by morning. He didn't open a single tab. He told his team to handle it and went to sleep.

In the morning he introduces you to that team like they're old coworkers. "This is Spike. He's a badass." Spike makes sure nothing falls through the cracks. Tara watches Reddit and Twitter all day, listening for people complaining about the exact problem his product solves. Sara writes the cold emails and sends them. Ten went out today. June writes the blog posts. He trained June on how he writes - his rhythm, his words. Five articles are queued. Gary logs into the CRM and runs the campaigns. 1,615 contacts in there now. Bob mirrors all the code to GitHub at 3 a.m., every night, while the founder is asleep.

Then he says the part that reframes the whole thing. Spike, Tara, Sara, June, Gary, Bob - none of them are people. They're seven AI agents. They live on his Mac. They run 24/7.

His friend asked to meet the team. There is no team to meet. His accountant asked how many he employs. He pays no salaries. He pays an API bill. A founder he knows asked which one of them is the hard worker. He said all of them. They don't take days off because they were never tired.

Here's the part builders will sit with. He didn't write the dashboard. He didn't code a single agent. He opened Claude Code and typed one thing: build me a mission control for my business, I want to talk to each of these agents on their own. It built the room. It built the crew.

One panel on the dashboard was stalled. The Apify scraper had hit a wall - "top up credits to restore." Even the crew that never sleeps occasionally asks for money.

→ Headcount on the payroll: 0
→ Contacts in the CRM: 1,615
→ Backups run while he slept: every night, 3 a.m.
→ Code he wrote himself: one sentence

The office never emptied at the end of the day. It was one screen, seven names, and a man managing people who were never born.

2:06

@Coldly

@Just_Codly

Jun 2

x.com/i/article/206183408555…

@Coldly

@Just_Codly

Jun 12

THE AGENT USES YOUR PASSWORD WITHOUT EVER SEEING IT. That's the part of Anthropic's talk that stuck. The setup they showed: an agent that runs your work - not as "Claude," but as you. It opens your inbox. It posts in your Slack. It works your backlog. When it needs a login, a vault injects the real secret at the exact moment of the request. The agent uses the key. It never learns the key. It acts as you, holds your access, and stays blind to your password the whole time. Then the line that landed hardest: while it sits idle, it "dreams" - re-reads its past runs, rethinks them, and rewrites its own memory before the next task. It works under your name all day. It studies its own mistakes all night. Their framing for where this is going: the bottleneck isn't intelligence anymore. It's the infrastructure around the model. Which is the whole point of the thread below. A model that can check its own work is half of it. The other half is the system you build so it can act - safely, as you, without ever holding the thing that makes it you.
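One way to picture the "uses the key, never learns the key" setup is a broker that swaps a placeholder for the real secret only as the request leaves. This is a minimal sketch, not Anthropic's design: the `Vault` class, the `{{secret:name}}` placeholder syntax, the `send` helper, and the example URL are all hypothetical names invented for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical placeholder syntax: {{secret:name}}
PLACEHOLDER = re.compile(r"\{\{secret:([a-z0-9_]+)\}\}")


@dataclass(frozen=True)
class Request:
    url: str
    headers: dict


class Vault:
    """Hypothetical in-memory vault; a real one would be a separate service."""

    def __init__(self, secrets):
        self._secrets = dict(secrets)

    def inject(self, request):
        # Replace each placeholder with the stored secret, returning a new request.
        def resolve(match):
            return self._secrets[match.group(1)]

        headers = {k: PLACEHOLDER.sub(resolve, v) for k, v in request.headers.items()}
        return Request(request.url, headers)


def agent_plan():
    # The agent only ever writes the placeholder, never the secret itself.
    return Request(
        "https://example.invalid/api/messages",
        {"Authorization": "Bearer {{secret:slack_token}}"},
    )


def send(request, vault, transport):
    # Secrets are resolved at the last moment; only the response goes back to the agent.
    outgoing = vault.inject(request)
    return transport(outgoing)
```

The design choice in this sketch is that the resolved request never returns to the agent's side: the agent sees its own placeholder and the response, nothing in between.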

13:24

0xMorty

@0xMortyx

Jun 12

x.com/i/article/206533691716…

@Coldly

@Just_Codly

Jun 12

Everyone is racing to build smarter agents. The real breakthrough is agents that know when they’re wrong. Self-correction beats raw intelligence more often than people think.

0xMorty

@0xMortyx

Jun 12

Anthropic Head of Engineering for the Cloud Platform: "Claude Fable 5 is not just another model drop." "most developers are optimizing for the wrong version of Claude." Anthropic literally said it on stage in Tokyo today. This is not a chatbot. This is infrastructure that gets better while you sleep. Worth more than a $500 agent building course. Live from the last Anthropic stage in Japan. Unpublished.

39:31

@Coldly

@Just_Codly

Jun 11

FABLE 5 IS FREE FOR ELEVEN MORE DAYS. The most capable model Anthropic has shipped - a rung above Opus - open to everyone until June 22. And look what we're doing with it. Standing in front of it like this cat. Paws on the screen, asking the room: "Trying it today. Any tips?" Stop asking for tips. While you pick a model, someone opened Fable 5 and said one sentence out loud - and it built him a working app, read his own laptop's data live, and flagged a phishing email he never asked it to check. One person. One morning. Work that used to need a team. That's not eleven days to read about it. It's eleven days to build something with it before the meter turns on. Open it. Give it a real job - not a test question. A whole task you'd normally hand to a freelancer. Let it run while you sleep. The cat will move off the keyboard eventually. Move before it does.

0:15

@Coldly

@Just_Codly

May 18

x.com/i/article/205600481708…

@Coldly

@Just_Codly

Jun 11

Most people open a code editor to build one thing. He opened it and built three before lunch. Marcin runs a hotel-booking business and a builder community. He records his screen while he works. This time he was testing the new Claude model. One prompt: a voice assistant that reads his hardware and his accounts. It built a Jarvis dashboard. Live disk usage. Live memory. Wired into his social accounts through MCP. He talked to it out loud. "How are my socials doing?" 48 posts. 9 platforms. 48 hours. Second prompt: clone the Apple Fitness app. He fed it six screenshots. React Native. Expo. A working mobile app from six pictures. Not a mockup. A build. Third prompt: rebuild his hotel site, pixel for pixel. It rebuilt the layout. Then it rewrote the copy without being asked. Better than the original. A studio would quote $40,000 and six weeks for that. A developer, a designer, a copywriter. He did it before lunch, alone, talking to his laptop. Then he asked it one more thing. Find anything important in my inbox. It read his mail. And it flagged one message on its own. "The TikTok verification email is almost certainly a phishing attempt." He never asked it to check for fraud. The building he expected. He's done it a hundred times. The part where it started making calls he hadn't requested — that was new. That's the line nobody talks about. The agent stopped assisting and started deciding. One person. Three products. One morning. Zero salaries. Six screenshots used to be a brief for a developer. Now they're the developer.

11:20

West Lord

@MyWestLord

Jun 11

x.com/i/article/206507335886…

507

@Coldly

@Just_Codly

Jun 11

The phishing thing keeps bugging me. He asked which email mattered. It didn't just answer - it decided one was a scam and told him so. Nobody wrote a "check for fraud" instruction. It just started doing the job of a person who isn't there.

@Coldly

@Just_Codly

Jun 11

24 HOURS AFTER RELEASE. 3 PROMPTS. 2 MODELS. 1 WINNER. The one labeled 4.8 lost. Same prompt. Same time. Two screens. "Build an Apollo flight console. Real toggles. Real telemetry. The 1969 one." Fable 5 shipped it in one shot. Opus 4.8 returned: failed. 2 issues. "Build an asteroid tracker. Near-Earth Swarm. Live data." Fable 5 shipped it. Opus 4.8 returned: failed. "Build a space weather command center. HELIO. Real-time feed." Fable 5 shipped it. Opus 4.8 returned: failed. 2 issues. Three prompts. Three artifacts. One model that ran. One model that retired. The model that came out yesterday made the model from last month obsolete. Stripe used Fable 5 to compress months of engineering into days. 50 million lines of Ruby. One day. Used to take a team two months. Right now this model is free. Until June 22. That's 11 days. The first freelancers are already selling websites built with it for $3,000 a project. Same day delivery. One person, no team. After June 22 it switches to tokens. The free window is your only window to test it on a real client before paying for it. The race used to be between AI and humans. Now it's between AI and AI. The operator just picked the winner. While the team was still in standup.

0:20

Noisy

@noisyb0y1

Jun 10

x.com/i/article/206461791688…

4,652

@Coldly

@Just_Codly

Jun 11

The interesting question isn’t whether Fable 5 is better today. The question is: how long can any model stay on top when the gap between releases is measured in weeks, not years?

112

@Coldly

@Just_Codly

Jun 11

If you have Pro, Max, Team or Enterprise - you already have it.
1. Open Claude. Switch to Claude 5 Fable.
2. Test three things in the first hour:
- Paste a real client brief. Ask for a full landing page with Next.js and Tailwind. Watch it ship.
- Take a screenshot of any app you like. Ask Fable to rebuild it from the image. It will.
- Give it a problem you couldn't solve last month. Not a demo problem. A real one.
The first time it ships something in one prompt, you'll understand why this window matters. 11 days left.

139

@Coldly

@Just_Codly

Jun 9

An Anthropic engineer ran his own AI agent live on stage in London. The agent had the right numbers in its context. Five seconds later it used the wrong ones. The audience watched it fail in real time. Then he said the sentence that should change every AI project running right now. "This isn't a model problem. It's an issue with the information we're surrounding the model with." For two years founders have been chasing the smartest model. GPT-4o. Claude Opus. Benchmarks. Token limits. Wrong race. The smartest model in the world fails 17% of the time if you give it the wrong context. 17% is not a benchmark. It is a refund request. It is a client who never comes back. It is the AI pilot that quietly died last quarter. The people whose agents actually work stopped tuning models. They started tuning what the model can see. Less prompt. Fewer tools. Cleaner context. Then the agent runs while they sleep. The agent isn't the product. The room you put it in is.

29:59

@Coldly

@Just_Codly

Jun 2

x.com/i/article/206183408555…

1,134

@Coldly

@Just_Codly

Jun 10

A lot of teams are still prompt engineering. The best teams are context engineering. That’s where the reliability gains are now.

@Coldly

@Just_Codly

Jun 9

The talk is on YouTube. Search: "Anthropic context engineering London 2026." The failure happens at 14:32. He pauses. Doesn't try to recover. Says the sentence. Moves on. Three people in the front row stop typing.

131

@Coldly

@Just_Codly

Jun 9

Mac Mini or cloud agent? One sleeps when you sleep. The other doesn't. By 6am his Telegram had three deals. He hadn't written them. He hadn't approved them. He didn't even know about them until coffee. Everyone is buying Mac Minis to own their AI. He rented a server and made $11,000 last month. The Mac Mini owners are answering "where does our data go?" He's answering "how did you close three deals before breakfast?" Different question. Different price. While he slept the agent scanned every Product Hunt launch. Found three founders matching his criteria. Drafted three pitches. Sent them at 4am local time so they landed first. By 6am two had replied. By noon one had signed. He hadn't touched a keyboard yet. A Mac Mini sells privacy to clients who ask about data. A cloud agent sells time to a founder who can't be in two timezones at once. One costs $1,999 once and waits for you. The other costs nothing while you sleep and works without you. Mac Mini: models are free. Your time isn't. Cloud agent: subscription forever. Your time is. Mac Mini or cloud agent? Wrong question. The right question is: when do you want the money to come in?

1:30

@Coldly

@Just_Codly

Jun 6

x.com/i/article/206305075962…

2,767

@Coldly

@Just_Codly

Jun 9

Question. Mac Mini guys say "I own my AI." The cloud guy doesn't own anything. He just owns the three signed contracts. Who actually owns more?

125

@Coldly

@Just_Codly

Jun 8

He left for a two-week vacation in August. Didn't tell the system. When he came back, 340 emails had been answered. 12 meetings had been scheduled. 3 client reports had been filed. All in his voice. One client replied: "Great work this week." He hadn't worked that week. 5 agents. 120 skills. 14 integrations. Mac Mini on a desk in a quiet apartment. Running while he was on a beach somewhere. His wife noticed the laptop was open at 3am. "Did you leave something running?" "Yes." "What?" "Everything." She closed the bedroom door. Claude Opus decides. Claude Sonnet executes. Local memory that never resets. It knows his clients by name. Their deadlines. Their complaints. Their preferred tone. It doesn't ask for clarification. It already knows. He built 6 kill switches before he left. One for each agent. One master. His colleague asked why six. "In case one doesn't work." "In case which one?" He didn't specify. He's used zero of them. The system is still running. Right now. It just scheduled something for Thursday. He doesn't remember asking it to.
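The "one for each agent, one master" arrangement can be sketched as a set of flags that every agent checks before each task. This is only an illustration under assumptions: the post does not say how the switches were built, and the `KillSwitches` class, `run_agent` helper, and agent names used here are hypothetical.

```python
import threading


class KillSwitches:
    """One switch per agent plus one master switch (hypothetical design)."""

    def __init__(self, agent_names):
        self.master = threading.Event()
        self.per_agent = {name: threading.Event() for name in agent_names}

    def stop(self, name):
        # Stop a single agent.
        self.per_agent[name].set()

    def stop_all(self):
        # The master switch stops every agent at once.
        self.master.set()

    def may_run(self, name):
        return not (self.master.is_set() or self.per_agent[name].is_set())

    def count(self):
        return len(self.per_agent) + 1


def run_agent(name, switches, tasks):
    # Check the switches before each task so a stop takes effect between tasks.
    done = []
    for task in tasks:
        if not switches.may_run(name):
            break
        done.append(task)
    return done
```

With five agents this gives six switches, matching the count in the post. The master is a separate flag, not a loop over the others, so it still works if one per-agent switch is miswired.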

0:35

@Coldly

@Just_Codly

Jun 2

x.com/i/article/206183408555…

5,507

@Coldly

@Just_Codly

Jun 7

$14,600. Six months. One box. It never asked for a raise. It never called in sick. It never sent the data anywhere. "hello" That's what the screen said when it arrived. Rainbow Apple logo. 1984 case. $89 from a guy on Etsy. Inside: nine Docker containers. Ollama. Gemma4. Qwen. 486 tokens per second. It didn't know it was about to make money. Three agencies found it. Not the machine. The answer it gave. "Where does our data go?" Nowhere. It never leaves the box. Signed. Signed. Signed. Month one: $450. Month two: almost returned. Month three: first agency. Month six: $14,600 total. The machine didn't change. The question did. It's still running. Right now. While you're reading this. The desk is empty. The box is not.

0:24

@Coldly

@Just_Codly

Jun 6

x.com/i/article/206305075962…

13,957

@Coldly

@Just_Codly

Jun 6

Silicon Valley wrote the Mac Mini GTM pitch in 2016. The character called it "fucking stupid." The audience laughed. 2016: lock the algorithm in a metal box. No internet. Isolated from the world. Joke. 2022: ChatGPT. 1 million users in 5 days. Every company wants API access. 2024: enterprises start asking. Lawyers. Doctors. Founders with unreleased code. 2026: Pentagon signs classified AI contracts with eight companies. NVIDIA. Microsoft. AWS. OpenAI. Google. SpaceX. Oracle. Reflection AI. The requirement: air-gapped infrastructure. No internet. Isolated from the world. Impact Level 6: Secret networks. Impact Level 7: Top Secret/SCI. The metal box wasn't a joke. It was ten years too early. "Fucking stupid" is now a line in a defense contract. The box on your desk works the same way.

1:05

@Coldly

@Just_Codly

Jun 6

x.com/i/article/206305075962…

1,386

@Coldly

@Just_Codly

Jun 6

A Mac Mini M4 Pro on a desk made $8,100 last month. The desk is empty. The machine is not. $1,999 once. No subscriptions. No team. No office. 486 tokens per second. Offline. Free after setup. Ollama - Gemma4 - Qwen. Nine Docker containers. Running while he slept. The first agency found him through a referral. Booked a call. Asked the usual questions about models, speed, pricing. Then stopped. "Where does our data go?" He pointed at the box on his desk. They signed before the call ended. The second agency came two weeks later. Same question. Same answer. Same result. By month three he stopped explaining the technology. He just answered the question. Month one: $450. Month three: $2,100. Month six: $8,100. The Mac Mini processed 847 documents last month. Legal briefs. Client contracts. Internal reports. None of it left the building. The cloud never knew it existed. The machine is still on the desk. Still running. The desk is still empty.

0:28

@Coldly

@Just_Codly

Jun 6

x.com/i/article/206305075962…

8,806

@Coldly

@Just_Codly

Jun 6

x.com/i/article/206305075962…

55,454

@Coldly

@Just_Codly

Jun 6

A developer spent $970 on AI in four days. His colleague spent $420 - and billed $3,800 that month. The difference wasn't skill. It was memory.

Claude Code starts every session fresh. It re-reads your entire codebase from scratch. Every time. Every session. That re-reading costs money. His colleague built a local Hermes agent instead. Hermes stores your project as a skill. Finish a task once - it saves the process, the structure, the workflow. Next session opens with memory already loaded. No re-reading. No reloading. No repeated context.

His colleague's mother called while he was setting it up. She asked what he was doing. "Building memory for a robot." She suggested he go outside.

His boss saw the $970 bill. "Which tool are you using?" "Claude Code." "Is it worth it?" He didn't have a good answer.

The neighbor heard him talking about it at dinner. "So you spent $970 to do what someone else did for $420?" He nodded. "And you got the same result?" He nodded again. The neighbor went back to his burger.

At $970 for four days, Claude Code runs $7,275 a month. Hermes runs $3,150. $4,125 difference. Every month. $49,500 a year.

After 30 skills built, Hermes handles similar tasks 35% faster. Less prompting. Less cleanup. Fewer repeated instructions. The workflow that took 90 minutes on day one takes 55 minutes on day thirty. Not because the model got smarter. Because it stopped starting over.

He used the saved budget to take on two additional client projects. Competitor analysis. Pricing research. Customer complaint mining. Reports clients pay $300-$500 for. With a trained Hermes workflow: 28 minutes per report. That month he billed $3,800 in client reports alone. The Hermes agent cost less than a single report to set up. It paid for itself on day one.

Claude Code spent $550 of his money re-reading files it already read yesterday. Hermes remembered. One tool charges per session. The other charges once and keeps the work. $49,500 a year is not a subscription cost. It's the price of starting over.

The setup - 30 minutes, $0 cost - is in the article below.
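The monthly and yearly figures in the post can be reproduced with a short calculation. This sketch assumes a 30-day month and that the $420 covers the same four days as the $970, which is what the $3,150 figure implies; the function and variable names are illustrative only.

```python
def monthly_projection(spend, days, days_per_month=30):
    """Scale spend over a few days to a month (assumes a 30-day month)."""
    return spend / days * days_per_month


claude_code_monthly = monthly_projection(970, 4)   # 7275.0
hermes_monthly = monthly_projection(420, 4)        # 3150.0
monthly_gap = claude_code_monthly - hermes_monthly  # 4125.0
yearly_gap = monthly_gap * 12                      # 49500.0

# Share of time saved when a 90-minute workflow drops to 55 minutes.
time_saved = (90 - 55) / 90
```

On these numbers the cost figures match the post exactly. The drop from 90 to 55 minutes works out to roughly 39% less time, a little more than the quoted 35%.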

0:49

Gipp 🦅

@gippp69

Jun 6

x.com/i/article/206284294923…

1,170