Writing about AI & building agents. Founder @Collabraze_

Joined March 2009
> built the app with Claude
> shipped to App Store
> told X it was live
> got downloads
> opened the dashboard
> nothing
> checked the code
> StoreKit looked fine
> checked App Store Connect
> Paid Apps Agreement not active
> banking missing
> tax forms unfinished
> Small Business Program never applied for
> Apple still taking 30%
> first payout still 45 days after fiscal month close
> external payment copy sitting in one forgotten settings screen
> reviewer would have found it instantly

The app was not broken, the business setup was

before you launch an iOS app, do this:
> set up banking
> complete tax forms
> sign Paid Apps Agreement
> apply for Small Business Program if under $1M
> create real product IDs
> test sandbox purchase
> test restore
> test cancellation
> test subscription grace period
> remove every sloppy external payment reference
> then submit

vibe coding gets you the app
App Store Connect gets you paid
Someone dropped what's claimed to be the Claude Fable 5 system prompt

A shocking amount of Fable's feel seems to live in that markdown file

I found a repo that tries to rebuild the behavior with Opus 4.8

Fusion:
- Opus 4.8 answers the task once
- a second Opus 4.8 answers blind in parallel
- a third Opus reads both
- it pulls the consensus
- flags where they split
- catches what both missed
- writes the final answer from the cross-exam

One run can sound confident and still be wrong
2 blind runs plus a judge makes the failure much harder to hide

How to test it:
- put the claimed Fable prompt in CLAUDE.md
- run the same task through Fusion-Fable
- compare it against stock Opus 4.8
- use it on research, code review, architecture, and decisions where being wrong costs you

Fable 5 got pulled after the export-control order
So I found the closest thing I could test locally: "Claude Fable 5 Lite" 😁

Repos 👇
Did I just unlock claude-fable-5-lite? 😂 Since Fable 5 got pulled (US export control order, Anthropic is contesting it), I wanted to see how much of its character lives in the system prompt vs. the model itself. I ran the leaked Fable 5 prompt on Opus 4.8 head-to-head against stock 4.8.
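The "two blind runs plus a judge" pattern described in the post above can be sketched in a few lines. This is a minimal illustration, not the linked repo's code: `fuse`, `answerer`, and `judge` are hypothetical names, and each model is assumed to be any function that takes a prompt string and returns text.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical interface: any callable that maps a prompt to a text answer.
Model = Callable[[str], str]


def fuse(task: str, answerer: Model, judge: Model) -> str:
    """Collect two blind answers in parallel, then let a judge cross-examine them."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Both runs receive only the task, never each other's output.
        first, second = pool.map(answerer, [task, task])
    judge_prompt = (
        f"Task:\n{task}\n\n"
        f"Answer A:\n{first}\n\n"
        f"Answer B:\n{second}\n\n"
        "State the consensus, flag where A and B split, note what both "
        "missed, then write the final answer."
    )
    return judge(judge_prompt)
```

To try it, pass real model clients wrapped as plain functions; the judge prompt wording here is an assumption and is worth tuning per task.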
Fable 5 system prompt moved Opus 4.8 outputs enough to make the landing page feel different

Worth copying for your own agents

Run it like this:
- stock model in pane 1
- same model + system prompt file in pane 2
- same task in both panes
- 5 task types: design, debugging, refactor, research, planning
- 3 runs per task
- no permission bypass outside a disposable repo
Andrej Karpathy on LLM culture: "in the simplest case, it would be a giant scratch pad that the LLM can edit"

The Familiar article below is that idea built with Kimi Work and markdown

The setup worth stealing:
- markdown vault on disk
- AGENTS.md for rules every agent reads first
- /resources as read-only source material
- /inbox for raw captures
- /notes/wiki for agent-written pages
- YAML fields for confidence, sources, last_updated_by
- scheduled Kimi skills for morning capture, inbox processing, weekly review
- MCP so Claude Code, Cursor, and Kimi can query the same vault

Watch Karpathy for why agents need editable memory
Read the article for the exact Familiar build
Anthropic received a US government directive: "we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance"

Fable launched Jun 9
The shutdown statement came Jun 12

If your product relies on frontier models, write this down for every AI workflow:
Primary model
Fallback model
Quality floor
Tasks that should pause
Data retention rule
User access rule by country
Customer message if access drops
Eval that proves the fallback is good enough

Access to other Claude models is still up

But the lesson is clear: a model can disappear for legal reasons faster than your team can rewrite the workflow
The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Claude models is not affected. We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible. Read our full statement: anthropic.com/news/fable-myt…
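The per-workflow checklist in the post above (primary model, fallback model, quality floor, and so on) can be kept as a small record with one decision function. This is a sketch under assumptions: the field names, the `PAUSE` sentinel, and the rule order in `resolve` are illustrative choices, not something the post specifies.

```python
from dataclasses import dataclass, field

PAUSE = "PAUSE"  # hypothetical sentinel meaning "do not run this task now"


@dataclass
class WorkflowPolicy:
    primary_model: str
    fallback_model: str
    quality_floor: float            # minimum eval score the fallback must reach
    fallback_eval_score: float      # latest score from the eval that tests the fallback
    pause_tasks: set[str] = field(default_factory=set)
    data_retention_rule: str = ""
    access_rule_by_country: str = ""
    customer_message: str = ""      # what users see if access drops


def resolve(policy: WorkflowPolicy, task: str, available: set[str]) -> str:
    """Pick the model for a task, or PAUSE when no safe option exists."""
    if policy.primary_model in available:
        return policy.primary_model
    if task in policy.pause_tasks:
        return PAUSE
    fallback_ok = (
        policy.fallback_model in available
        and policy.fallback_eval_score >= policy.quality_floor
    )
    return policy.fallback_model if fallback_ok else PAUSE
```

Keeping the eval score next to the quality floor means the fallback is only used while the eval that justifies it still passes.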
Palo Alto Networks CEO: "in 6 weeks we found what would have taken us 5 to 7 years"

That speed can get expensive fast

If every Claude Code/Cursor loop runs with full repo context + a premium model, the bill gets absurd

The cost stack I use to cut the bill:
Planning: Opus / GPT-5.5
Serious coding: Kimi 2.6
Tiny fixes: Haiku
Boilerplate: local model
Stable repo rules: cached prefix
Repeated workflow: SKILL.md
Long session: 10-15 turn summary
File search: rg before @codebase

The article below has the router, the 5 token leaks, the benchmark table, and the 30-day rollout
Read it before your next 50-turn coding session
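The first four rows of that cost stack amount to a lookup from task type to model tier. A minimal sketch of such a router follows; it is not the router from the linked article, the task-type keys are made-up names, and the values are the post's tier labels rather than real API model IDs.

```python
# Tier labels copied from the post; they are labels, not API model identifiers.
ROUTES = {
    "planning": "Opus / GPT-5.5",
    "serious_coding": "Kimi 2.6",
    "tiny_fix": "Haiku",
    "boilerplate": "local model",
}


def route(task_type: str) -> str:
    """Return the tier for a task type; unknown types fail loudly."""
    try:
        return ROUTES[task_type]
    except KeyError:
        raise ValueError(f"no route for task type: {task_type!r}") from None
```

Failing on unknown task types is a deliberate choice here: a silent default would quietly send work to the most expensive tier or the weakest one.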
NVIDIA CEO Jensen Huang: "the question everyone should ask - how can I use AI to do my job better?"

For Claude, the answer is levels

Level 1: Chat answers questions
Level 2: Cowork runs recurring tasks
Level 3: Claude Code builds tools, dashboards, and apps

Then give Claude a working memory:
Instructions.md = how it should act
Memory.md = preferences, corrections, decisions
Context.md = the project it should understand

That is how you move from one-off prompts to a system that gets easier to use every week

The full Claude guide in the article below:
Ex-Google Officer: "Have them make you smarter"

Claude Fable 5 makes that practical

The first prompt you need to run:
"Read CLAUDE.md, skills, memory files, routing rules, subagent instructions, and eval prompts
Return contradictions, stale rules, rules written for weaker models, examples that violate their own rules, and lines I should delete"

Full Fable 5 guide in the article below: 4 days, 7 experiments, 1,000 timed runs
Google DeepMind CEO: "almost every drug developed from now on will have probably used AlphaFold"

That's the better AI lesson from this interview

AlphaFold-style products solve one painful bottleneck with a measurable answer

Build one like this:
- pick a workflow where experts wait days for one output
- define the input they already have: PDF, table, image, ticket, call
- define the output they trust: score, draft, ranked list, red flags
- collect 30-100 solved examples
- compare model output against expert output
- ship it as a reviewer first
- log every miss
- expand after the pass rate holds for 2 weeks

Good targets:
> contract clause review
> insurance claim triage
> research screening
> vendor risk scoring
> sales call QA

The money starts where a hard answer is slow, repeatable, and easy to check
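The "compare model output against expert output" and "log every miss" steps reduce to a pass rate plus a list of failing indices. A minimal sketch, assuming outputs can be compared by exact equality (real workflows may need a fuzzier match); `pass_rate` is a hypothetical helper name.

```python
def pass_rate(model_outputs: list[str], expert_outputs: list[str]) -> tuple[float, list[int]]:
    """Return (share of examples where the model matches the expert, indices of misses)."""
    if len(model_outputs) != len(expert_outputs):
        raise ValueError("model and expert outputs must have the same length")
    if not expert_outputs:
        raise ValueError("need at least one solved example")
    # A miss is any example where the model disagrees with the expert label.
    misses = [
        i for i, (model, expert) in enumerate(zip(model_outputs, expert_outputs))
        if model != expert
    ]
    return 1 - len(misses) / len(expert_outputs), misses
```

The returned miss indices are what you would write to the miss log and review before expanding the rollout.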
Apr 11
Demis Hassabis says AI won’t just accelerate drug discovery. It will replace the process entirely. The pharmaceutical industry finds drugs the same way it has for decades. Synthesize a compound. Test it on animals. Test it on humans. Wait years for approval. Hope the molecule doesn’t kill someone along the way. Every step is physical. Every step is slow. Every step is expensive enough to make most diseases not worth curing. Hassabis: “We’re focusing on solving the rest of the drug discovery process, which is a lot of chemistry, designing the compounds, checking it’s not toxic, and all the different properties you need for drugs to be safe.” That sounds incremental. It isn’t. AlphaFold solved protein folding. Isomorphic Labs is now working through the rest of the chain. Compound design. Toxicity screening. Safety profiling. All computational. None of it requires a lab. Hassabis: “I think we’ll have that whole drug design engine ready in the next five to 10 years.” Not a tool that assists chemists. A system that replaces the chemistry. But designing the drug was never the bottleneck that killed people. Clinical trials were. A single drug takes over a decade to move from lab to patient. Most of that time isn’t science. It’s bureaucracy, logistics, and the blunt reality of testing molecules on living tissue one dose at a time. Hassabis: “Simulating parts of the human metabolism, also stratifying patients to make sure that certain patients get exactly the right type of drug that’s suitable for their genomic makeup.” Simulate the patient before you treat the patient. Map individual DNA. Model personal metabolism. Test the drug on a digital replica before it touches a vein. Not personalized medicine as a marketing phrase. Personalized medicine as an engineering output. The final wall is regulatory. The FDA exists because humans make mistakes with molecules. Every approval gate was built to catch errors that cost lives. The entire structure assumes the process is fallible. 
What happens when the process stops being fallible. Hassabis: “Perhaps like the animal testing is not needed anymore, maybe we can go up the dosage ladder quicker, because you can rely on these models.” He’s not speculating. He’s describing a sequence. AI-designed drugs enter the existing pipeline. A dozen compounds go through full traditional trials. Regulators collect data. They back-test model predictions against real outcomes. Hassabis: “Then the government and the regulatory bodies see that and they have enough data to sort of back-test the predictions of those models.” When the models prove more accurate than the trials they’re meant to replace, the trials become the bottleneck. Not the science. The paperwork. Animal testing shortened. Dosage ladders compressed. Entire stages of the pipeline collapsed into computation. The drug doesn’t get discovered faster. The drug gets discovered differently. The laboratory moves from a building to a server. The clinical trial moves from a hospital ward to a simulation. The patient moves from a statistic to a genome. Hassabis isn’t promising a cure for one disease. He’s describing the architecture that makes curing disease an engineering problem with a known solution path. The bottleneck was never biology. It was the speed at which humans were allowed to solve it. That speed limit is about to be revoked.
Anthropic co-founder: "we don't let teachers or students put a question into Claude and then just get the answer."

That is a clean rule for using AI without getting dumber

Set Claude to question-first mode:
- ask for my attempt before helping
- ask 3 diagnostic questions
- give hints before the solution
- explain the smallest missing concept
- test me with a fresh example
- save my misses into a review file

Prompt:
> You are my tutor
> Start by asking what I already tried
> Ask 3 questions before you explain
> If I get stuck, give a hint first
> End with a short test on the same idea

If Claude gives you finished answers too early, you leave with the answer but not the skill

Make it train the part of you that has to work when the chat is closed
Claude Code creator Boris Cherny: "the alpha is product taste. And I think this is also going to go away."

He already has a couple hundred agents reading X feedback, GitHub issues, and Slack to figure out what to build next

So your taste has to become files

Create idea-rubric.md:
- 5 product ideas you would ship
- 5 product ideas you would kill
- why each one passed or failed
- the user pain that matters most
- the proof required before building
- the risks that make an idea too costly
- the 7-day metric that decides if it lives

Then run the loop:
> collect raw feedback
> generate 20 ideas
> score each against the rubric
> attack the top 3
> turn 1 into a tiny spec
> archive the rejects with reasons

The builder who wins won't have better taste in their head
They'll have taste the agent can read
Quant AI's $100K long-term ambassador reward pool is live on @tryquantio

Posts/videos/explainers work

Rewards:
- $QUANT USDT pool
- 20% lifetime referral fees, daily in USDT
- Quant AI Points for airdrop

It's worth joining for Trading/AI creators: whitelist.tryquant.io?starta… #QuantAIPioneers
> Opened Obsidian
> Installed 15 plugins
> Saved 200 notes

Still couldn't find the one idea I needed 3 weeks later

The fix is to build the retrieval system before you build the archive:
- one folder for every note type
- one filename pattern
- one daily capture note
- one CLAUDE.md that explains the vault
- one evening pass that turns captures into permanent notes

Folder tree:
VAULT/
- 00-INBOX/
- 01-PERMANENT/
- 02-LITERATURE/
- 03-MAPS/
- 04-PROJECTS/
- 05-DAILY/
- 06-RESOURCES/
- 07-OUTPUTS/
- 08-SYNTHESIS/
- 09-ARCHIVE/
- 10-SYSTEM/

Filename rule:
YYYY-MM-DD-[TYPE]-[TOPIC].md

Examples:
- 2026-06-11-daily-thursday.md
- 2026-06-10-perm-memory-debt.md
- 2026-06-09-lit-book-title.md
- 2026-06-08-dec-switch-to-obsidian.md

Daily flow:
> capture everything in 05-DAILY
> process the inbox at night
> turn good ideas into 01-PERMANENT notes
> give every permanent note 2 links
> update maps when a topic passes 15 notes

Add Claude after the vault has shape. Connect it through Filesystem MCP and give it 5 jobs:
- 6AM morning brief
- 8PM capture processor
- 11PM connection finder
- Sunday pattern report
- monthly synthesis across topics with 10 notes

The permanent note rule:
Close the source. Write the idea from memory. Then link it.

If you can’t explain it from memory, don’t file it as a permanent note yet
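The filename rule `YYYY-MM-DD-[TYPE]-[TOPIC].md` is easy to enforce with a small parser. The sketch below assumes, based only on the four examples in the post, that the type is a single lowercase word and the topic is lowercase words joined by hyphens; `parse_note_name` is a hypothetical helper.

```python
import re
from datetime import date

# Assumed shape, inferred from the post's examples: date, one-word type, hyphenated topic.
NOTE_NAME = re.compile(
    r"^(\d{4})-(\d{2})-(\d{2})-([a-z]+)-([a-z0-9]+(?:-[a-z0-9]+)*)\.md$"
)


def parse_note_name(name: str):
    """Return (date, type, topic) for a valid note filename, else None."""
    match = NOTE_NAME.match(name)
    if match is None:
        return None
    year, month, day, note_type, topic = match.groups()
    try:
        when = date(int(year), int(month), int(day))
    except ValueError:
        return None  # well-formed digits but not a real calendar date
    return when, note_type, topic
```

Running this over the vault during the evening pass would surface any capture that was saved under a name the retrieval system cannot sort.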
A new Claude model is a stress test for your second brain

If it knows your goals, customers, transcripts, tasks, calendar, revenue, and rules, it feels magic

If it can't find the right file, it burns tokens pretending

Build your AI OS with 4 Cs:

1. Context
- who you are
- what the business does
- where the rules live
- where projects, notes, wikis, and transcripts live
- one router file: CLAUDE.md, AGENTS.md, or both

2. Connections
- live data: email, calendar, Slack, Stripe, QuickBooks, ClickUp
- scoped API keys only
- read-only first
- prompts are never a permission layer

3. Capabilities
- skills for repeatable work
- agents for separate phases
- scripts for tests, screenshots, exports, reports
- every reused prompt becomes a file

4. Cadence
- manual: you ask
- event: email arrives, customer books, ticket opens
- schedule: Monday report, Sunday review, weekly cleanup

Folder shape:
.ai-os/
- 00_router.md
- 01_hot_cache.md
- 02_master_index.md
- people/
- business/
- projects/
- meetings/
- transcripts/
- skills/
- scripts/
- logs/

The gut check:
- can you find the file manually?
- can the agent find it fast?
- can it cite the source?
- can it act with read-only access?
- can you replace the model tomorrow?

If yes, you have a second brain.
When skills start running on triggers, you have an operating system.

Keep the keys scoped.
Keep the files plain.
Make the model replaceable.
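The folder shape above can be created in one pass. This is a small sketch using only the standard library; `scaffold` is a hypothetical helper, and it creates empty files and directories without writing any content into them.

```python
from pathlib import Path

# Names taken from the folder shape in the post.
FILES = ["00_router.md", "01_hot_cache.md", "02_master_index.md"]
DIRS = [
    "people", "business", "projects", "meetings",
    "transcripts", "skills", "scripts", "logs",
]


def scaffold(root: Path) -> Path:
    """Create the .ai-os layout under root and return its path. Safe to re-run."""
    base = root / ".ai-os"
    base.mkdir(parents=True, exist_ok=True)
    for name in DIRS:
        (base / name).mkdir(exist_ok=True)
    for name in FILES:
        (base / name).touch(exist_ok=True)  # leaves existing files untouched
    return base
```

Usage would be `scaffold(Path.home() / "work")` or any other root; because every call is idempotent, it can also run as part of a weekly cleanup without clobbering notes.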
Boris Cherny: "My job is to write loops."

5-day version you can copy for Claude Code

Day 1: repo memory
- write CLAUDE.md
- include stack, test commands, code style, release rules, gotchas
- add .claude/settings.json with the shell commands Claude can run

Day 2: verification skill
- pick 1 flow Claude keeps breaking
- put the browser/API test in .claude/skills/<flow>/scripts/
- make it return: pass/fail, failing step, screenshot/log path

Day 3: commands
Add these files:

.claude/commands/babysit.md
- checks your PRs
- reads CI
- handles obvious review nits
- surfaces design questions

.claude/commands/triage-issues.md
- labels new issues
- dedupes against existing ones
- assigns owners

.claude/commands/deploy-watch.md
- checks the live app
- reports regressions
- avoids touching production

Day 4: loops
> /loop 5m /babysit
> /loop 15m /triage-issues
> /loop 5m /deploy-watch

Day 5: overnight work
- schedule /morning-report
- schedule /deep-audit
- write results to .claude/inbox/
- let your morning loop read from that folder

The rule: every code-writing loop gets a separate verifier.

Builder makes the change.
Verifier runs the real app.
You read the diff.

Skip that and you wake up to 14 broken PRs with very confident summaries.

Video: "Reflecting on a year of Claude Code" with Boris Cherny & Cat Wu
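The Day 2 verification script is asked to return three things: pass/fail, the failing step, and a screenshot or log path. One way to shape that result is sketched below. `run_flow` and the dictionary keys are hypothetical, and the steps are plain Python callables standing in for real browser or API checks.

```python
from typing import Callable

Step = tuple[str, Callable[[], bool]]  # (step name, check that returns True on success)


def run_flow(steps: list[Step], log_path: str) -> dict:
    """Run checks in order and stop at the first failure."""
    for name, check in steps:
        try:
            ok = check()
        except Exception:
            ok = False  # a crashing check counts as a failed step
        if not ok:
            return {"status": "fail", "failing_step": name, "log_path": log_path}
    return {"status": "pass", "failing_step": None, "log_path": log_path}
```

Returning the same three keys on both pass and fail keeps the output easy for an agent loop to parse without guessing.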
Your coding agent is wasting tokens alone

Give it a loop that catches real work

Loop engineering is the move from prompting a coding agent one turn at a time to building the system that prompts it for you

The loop needs 6 pieces:
- automation: finds work on a schedule
- worktrees: gives each agent its own checkout
- skills: stores project rules between runs
- connectors: reads GitHub, Linear, Slack, CI
- sub-agents: separates builder from reviewer
- memory: keeps yesterday's work outside the chat

One morning run could look like this:
> read failed CI + open issues
> write findings to LOOP.md
> open 1 worktree per real fix
> send builder agent
> send reviewer agent
> run tests
> open PR
> leave anything uncertain in triage

The danger is simple: a bad loop can burn tokens while making bad choices confidently

So give it hard stop conditions:
- exact test file passes
- lint clean
- ticket linked
- reviewer lists zero blockers
- human reads the diff before merge

Build the loop
Stay the engineer
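The five hard stop conditions can be written as a single gate that names whatever is still unmet. A minimal sketch, assuming the loop records each condition somewhere; `RunState`, `unmet_conditions`, and `may_merge` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class RunState:
    target_test_passed: bool
    lint_clean: bool
    ticket_linked: bool
    reviewer_blockers: int
    human_read_diff: bool


def unmet_conditions(state: RunState) -> list[str]:
    """List every stop condition from the post that is not yet satisfied."""
    checks = [
        ("exact test file passes", state.target_test_passed),
        ("lint clean", state.lint_clean),
        ("ticket linked", state.ticket_linked),
        ("reviewer lists zero blockers", state.reviewer_blockers == 0),
        ("human reads the diff before merge", state.human_read_diff),
    ]
    return [label for label, ok in checks if not ok]


def may_merge(state: RunState) -> bool:
    """Allow a merge only when nothing is unmet."""
    return not unmet_conditions(state)
```

Reporting the unmet labels, rather than a bare boolean, gives the triage step something concrete to write down.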
10 AI papers from May 31-Jun 7 point at one builder checklist: harness > model swap

Read the numbers:
> LEAP: below 10% -> 70% on Lean-IMO-Bench
> EFC: feedback-quality intervention 0.27 -> 0.90 at fixed cost
> Self-revising systems: 388 proposals -> 25 accepted revisions
> AutoLab: 36 long-horizon tasks
> Harness-1: 20B search agent, 0.730 curated recall
> Do More Agents Help?: 5 of 6 tested multi-agent setups trail the single-agent anchor

Builder action:
> add a verifier before another agent
> measure useful feedback that survives into later steps
> externalize context and search bookkeeping
> test the single-agent baseline
> scale agents only when the trace proves coordination helps
Seedance 2.0 gives you up to 15 seconds per shot

Use the first 2 prompts before video

Beginner AI video workflow:
> character lock
> storyboard
> video prompt

Character lock sheet should preserve:
> exact face
> silhouette
> hairstyle
> eye design
> facial structure
> color palette
> outfit logic
> props only after you choose them

Storyboard fields:
> concept
> visual style
> main character
> environment
> emotion
> length
> dialogue or none
> platform

Generation step:
> upload storyboard image
> paste video prompt
> run Seedance 2.0 on Higgsfield
> cross-check with Grok Imagine or Veo
> keep the clip that preserves face, motion, and mood

Start here:
`character image -> character lock -> storyboard image -> video prompt -> final clip`

Video: "How to Start Making AI Videos in 2026 - Full Course" by Youri van Hofwegen
Pixel Bishkek won a 2-hour Cursor Hackathon after compressing product direction into the first minutes

gstack's `/office-hours` is built for that exact moment

The command forces you to:
> name the user
> name the narrowest useful demo
> list the assumption that can kill it
> cut one feature
> generate 3 alternative approaches
> pick the first build step

For a 2-hour sprint:
> 0-5 min: `/office-hours`
> 5-10 min: choose one user, one outcome
> 10-20 min: `/plan-ceo-review`
> 20-90 min: build the smallest playable path
> 90-110 min: `/qa` in a real browser
> 110-120 min: polish the pitch

Run `/office-hours` before the first implementation prompt
Claude Cowork changes the AI learning curve from prompt craft to file craft
> a local folder Claude can read
> standing instructions it loads every session
> plugins that package role-specific work
> connectors for Slack, Drive, Notion, Gmail

Cowork works on your actual files:
> drafts the doc
> builds the sheet
> organizes the folder
> asks clarifying questions when the task is underspecified

Start with:
> one context folder
> one role plugin
> one connector
> one 20-minute task you already know how to judge