6 months ago i started treating claude code like a real coworker, not a chatbot.
now it drafts my prs, runs my repros, writes my slack messages.
i built a system around it: rules, hooks, slash commands, evals.
here's what's inside.
every tool call returns text the model treats as gospel. it's not. it's data from an external system. same threat model as user input. normalize at the boundary or eat the drift downstream.
people keep asking how to give the agent memory. it doesn't have any and it never will. every turn it wakes up blank and reads whatever you put in front of it. the whole job is deciding what gets re-injected. memory isn't a feature you bolt on, it's a file you choose to read back in
my CLAUDE.md doesn't grow every time the agent screws up. one-off mistakes stay in a log. a mistake that repeats becomes a rule. a mistake a script can catch becomes a hook. most configs rot because every correction gets promoted straight to a rule
opus 4.8 dropped and the people asking "is it worth switching" are optimizing the wrong variable. the model stopped being my bottleneck months ago. the hooks, the rules, the session tracking, the feedback loop do the work now. a better model on a sloppy harness is just a faster way to make a mess
every model upgrade reminds me my job changed years ago. I'm not writing code, I'm writing specs for an agent that writes code. The upgrade doesn't change my workflow, it just makes my bad specs more obvious
talking to an AI isn't talking to a human. Different surface, different protocol. Polite for one is mid for the other. Rude for one is signal for the other. Stop projecting social rules onto something that runs on tokens, not relationships
Anthropic just shipped a security plugin. Everyone is talking about the vulns it catches. The actual headline: deterministic file-based rules beat advisory prompt instructions. Now official. Pattern is reusable for anything you want enforced, not just security
i can't help my AI when i'm angry at it
minute 3 the run is wrong but you're locked watching 17 more burn. By minute 12 i'm writing with sarcasm. ALL CAPS. "obviously". "again". As if the model would feel bad
it doesn't. My angry prompt is just mechanically worse, so the spiral compounds
building a hook that catches my own anger markers before i send. Fix is on my side, not its
what's your move when you spiral?
rules that apply everywhere live in ~/.claude/rules. rules that apply only to one client live in <workspace>/.claude/rules. when they conflict, local wins. zero duplication, zero drift across projects.
spent an hour debugging why my agent kept losing its login. root cause: two of my own scripts racing each other. a cleanup job was deleting the browser profile a split second before the agent booted it. the call was coming from inside the house.
Wrote a PreToolUse hook that rejects any file write containing an em-dash. before it touches disk.
20 lines of python. a hard gate, not a polite request in a system prompt.
Prompts are suggestions. hooks are laws.
asked claude to audit its own work setup. instead of agreeing with everything, it told me i had 1657 lines of session log nobody reads, 24 stale tasks, and no evals.
the value isn't that the AI is smart. it's that the system is designed to push back.
6 months ago i started treating claude code like a real coworker, not a chatbot.
now it drafts my prs, runs my repros, writes my slack messages.
i built a system around it: rules, hooks, slash commands, evals.
here's what's inside.
3/ every time i correct claude, i run /violated <description>.
it logs the correction with timestamp canonical case.
after 30 days i review the log. if a correction repeats, it gets promoted to a written rule. or a hook if it's mechanically detectable.
compounding.
4/ none of this is theoretical. it's what i use every day, in production, at a real job.
i'm cleaning up the public version of the repo. dropping it soon.
if you're building something similar, follow drop a reply. happy to compare notes.