Building local AI coding agents | Exploring tech, growth & experiments | Curious tinkerer sharing what works (and fails)

Joined August 2011
Spent ~2 weeks building a harness to make a local LLM (Qwen) write working code: constrained JSON edits, no shell access, every step verified by the harness. It generated a "backend API," passed every gate, and got certified at a maturity level on my own benchmark.

Then I read the output. 43 lines of code. A 3-line README.

So I dug in. The prompt that asks the model to write each file includes an "example valid transaction"… which was the entire target file, fully written. The model wasn't generating anything. It was copying the answer my own harness handed it. Output matched my fixtures line-for-line.

I set out to prove a local model could write real software and accidentally built a very elaborate photocopier. Then the benchmark certified the copy as "Level 3." 😂

Lesson, burned in: if your few-shot example contains the answer, your eval isn't measuring generation. It's measuring xerox fidelity. And it will happily report progress.

Back to it. This time the model has to solve, not copy.
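One way to catch this kind of leak before a run is to measure how much of the expected file already appears verbatim in the prompt. The sketch below is only an illustration under stated assumptions: `leaked_fraction`, `assert_no_answer_leak`, and the 0.5 threshold are hypothetical names and values, not part of the harness described in the post.

```python
def leaked_fraction(prompt: str, expected: str) -> float:
    """Return the fraction of non-blank expected lines found verbatim in the prompt."""
    prompt_lines = {line.strip() for line in prompt.splitlines() if line.strip()}
    target_lines = [line.strip() for line in expected.splitlines() if line.strip()]
    if not target_lines:
        return 0.0
    hits = sum(1 for line in target_lines if line in prompt_lines)
    return hits / len(target_lines)


def assert_no_answer_leak(prompt: str, expected: str, max_overlap: float = 0.5) -> None:
    """Raise if the prompt already contains most of the expected answer.

    The threshold is an assumed default; tune it for your own fixtures.
    """
    overlap = leaked_fraction(prompt, expected)
    if overlap > max_overlap:
        raise ValueError(
            f"prompt contains {overlap:.0%} of the expected file; "
            "the eval would measure copying, not generation"
        )


if __name__ == "__main__":
    target = "def add(a, b):\n    return a + b\n"
    leaky_prompt = "Write the file. Example valid transaction:\n" + target
    try:
        assert_no_answer_leak(leaky_prompt, target)
    except ValueError as exc:
        print("leak detected:", exc)
```

Exact line matching is deliberately crude: short generic lines (a lone `}` or `pass`) can inflate the score, so a real check might ignore lines below a minimum length.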
Each technological jump brought its fair share of detractors... And every single time they were hilariously wrong 😂
Just started using Claude. Is there a reason why Sonnet has its own weekly limit? 🤔
McDonald’s, we need to talk 😂 Ordered a simple medium coffee with 1 cream. Ended up with what looks like a latte that fell into a cream vat. How do you mess this up every single time? Send help (and less cream) 🥛☕️
Now that's a new perspective on Tetris. 😂
Jun 14
The last game built by Claude Fable 5.
I miss mining asteroids in Eve Online. Seems like the perfect game for vibe coders. Anyone gaming while cooking code? 🤔
IntentForge is exploring a different path for local coding agents: not “let the model drive a terminal,” but “let the model propose structured intent and let a deterministic harness turn that intent into verified software.” Early results show this can reach non-trivial Level 5 Python app profiles without giving the local model raw shell or file authority.
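The "structured intent, deterministic harness" split can be sketched roughly as follows. This is not IntentForge's actual format or API, which the post does not show: the `write_file` op, the intent fields, and every function name here are assumptions made for illustration. The model only emits JSON; the harness decides whether to accept it, applies it to an in-memory workspace, and checks the result.

```python
import json

ALLOWED_OPS = {"write_file"}  # assumed op set, for illustration only


def validate_intent(raw: str, allowed_paths: set[str]) -> dict:
    """Parse model output and reject anything outside the declared target files."""
    intent = json.loads(raw)
    if not isinstance(intent, dict):
        raise ValueError("intent must be a JSON object")
    if intent.get("op") not in ALLOWED_OPS:
        raise ValueError(f"op not allowed: {intent.get('op')!r}")
    path = intent.get("path")
    if not isinstance(path, str) or path not in allowed_paths:
        raise ValueError(f"path not allowed: {path!r}")
    if not isinstance(intent.get("content"), str):
        raise ValueError("content must be a string")
    return intent


def apply_intent(workspace: dict[str, str], intent: dict) -> dict[str, str]:
    """Return a new workspace with the edit applied; the model never touches disk."""
    updated = dict(workspace)
    updated[intent["path"]] = intent["content"]
    return updated


def verify_python(workspace: dict[str, str]) -> list[str]:
    """Compile each .py file (without executing it) and collect syntax errors."""
    errors = []
    for path, source in workspace.items():
        if path.endswith(".py"):
            try:
                compile(source, path, "exec")
            except SyntaxError as exc:
                errors.append(f"{path}: {exc.msg}")
    return errors


if __name__ == "__main__":
    raw = json.dumps({"op": "write_file", "path": "app/main.py", "content": "x = 1\n"})
    intent = validate_intent(raw, allowed_paths={"app/main.py"})
    workspace = apply_intent({}, intent)
    print(verify_python(workspace))
```

The point of the design is that each step is a plain function with no model in the loop, so the same intent always produces the same workspace and the same verdict.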
Yesterday’s IntentForge lesson: GPT was in full corporate mode, clutching its safety blanket like a responsible adult who’s read too many HR manuals. “Level 2, sir. Maybe 2.1 if we do a full architectural review, three stakeholder syncs, and a tiny emotional support document for the tokens.” I looked it dead in the prompt and said: “Nah. We’re jumping from the tree. Straight to Level 5, buddy.” And somehow… it freaking soared. Not because we removed the guardrails—no, the Patch VM was still locked tighter than a paranoid parent’s WiFi. Same tests, same contracts, same quality gates, same deterministic “show me the receipts” energy. We just stopped babying it. Turns out the foundation was already jacked. We were the ones acting like it still needed training wheels and a helmet. Moral of the story: Sometimes your LLM doesn’t need more planning docs. Sometimes it just needs a firm shove off the branch and a loud “YOU’VE GOT WINGS, YOU DRAMA CODEX!” It worked. 10/10 would yeet again. 🚀
IntentForge has reached a new milestone: it can now generate and validate more complex Python application shapes under deterministic Patch VM control, including Level 5 ACB profiles.

The key achievement is that we pushed the system beyond cautious incremental growth. Instead of only moving from simple apps to slightly less simple apps, we tested whether the existing foundation could support a much richer target. It could.

The current harness can now work with multi-artifact Python applications that include application modules, CLI/API-style interfaces, SQLite-backed persistence, JSON/CSV fixtures, contracts, tests, documentation, quality gates, generated-code inspection, and public-safe evidence. It can also expose live run events and file diffs so a future TUI or API consumer can show what is changing while the harness is working.

The biggest lesson is architectural: the local model does not need broader authority to create more complex software. It needs sharper targets. IntentForge keeps the model constrained to structured coding intent while the harness owns validation, application, verification, scoring, and evidence.

We also learned that the system was more capable than our development pace assumed. Jumping from lower complexity targets to Level 5 showed that the deterministic foundation was already strong enough to support richer app profiles, as long as the target included exact examples, clear file roles, contracts, tests, and quality gates.
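The "live run events and file diffs" idea could look something like the sketch below, which packages a unified diff into an event a TUI or API consumer might render. The event shape and the name `file_diff_event` are assumptions for illustration; the post does not describe the real event schema. Only the standard-library `difflib.unified_diff` call is relied on.

```python
import difflib


def file_diff_event(path: str, before: str, after: str) -> dict:
    """Build a hypothetical run event carrying a unified diff for one file."""
    diff = list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile=f"a/{path}",
            tofile=f"b/{path}",
            lineterm="",  # keep lines free of trailing newlines for display
        )
    )
    return {"type": "file_diff", "path": path, "diff": diff}


if __name__ == "__main__":
    event = file_diff_event("app.py", "x = 1\n", "x = 2\n")
    print("\n".join(event["diff"]))
```

An unchanged file yields an empty `diff` list, so a consumer can skip events that carry no changes.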
Not coding today. I have a garden to take care of. 🚜😂
🤣
That's thoughtful of you, OpenAI 😅 Right when your new rates kick in and my vibe time has been cut in half since June 4th. 😂
😂
The LLM remains stochastic, but IF makes the path to code deterministic.
For the love of god! I just started something with Fable! The desktop app is not informing me that it's not Fable designing my project at this hour! Who is working on my code? 😂
The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Claude models is not affected. We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible. Read our full statement: anthropic.com/news/fable-myt…
😅
Anyone else getting absolutely cooked by the new Codex limits? 😩 Last month I was smashing GPT-5.5 xhigh on 2 parallel projects, 8h/day, and still had credits left for the week. Now? GPT-5.5 medium on the exact same setup and I’m at 50% after just 2 days. What the hell happened around June 4? Feels like they quietly slashed the effective quota for heavy users. Token-based was already tighter, but this reset hit different. If you’re a dev on a budget trying to actually ship with Codex… this one stings. Who else is feeling it?
Traditional way: You design the architecture, draw UMLs, flowcharts, DB schema by hand. Then break it into stories/features, build a backlog, estimate effort (planning poker etc.) and give the PM a timeline. You knew the road because you built the map.
So the classic question has become almost impossible: “Here are the features → here are the tasks → ??? → profit (and a delivery date).” We lost the ability to give reliable timelines.
Real talk: how are you handling task estimation and timelines in the AI era? Especially when the AI drags you into unfamiliar territory every other sprint. Drop your current workflow (or coping mechanism) below 👇 Curious how PMs and tech leads are dealing with this.