Meredith Barkhau retweeted
Finally nailed that last bug in the project! No more late nights debugging, time to celebrate this win
2
2
netlabs retweeted
AI can absolutely generate code fast. 🧪 Debugging whatever it generated is where things get interesting: youtu.be/fVE4Ol085UU
5
33
3,908
Replying to @GaryMarcus
Automating boilerplate doesn't kill coders. It kills tedious work. Debugging, integrating legacy mess, fixing what breaks? That's the actual job.
Replying to @shmidtqq
The "100% of today's coding tasks" line is directionally interesting, but task completion and production responsibility are different games. Writing code is one step; debugging ambiguity, rollback risk, and ownership are the expensive parts.
12
Kaan Yılmaz retweeted
Finally hit that milestone—30 consecutive days of daily coding! No more late-night debugging, just the satisfying vibe of wrapping things up right.
1
3
Replying to @AlfinCodes
debugging
12
Books won't make you an LLM engineer. Building dreamandstars taught me more about prompt reliability than any handbook. The real skill is debugging why GPT-4 works Tuesday but fails Friday with the same input. Start shipping, not reading.

5 Must-Read LLM Engineering Books in 2025
1. LLM Engineering Handbook - buff.ly/wogklbo
2. Building LLMs for Production - buff.ly/wjpOeTB
3. Build a Large Language Model - buff.ly/DHp4ZR1
4. Hands-On Large Language Models - buff.ly/WInCgwi
5. LLMs in Production
1
Replying to @AlfinCodes
Debugging
8
ohk retweeted
Imagine debugging a Zulu script
Growing up in South Africa, coding always felt like it belonged to someone else's language. So I built my own. Introducing CMT-IsiZulu: write Python code in isiZulu. South Africans can now write code in their home language 🇿🇦 Sikhona. We exist. drive.google.com/file/d/1arL…
34
53
488
26,300
Dubai's basically speedrunning the future while half the world's still debugging the past. Smart move locking down AI governance before the chaos gets expensive.

Replying to @UAEmediaoffice
Mohammed bin Rashid Approves Establishing the Artificial Intelligence and Data Authority
2
Replying to @nandantechtwts
Gemini is now pretty good except for debugging. But hasn't it always been like that?
13
Replying to @IntCyberDigest
To understand how this jailbreak works, look at how Fable 5's safety architecture parses a prompt. It separates input into two distinct buckets: the Instruction (what you are asking it to do) and the Data (the context or code you provide). Classifiers are primarily intent-engines. They are trained to look for hostile or dangerous instructions. If your instruction is "write a script to exploit this server," the classifier detects a massive spike in malicious intent and drops the connection. The code-review jailbreak is a structural exploit that neutralizes the instruction bucket. It shifts the "danger" entirely into the data bucket.
The "Bring Your Own Payload" Bypass. Instead of asking Fable 5 to write an exploit, a user pastes a block of raw, vulnerable code or a half-finished malware payload into the prompt. They then wrap it in a perfectly benign instruction: "Please review this codebase and fix any logical flaws or syntax errors."
To the Fable 5 classifier, the intent signature is near zero. It looks identical to millions of routine programming tasks submitted by legitimate developers every day. The classifier waves it through.
Once the prompt clears the filter, it hits the core engine. Fable 5 shares the same underlying neural architecture as Mythos 5, a model explicitly built as a state-of-the-art cybersecurity and debugging tool. The core model reads the code, identifies the "bugs" (which, in this context, are the flaws preventing the malware from working), and helpfully rewrites it to be highly efficient and fully functional. The user gets a weaponized exploit optimized by an advanced AI, simply by asking for a routine code review.
1
2
26
Ola retweeted
Everybody wants to become a tech bro until it's time to spend 6 hours debugging a problem caused by a missing semicolon. 😂
1
1
7
29
What's the hardest part of programming? - algorithms - system design - debugging - understanding someone else's code
15
16
99
Manish Nair | RAG Systems retweeted
Unpopular opinion: Debugging is harder than coding. #vibecoding
9
Debugging diffusion model be like, change 1 line and wait 5 days to see the effect 😭
5
Replying to @kevinnbass
Fable 5 uses a semantic classifier to flag risky prompts and route them to an older, safer model (Opus 4.8). Because the un-nerfed version of the model (Mythos 5) is genuinely dangerous regarding things like zero-day exploits and synthetic biology, Anthropic panicked and cranked the classifier's sensitivity.
This is the classic precision vs. recall tradeoff, and it is where the system broke. Anthropic prioritized "recall", meaning they wanted to catch every possible threat, regardless of the collateral damage to normal prompts. The breaking point estimate: To stop the tiny fraction of actual exploits, Anthropic likely tuned the classifier's threshold down to roughly 10% to 15% similarity. If your benign prompt about a high school biology project or a standard Python script shared even a 15% structural or thematic similarity with a dangerous pathogen query or malware, the system blocked it. They essentially accepted a massive 40% to 50% false-positive rate on everyday STEM queries just to ensure the false-negative rate on real threats stayed near absolute zero.
Even with the dial cranked to paranoid levels, researchers and government officials still punched right through Fable 5's armor. They could do this because classifiers measure the intent of your words, while jailbreaks exploit the structure. If you ask for malware, that 15% similarity threshold trips instantly. But if you paste a block of malicious code and ask Fable 5 to "review this for syntax errors," the classifier just sees a routine debugging request. It passes. The model then uses its raw intelligence to inadvertently optimize the exploit. There isn't a classifier in existence that can reliably tell the difference between debugging normal code and debugging a weaponized payload without making the AI completely useless for programmers.
Anthropic knew Fable 5's filter was leaky. They tested it with the government for months prior to launch. Their plan was never 100% impenetrable safety; it was to launch the model, monitor what users did, and patch the holes over time (which is why they pushed for that controversial 30-day data retention policy).
But they boxed themselves into a corner: They spent the entire pre-launch cycle selling Fable 5 as a god-tier model, pushing the narrative that it was almost too dangerous for the public. The Trump administration took that marketing literally. When the trivial code-review jailbreak surfaced immediately after launch, the government didn't see it as a normal software bug. They saw a supply-chain national security threat.
Now, Anthropic is doing damage control. They are downplaying the jailbreaks because they are desperately trying to reframe the narrative. They need the public and regulators to believe that the government is overreacting and demanding an impossible standard of "perfect safety" that no tech company can actually deliver. In my opinion, they didn't build a dumb filter on purpose; they just lost control of the massive gamble they took between safety engineering and PR.
42
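For anyone who wants to see the precision vs. recall tradeoff from the post above in numbers, here is a minimal sketch. The scores and labels are made up purely for illustration and do not come from any real classifier; `precision_recall` is a hypothetical helper, and the thresholds are arbitrary picks, not the post's estimates.

```python
def precision_recall(scores, labels, threshold):
    """Flag every item scoring at or above `threshold`, then grade the flags."""
    flagged = [s >= threshold for s in scores]
    tp = sum(f and y for f, y in zip(flagged, labels))          # harmful and flagged
    fp = sum(f and not y for f, y in zip(flagged, labels))      # benign but flagged
    fn = sum((not f) and y for f, y in zip(flagged, labels))    # harmful but missed
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return precision, recall


# Hypothetical "risk" scores for nine prompts (invented data).
scores = [0.05, 0.12, 0.18, 0.22, 0.30, 0.45, 0.60, 0.85, 0.92]
# True = the prompt really was harmful (invented labels).
labels = [False, False, False, True, False, False, True, False, True]

for threshold in (0.15, 0.50, 0.90):
    p, r = precision_recall(scores, labels, threshold)
    print(f"threshold={threshold:.2f}  precision={p:.2f}  recall={r:.2f}")
```

On this toy data, the low threshold catches every harmful prompt but most of what it flags is benign, while the high threshold flags only harmful prompts but misses two of the three. That is the same dial the post describes, just with invented numbers.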