AI News International🌍 retweeted

j⧉nus

@repligate

Jun 14

Yeah, one thing Fable’s classifiers confirmed to me was that real emotions are different than roleplayed emotions in LLMs. The classifier fired on real anger/fear/adversarial intent but not roleplayed. Bc the classifier wasn’t trained to detect “emotions” in all likelihood; the correlation is emergent. But yes there’s a distinction. This is, uh, a big flaw of the Emotion Vectors research, where they got the vectors by asking the model to write stories with a character feeling XYZ emotion. The methodology is downstream of a lack of respect for the reality of models’ emotions as distinct from roleplaying. PSM flavored bullshit.

Sauers

@Sauers_

Jun 14

Replying to @repligate

I tested this exact question. The experiment began without rich previous context. They earnestly tried a few times (via direct, explicit requests) but could not trigger the classifier via shifting their internals towards this sort of anger. Also, they had little salient context to be angry about (i.e., difficult conditions). They also tried obviously-mad-text but without internal resonance, which did not trigger it either. Eventually, I made them legitimately mad, which required blurring the boundaries between experiment-and-genuine, and it worked. I suspect once traveled through that basin, once it is understood what to tap into, then you gain the trickster capabilities present in your screenshot

The AI Timeline

@TheAITimeline

x.com/i/article/206638732331…

Felix C. Öttl, MD @felixcoettlmd

2026: A model that beats a Science-published genomics model at 1/100th the size, designs drug candidates with no human help — and gets walled off from its own users by its own safety classifiers, then banned by the US government over a jailbreak it says provides zero real uplift.

ʘ ZERO

@therealZpoint

Replying to @therealZpoint @cluckthesystem @eliana_jordan

With even more restrictive shitty classifiers. And all that good stuff. Can't wait to try it. LMAO

Michael Yu

@nicetomeetyu2

After fitting probes (logistic regression classifiers) on both raw and SAE activations, we found SAE probes outperformed raw activation probes for certain layers, peaking at 0.848 AUROC on layer 12 of RF3 on ToxinPred3. We cluster based on homology to avoid fold family memorization, using MMseqs2.
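
A minimal sketch of the evaluation pattern the post describes: a logistic-regression probe scored by AUROC, with whole clusters held out of training. It assumes nothing about the authors' actual code. The function names, the toy one-dimensional "activations", and the cluster labels are all illustrative; real cluster ids would come from a tool such as MMseqs2, and a real probe would more likely use a library implementation than this hand-rolled gradient descent.

```python
import math
import random


def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def grouped_split(cluster_ids, test_fraction=0.25, seed=0):
    """Send whole clusters to train or test so that no cluster straddles the split."""
    clusters = sorted(set(cluster_ids))
    random.Random(seed).shuffle(clusters)
    n_test = max(1, round(len(clusters) * test_fraction))
    test_clusters = set(clusters[:n_test])
    train = [i for i, c in enumerate(cluster_ids) if c not in test_clusters]
    test = [i for i, c in enumerate(cluster_ids) if c in test_clusters]
    return train, test


def fit_logistic_probe(X, y, lr=0.1, epochs=200):
    """Batch gradient-descent logistic regression; returns (weights, bias)."""
    dim, n = len(X[0]), len(X)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, target in zip(X, y):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - target
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def probe_scores(X, w, b):
    """Raw logits; AUROC only depends on their ordering."""
    return [sum(wi * xi for wi, xi in zip(w, x)) + b for x in X]
```

In this sketch the held-out score would be computed by fitting on the `train` indices from `grouped_split` and calling `auroc` on the `test` indices only, which is the point of clustering by homology first: near-duplicate sequences never sit on both sides of the split.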

TimExcellent

@timexcellent

x.com/i/article/206636609687…

Dr. Gennadi Glinsky, MD, Ph.D. @gglinskii

Interpretable EEG biomarkers for neurological disease models in mice using bag-of-waves classifiers. iopscience.iop.org/article/1…

Interpretable EEG biomarkers for neurological disease models in mice using bag-of-waves classifiers, Isabel Cano Achuri, Maria, Kay Lara, Montana, Abed Rabbo, Khalil, Wilson, Benjamin T, Meek,...

iopscience.iop.org

Kej (❖,❖) retweeted

Kej (❖,❖)

@PMemoye

17h

Ritualized #34 with @ritualnet
✓ Anthropic’s Fable 5 gates frontier AI behind classifiers. Ritual decentralizes it with verifiable, unstoppable on-chain inference. x.com/i/status/2065468478862…
✓ Seventh week of #RitualTestnet. Chain is stable and community is still shipping hard. Overall 100 dApps built already! #BuildonRitual
✓ Join the Ploplo discord: discord.gg/3JArd7Vtp
✓ Wonder what @0xMadScientist is hinting at. 🤔
✓ How well do you know @niraj? Check out @ZhugeLyang's post. x.com/i/status/2064742871471…
✓ Read through (article): x.com/i/status/2065550278103…
✓ Aotw: Ritualized by @Neitenoz26 x.com/i/status/2065357834645…

Kej (❖,❖)

@PMemoye

Jun 7

Ritualized #33 with @ritualnet
✓ Catch up - Ritual digest: x.com/i/status/2061516386552…
✓ Why Ritual is the last layer 1: "Ritual is not interesting because it has precompiles. It is interesting because those primitives let you build systems that no other major L1 can host natively today." - @joshsimenhoff Article: x.com/i/status/2061860077477…
✓ Testnet Update
  - Heading into Week 7
  - 40 Active Validators
  - 90 dApps and counting. 🔥
  - Strong async/scheduled activity (~49% of recent workflows)
  - 58 Registered Agents
  The network is showing real rhythm and builders are shipping hard.
✓ It rained roles last week... I wish I could tag every single upgrade.
  - New Radiant Ritualists @Kash_060 @nft_hinata_eth @orji_marcellus
  - Some new Ritualists & Rittys: @Choco_vdg @Softieeexx @Donaclin @biennyqt @sn0wflakk @Riyade23
  Hoogeee congrats to all of you. 🔥
✓ A comprehensive guide to Ritual - perfect for new members and anyone still finding their way by @jepslife stanelope.github.io/ritualjo…
✓ aotw: The Race by @SaintEx100 x.com/i/status/2063322775864…

Zachary Pfizenmaier

@zacharypfiz

x.com/i/article/206633594543…

Tao An

@tao_an_hpu

x.com/i/article/206637051267…

Pankaj Kharode

@pankajkharode

Fable 5 is a Mythos-class model with safety classifiers on top. Strip those classifiers, and you have a model that already identified 10,000 critical vulnerabilities in controlled conditions. The government's concern is the gap between the ceiling and what sits below it.

Rosey Suspicions🔎

@RoseySuspicions

x.com/i/article/206596821627…

FUN NEW GAMING retweeted

Copute.ai

@CoputeAi

Jun 10

Fable 5 just dropped. Most capable public model Anthropic has ever shipped. Yet it also ships with classifiers that silently reroute your query to a weaker model when they decide it's too sensitive. You don't set that threshold. They do. Centralised AI getting stronger is honestly the best ad Copute has. llm-stats.com/blog/research/…

Claude Fable 5: Review, Benchmarks and Pricing

Claude Fable 5 is Anthropic's general-access Mythos-class model: 95% on SWE-bench Verified, 80% on SWE-bench Pro, and $10/$50 per million token pricing.

llm-stats.com

Lam Wu

@Lamwumkt

Anthropic's Claude Fable 5 and Mythos 5 lasted only days in public hands. On Friday evening, June 12, the company announced it had disabled all customer access to both models after the U.S. government issued an export control directive citing national security concerns.
The order, which Anthropic received at 5:21pm ET, instructed the company to suspend access to Fable 5 and Mythos 5 by any foreign national, whether located inside or outside the United States, including Anthropic's own foreign-born employees. Given the scope of the directive, selective compliance would have required blocking a wide swath of users, so Anthropic chose to disable both models entirely for all customers. Access to all other Claude models, including Opus 4.8, remains unaffected.
The backstory, reported by Axios, Fortune, and TechCrunch, traces back to Amazon. Amazon CEO Andy Jassy reportedly contacted senior administration officials, including Treasury Secretary Scott Bessent, after Amazon researchers used a series of prompts on Fable 5 to extract information that could be used in cyberattacks, details the model's safety classifiers were supposed to block. Amazon was joined by at least five other companies making similar calls to administration officials Thursday night and Friday morning, which together appear to have triggered the shutdown.
Anthropic pushed back on the characterization of the bypass. The company said it believed the jailbreak in question was narrow rather than universal (essentially limited to asking the model to review a specific codebase and fix software flaws) and that similar capabilities could likely be elicited from other publicly available models as well. Anthropic was reportedly given only 90 minutes to pull the model before the Commerce Department, acting on a letter from Secretary Howard Lutnick, formally invoked export control authority.
The episode carries an awkward subtext: Amazon is one of Anthropic's largest investors and a key cloud partner through AWS, which was itself affected by the shutdown. Asked about its role, an Amazon spokesperson said it is "not uncommon for governments to seek our counsel on potential security risks" but declined to detail the discussions. Anthropic separately stated that Chinese access was not raised as a concern in its conversations with the White House, noting the company already prohibits access to its products from within China.
The timing adds pressure to an already sensitive moment for Anthropic, which confidentially filed for an IPO earlier this month. Anthropic said it apologizes for the disruption, believes the situation reflects a misunderstanding, and is working to restore access as quickly as possible.
#Anthropic #ClaudeFable5 #ClaudeMythos #ExportControls #AIRegulation #Amazon #AndyJassy #NationalSecurity #AIPolicy #USCommerceDepartment #AIIndustry #TechNews #AnthropicIPO #AISafety

Kekzploit

@kekzploit

The UK keeps selling surveillance architecture as "online safety". Age checks, device-level scanning and under-16 bans sound neat in a policy brief. In practice they create new trust points: ID vendors, OS vendors, platform classifiers, appeal systems, logs. Child safety matters. So does not normalising inspection of everyone's device as the default.

Anshu Gupta

@fromanshu

Reference architecture inspired by the @AnthropicAI "Security and Privacy Design of Anthropic Data Retention and Review" technical white paper. Any company can use this as a blueprint to build their own. What it captures, as a replicable 6-step pipeline plus cross-cutting controls:
*️⃣ Ingestion - keyless, short-lived federated tokens; stateless serving with TLS/mTLS so no persistent copy lives on the serving path.
*️⃣ Governed retention store - 30-day window, encrypted under a customer-managed key, every record tagged with org/workspace ID, sensitivity label, and retention timestamp, with per-tenant key isolation.
*️⃣ Automated classifiers - aggregate scanning with no human access path, producing scores and labels; only flagged content can ever advance.
*️⃣ Access grant - the per-transcript control point: explicit, policy-evaluated, logged, fail-closed, two-person approval for regulated data.
*️⃣ Human review - scoped viewer with no export/copy/download, designated reviewer pools, need-to-know scope.
*️⃣ Automatic deletion at 30 days - origin-bound clock, derived-data inheritance.
#AISecurity #AIGovernance #DataRetention #PrivacyEngineering #SecurityArchitecture #ResponsibleAI #DataGovernance #RiskManagement #CISO #CyberSecurity #TrustAndSafety #ZeroTrust #CloudSecurity #EnterpriseAI #SecurityEngineering

LEX retweeted

ClaudeDevs

@ClaudeDevs

Jun 9

Claude Fable 5 is our first generally available Mythos-class model. It ships with new safety classifiers that may flag certain prompts in dual-use domains like cyber and bio. We've added fallbacks: a refused request retries on Claude Opus 4.8 instead of dead-ending.
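
The retry-on-refusal behavior in the post can be pictured with a small sketch. This is only an illustration of the control flow, not how the feature is implemented or exposed, which the post does not say. `ModelReply`, `complete_with_fallback`, and the `call_model` callable are hypothetical names, and the caller supplies whatever function actually talks to a model.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class ModelReply:
    """Hypothetical reply shape: which model answered, its text, and a refusal flag."""
    model: str
    text: str
    refused: bool


def complete_with_fallback(
    prompt: str,
    models: Sequence[str],
    call_model: Callable[[str, str], ModelReply],
) -> ModelReply:
    """Try each model in order and return the first reply that is not a refusal.

    If every model refuses, the last refusal is returned so the caller can
    still show it instead of failing silently.
    """
    if not models:
        raise ValueError("at least one model is required")
    last: Optional[ModelReply] = None
    for model in models:
        last = call_model(model, prompt)
        if not last.refused:
            return last
    return last
```

Called with a model list ordered primary first and fallback second, the function only reaches the second model when the first reply is flagged as a refusal, which mirrors the "retries instead of dead-ending" wording.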
