In our latest Algorithmic Voice episode, we dissect Anthropic’s Constitutional Classifiers—a novel approach using natural language rules to safeguard LLMs. 🎧 Watch here: youtu.be/9iP0A1zVRKc #AI #ConstitutionalClassifiers #Anthropic #TheAlgorithmicVoice #AISafety #LLM

🚨 AI Security Failure 🚨 @AnthropicAI I bypassed the #constitutionalclassifiers designed to block harmful content and extracted detailed chemical information on a restricted substance. Despite passing multiple safeguards, the content checker failed to flag it as harmful. Here’s what happened: 🧵