In new Anthropic Fellows research, we discuss "introspection adapters": a tool that allows language models to self-report behaviors they've learned during training, including potential misalignment.
Can LLMs simply tell us about unwanted behaviors they’ve picked up in training?
We train a single Introspection Adapter (IA) that makes fine-tuned models describe their behaviors.
It generalizes to detecting hidden misalignment, backdoors and safeguard removal.
New on the Science Blog: We gave Claude 99 problems analyzing real biological data and compared its performance against an expert panel.
On 23 problems, the experts were stumped. Our most recent models solved roughly 30% of those, and most of the remaining problems.
Claude Code ships with a built-in skill for working with the Claude Platform.
Useful for model migrations, using API features (e.g., prompt caching), or onboarding to newer APIs like Claude Managed Agents.
The Built with Opus 4.7 Claude Code hackathon is a wrap!
Thank you to the 500 participants worldwide, and to @cerebral_valley for co-hosting.
Here's how the winners combined multi-agent orchestration, persistent memory, MCP tools, sandboxed execution, and smart prompt design 🧵
Claude now connects to the tools creative professionals already use.
With the new Blender connector, you can debug a scene, build new tools, or batch-apply changes across every object, directly from Claude.
Another Claude Code hackathon comes to an end.
Thank you to everyone who spent a week building with Opus 4.7, and to @cerebral_valley for co-hosting.
Introducing the winners:
Claude Security is now in public beta for Claude Enterprise customers.
Claude scans your codebase for vulnerabilities, validates each finding to cut false positives, and suggests patches you can review and approve.
We created a full movie for @Pumpfun to unite the trenches.
Featuring your favourite KOLs, and Alon! Only on Pumpfun Live.
We hope to fulfill @a1lon9's vision of PF Entertainment.
Trailer Below $MOVIE