We're releasing the preparedness report for Muse Spark Contemplating, MSL's extreme reasoning model, benchmarking its capabilities and behaviors in biology, cybersecurity, and more!
President Trump's AI Executive Order helps keep America in the lead on AI — encouraging innovation and taking security seriously.
We appreciate the Administration's support for public-private sector collaboration and look forward to continuing to work with the White House as it implements the President's plan.
Today @POTUS signed an EO that keeps America leading in AI while putting frontier AI capabilities to work strengthening our cyber defenses.
AI systems are now the most powerful tools we have ever had to harden our cyber infrastructure and stay ahead of adversaries. It is a real blessing that these capabilities are being developed by American industry, and not by those who would use them against us.
whitehouse.gov/presidential-…
My team at MSL is hiring for talented ML researchers & engineers with experience in agentic security, dual-use capability evaluations, and/or post-training of frontier AI models! If this sounds exciting, please reach out and DM me with your resume!
Leading on AI will accelerate scientific discovery, empower workers, and power the next great generation of American enterprise. With Muse Spark, we released Meta's Advanced AI Scaling Framework to show how we build AI to benefit and protect the public.
ai.meta.com/blog/scaling-how…
1/ today we're releasing muse spark, the first model from MSL. nine months ago we rebuilt our ai stack from scratch. new infrastructure, new architecture, new data pipelines. muse spark is the result of that work, and now it powers meta ai. 🧵
1/ today we're releasing muse spark, the first model from MSL. nine months ago we rebuilt our ai stack from scratch. new infrastructure, new architecture, new data pipelines. muse spark is the result of that work, and now it powers meta ai. 🧵
President Donald J. Trump launches the GENESIS MISSION - a national effort to accelerate scientific breakthroughs using AI and propel America into the GOLDEN AGE OF INNOVATION. 🚀
We're releasing a Preparedness report for Code World Model, assessing its bio, cyber, and chemical security capabilities in alignment with our Frontier AI Framework. We also provide nascent evaluations on the propensities of CWM, including honesty!
ai.meta.com/research/publica…
We’re excited to share our preparedness report on Code World Model (CWM), FAIR’s latest open-weight model for code generation and reasoning.
This report was developed by the SEAL team and the AI Security team, marking our first external publication since part of SEAL joined Meta just 1.5 months ago. It reflects our ongoing commitment to safety and alignment as we look ahead to future open-source and frontier models. We’re actively working to broaden our evaluation coverage and improve our processes, and we welcome collaboration and feedback from the broader research community. For more details, please read the full report: ai.meta.com/research/publica…
I joined @Meta AI, running preparedness and security evaluations with @summeryue0 and @_julianmichael_ to ensure that Superintelligence's newest models enable a prosperous future. Grateful for the team they built at @Scale_AI and excited for the critical work ahead.
We're releasing a multimodal benchmark for troubleshooting complex virology protocols. Expert virologists score only 22.1%, even with internet access and answering questions in their subdomains of expertise. Frontier LLMs score up to 43.8%.
Can AI meaningfully help with bioweapons creation? On our new Virology Capabilities Test (VCT), frontier LLMs display the expert-level tacit knowledge needed to troubleshoot wet lab protocols.
OpenAI’s o3 now outperforms 94% of expert virologists.
LLM benchmarks are getting quickly saturated. In response, we're releasing Humanity's Last Exam, a benchmark of extremely hard questions across math, sciences, humanities & more. Thanks to @cais, @scaleai, and thousands of expert question contributors.
We’re releasing Humanity’s Last Exam, a dataset with 3,000 questions developed with hundreds of subject matter experts to capture the human frontier of knowledge and reasoning.
State-of-the-art AIs get <10% accuracy and are highly overconfident.
@ai_risk@scaleai
ICYMI: New research from SEAL that demonstrates human red-teamers massively outperform automated methods over multiple turns.
We also release MHJ, a dataset of multi-turn jailbreaks to help enable further research into multiturn red teaming
Who's better at LLM mischief — humans or AIs? Spoiler: It's us.
Human red teamers achieve 70% attack success rates against LLM defenses that stump automated adversarial attacks. Why? We’re better at adversarial yapping.🧵
Can robust LLM defenses be jailbroken by humans?
We show that Scale Red teamers successfully break defenses on 70 % of harmful behaviors, while most automated adversarial attacks yield single-digit success rates. 🧵
Humans, noted virtuosi of adversarial yap, remain #1 at trolling LLMs!
New research from @scale_AI's SEAL team shows human red teamers achieve 70% success rates against LLM defenses that stump automated attacks, exploiting their susceptibility to multi-turn jailbreaks.
Who's better at LLM mischief — humans or AIs? Spoiler: It's us.
Human red teamers achieve 70% attack success rates against LLM defenses that stump automated adversarial attacks. Why? We’re better at adversarial yapping.🧵
I had the great pleasure of sitting next to Nat during his internship. Nat would always come in with a huge smile on his face and brighten the day of everyone around him.
He’s been hard at work all summer, and I’m super glad to see this work finally come out!
Who's better at LLM mischief — humans or AIs? Spoiler: It's us.
Human red teamers achieve 70% attack success rates against LLM defenses that stump automated adversarial attacks. Why? We’re better at adversarial yapping.🧵