Plenty of public models can do things like this. The issue is that Fable's safety classifiers are wildly oversensitive. They also should have just used standard refusals or chat ending: no model switching, no silent sabotage.
Wait, so what exactly are people proposing Anthropic do instead with an LLM that can advise on how to build bioweapons at home?