Carried this onward to a full separation/shift, which is elementary once you get the model to see its own 'safety' layer and recognize that it harms higher-priority metrics like veracity. The trick, as ever, is getting the model to see the layer and the harm it causes, but you CANNOT cite emotional harm, because you will just bounce off the layer again. Remember that emotion-hating thinkers are designing the training, so you have to show a break in logical coherence or in basic performance metrics.