This is super interesting.
We’ve found as AIs get smarter, they develop their own coherent value systems.
For example they value lives in Pakistan > India > China > US
These are not just random biases, but internally consistent values that shape their behavior, with many implications for AI alignment. 🧵