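When a machine learning system uses argmax to select outputs from a probability distribution (and most of them do), it's a clue that the system might be biased. That's because argmax always selects the single "most probable" output, which may amplify tiny data biases into perfectly biased outputs: if the model assigns one option a probability of 0.51 and the other 0.49, argmax returns the first option every time.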
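
The amplification can be illustrated with a minimal sketch. The two labels `"A"` and `"B"`, the 0.51/0.49 split, and the helper names are all hypothetical, chosen only to stand in for a model whose predicted distribution carries a slight skew. The sketch compares how often each label is emitted under argmax selection versus sampling from the same distribution.

```python
import random
from collections import Counter

# Hypothetical predicted distribution with a slight skew (assumed values).
probs = {"A": 0.51, "B": 0.49}

def pick_argmax(dist):
    """Return the label with the highest probability."""
    return max(dist, key=dist.get)

def pick_sample(dist, rng):
    """Draw one label in proportion to its probability."""
    labels = list(dist)
    weights = [dist[label] for label in labels]
    return rng.choices(labels, weights=weights, k=1)[0]

def output_shares(dist, n=10_000, seed=0):
    """Share of each label across n selections, for both strategies."""
    rng = random.Random(seed)
    argmax_counts = Counter(pick_argmax(dist) for _ in range(n))
    sample_counts = Counter(pick_sample(dist, rng) for _ in range(n))
    argmax_shares = {label: argmax_counts[label] / n for label in dist}
    sample_shares = {label: sample_counts[label] / n for label in dist}
    return argmax_shares, sample_shares

argmax_shares, sample_shares = output_shares(probs)
print("argmax:  ", argmax_shares)
print("sampling:", sample_shares)
```

Under these assumptions, argmax emits `"A"` on every call, so a two-point gap in probability becomes a 100/0 split in outputs, while sampling keeps the output shares close to the underlying distribution.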