A naive strategy is to split the data classwise and run conformal once per class.
But with many classes or limited data, each class gets too few calibration points, so the results are poor (e.g., very large prediction sets).
In clustered conformal prediction, we cluster classes that have similar score distributions and pool their data! (3/4)