Gradient Boosted Tree Inference is not what you think it is
GBT inference speed is not limited by arithmetic; it is limited by memory access. Two structural choices made at training time, tree depth and ensemble-to-core mapping, determine whether latency is...
xelera.io