👀 we care a lot about correctness, ran many evals and stared at many tensors to compare them. numerics of vLLM on hopper should be solid and verified! if you run into any correctness issue on vLLM, we would love to know and debug them!
Heads-up for developers trying gpt-oss: performance and correctness can vary a bit across providers and runtimes right now due to implementation differences.
We’re working with inference providers to make sure gpt-oss performs at its best everywhere, and we’d love your feedback!