Stop being rude.
1. Because there's no true concurrency with llama.cpp, and there's no way to justify the cost of Nvidia inference cards without multi-user concurrency. Single-stream Kimi K2.5 for $60k is a mean joke.
2. Yes, I have a way to do that for $20k less than prettybox. Buy two more 6000 Pros and hook them up to a WRX90 chipset. Guaranteed entire system <$40k.