Models I’d be most excited to see:
* A Llama 3.1 8B-level model with sub-100ms latency (per OpenRouter, Groq’s latency is 390ms)
* Better, faster models that run locally in under 32GB of RAM
* Sonnet 3.5, but cheaper and with better multilingual support
* The obvious: new frontier models, reliability across the board