The LLM cake in India, or for that matter, the EU, is perhaps not a startup's to take. At some point knowledge about training diffuses enough and open-source algorithmic innovation brings down costs enough that it's commonplace to train foundational models surpassing humans in intelligence.
We may hope that at some point, by virtue of test-time scaling, the balance shifts in favour of faster models: already, it appears that intelligence scales logarithmically with test-time compute. If the slope of that curve is related to the extent of pretraining, I imagine it's possible to procure a smaller, faster model that thinks 10x the number of tokens in the same wallclock as a larger model.
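A toy sketch of that argument, assuming a log-linear relation score = intercept + slope * ln(tokens). The functional form, the function names and every number below are hypothetical, chosen only to show how a token budget trades against a lower baseline; none of it is measured data.

```python
import math


def score(tokens: float, intercept: float, slope: float) -> float:
    """Toy model: capability grows with the log of thinking tokens."""
    return intercept + slope * math.log(tokens)


def tokens_to_match(target: float, intercept: float, slope: float) -> float:
    """Invert the toy model: tokens needed to reach `target`."""
    return math.exp((target - intercept) / slope)


# Hypothetical parameters: the small model starts 10 points lower
# but is assumed to share the same slope as the large one.
large = {"intercept": 50.0, "slope": 5.0}
small = {"intercept": 40.0, "slope": 5.0}

large_tokens = 1_000
target = score(large_tokens, **large)

# How many times more tokens the small model must think to tie.
ratio = tokens_to_match(target, **small) / large_tokens
print(f"small model needs {ratio:.1f}x the tokens")  # 7.4x
```

Under these made-up numbers the small model ties if it can think about 7.4x as many tokens, so a model that is 10x faster per token would come out ahead in the same wallclock. If its slope were lower as well, the required ratio would grow with the token count instead of staying constant.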
In that case, the game will shift again, and may we then be ready with a manufacturing infra that can produce good ASICs.
But! There are more models to make, more modalities. Large, overparameterized models for computer-use, for agriculture, for logistics and for transport networks. Those are just as important, and perhaps one of the best uses of the superintelligence API. That game is not yet lost.
To train a GPT-class 1T-parameter model from scratch, including failed runs, data acquisition and cleaning, RLHF, post-training, and team/people, will likely require $250M of compute on an aggressive 3-4 month schedule (i.e. more reserved GPUs), and $500-600M all-in IF you do a dense one. MoE with fp8 will cut costs by 1/10th depending on how many active params you have. If you want SOTA, however, the budgets go significantly higher on test-time compute, post-training RL, and data/synthetic generation, and very high on talent. Maybe $2-4B all-in. After that comes serving the model. The talent is key to reaching SOTA or beating it - and then you have to ensure the model is useful enough to sustain inference volume over time, for which the capital will come if there is usage / TAM. So this is not so much about raising $50-60B, or raising it all at once as the OP says - we are investors in Mistral, Sarvam, Reflection and Anthropic, and they all scaled capital over time as their models got adoption - but the early bottleneck is more on talent and GPUs, at the scale where you can do interesting things.
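For a feel of where a compute figure of that order could come from, here is a back-of-envelope sketch of the raw pretraining compute for one dense run only. It uses the common rule of thumb of roughly 6 FLOPs per parameter per training token. Apart from the 1T parameter count, every input (token count, per-GPU throughput, utilization, price) is an assumption picked for illustration, not a number from the quote, and failed runs, data, post-training and people are all left out.

```python
def training_flops(params: float, tokens: float) -> float:
    """Rule-of-thumb training cost for a dense transformer: ~6 * N * D."""
    return 6.0 * params * tokens


def gpu_hours(flops: float, peak_flops_per_gpu: float, utilization: float) -> float:
    """GPU-hours needed at a given peak throughput and achieved utilization."""
    effective_flops_per_second = peak_flops_per_gpu * utilization
    return flops / effective_flops_per_second / 3600.0


def compute_cost_usd(params: float, tokens: float, peak_flops_per_gpu: float,
                     utilization: float, usd_per_gpu_hour: float) -> float:
    """Cost of one clean pretraining run; excludes everything else."""
    hours = gpu_hours(training_flops(params, tokens), peak_flops_per_gpu, utilization)
    return hours * usd_per_gpu_hour


# All values below except `params` are hypothetical assumptions.
cost = compute_cost_usd(
    params=1e12,               # 1T dense parameters (from the quote)
    tokens=20e12,              # assumed: about 20 training tokens per parameter
    peak_flops_per_gpu=1e15,   # assumed round number for a modern accelerator
    utilization=0.4,           # assumed fraction of peak actually achieved
    usd_per_gpu_hour=2.0,      # assumed reserved-capacity price
)
print(f"one clean dense run: ~${cost / 1e6:.0f}M")
```

Changing any single assumption moves the result proportionally, which is why estimates like this are only good to within a small multiple.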