Answering straight up, without nuance or qualification (e.g. Continuous Learning
#CL would completely blur line between training and inference)…
1️⃣ What percentage of current AI cloud revenue and semi demand is lab training?
Training 30~50% of current AI cloud revenue/spend (and 40~60% of leading-edge GPU demand) rn.
Inference is majority of ongoing workloads and is growing faster, ~½ of AI compute in 2025 and ~⅔ in 2026.
2️⃣ What percent of forward views of demand are driven by lab training?
Consensus training demand growth decelerating rapidly: 20~40% in 2026-27e, falling to LDD/HSD thereafter, because Ship of Theseus model improvements elongate major new model release cadence (e.g. fine-tuning/post-training/RAG on smaller, cheaper runs rather than training 100B-parameter models from scratch).
Inference estimates 70~90% of total AI compute by 2030.
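Quick sanity check on the arithmetic above, as a rough sketch only. It assumes inference and training split AI compute ~50/50 in 2025, training grows 20~40%, and inference reaches ~⅔ share in 2026. The function name and the single-period framing are illustrative assumptions, not anyone's actual model.

```python
def implied_inference_growth(share_now, share_next, training_growth):
    """Multiple by which inference compute must grow in one period so that
    its share of total compute moves from share_now to share_next, given
    the growth rate of training compute. Total compute today is 1.0."""
    if not (0.0 < share_now < 1.0 and 0.0 < share_next < 1.0):
        raise ValueError("shares must be strictly between 0 and 1")
    # Training compute next period, in units of today's total compute.
    training_next = (1.0 - share_now) * (1.0 + training_growth)
    # Inference compute needed so that inference / total == share_next.
    inference_next = training_next * share_next / (1.0 - share_next)
    return inference_next / share_now


if __name__ == "__main__":
    for growth in (0.20, 0.30, 0.40):
        multiple = implied_inference_growth(0.5, 2 / 3, growth)
        print(f"training +{growth:.0%} -> inference x{multiple:.2f}")
```

Under these assumptions, moving from ½ to ⅔ share in a single year requires inference compute to more than double, which is why the share shift and the training growth rate have to be read together.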
3️⃣ How much of this training demand is required if the future is enterprises doing post-training on open source models?
Widespread open source adoption could cut overall training compute maybe ~50%, but 76% of enterprise LLM users already leverage some open source, and adoption is constrained rn (e.g. data quality, talent, final-mile optimization).
4️⃣ How much inference capacity is unlocked if all this old training capacity is repurposed for inference?
Substantial, maybe 30~50% of current high-end GPU capacity, already reflected in most analysts' capex models.
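Back-of-envelope version of that answer, as a sketch under loud assumptions. If training occupies a fraction of today's high-end GPU capacity and all of it is repurposed, the uplift relative to existing inference capacity is that fraction divided by the remainder. The `efficiency` parameter is a hypothetical knob for repurposed training hardware serving inference less efficiently than purpose-built capacity; it is not a sourced figure.

```python
def inference_uplift(training_share, efficiency=1.0):
    """Fractional increase in inference capacity if all training capacity
    is repurposed for inference.

    training_share: fraction of current capacity used for training.
    efficiency: hypothetical discount (0..1) on repurposed hardware.
    """
    if not 0.0 <= training_share < 1.0:
        raise ValueError("training_share must be in [0, 1)")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be in [0, 1]")
    inference_share = 1.0 - training_share
    return efficiency * training_share / inference_share


if __name__ == "__main__":
    for share in (0.30, 0.40, 0.50):
        print(f"training share {share:.0%} -> inference capacity "
              f"+{inference_uplift(share):.0%}")
```

With a 50% training share and no efficiency discount, repurposing doubles inference capacity; at 30% the uplift is well under half.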
6️⃣ How do inference demand projections change if on-device inference takes off (like Apple bulls think with new Siri AI)?
Edge inference ~70% share in certain segments, if successful at scale (privacy, latency, zero marginal cloud cost) – offsets some hyperscaler revenue but doesn't kill enterprise/cloud use cases needing massive context or heavy compute.
7️⃣ Is GPU, CPU, and memory demand constant across workloads? How do they vary?
Training is GPU-heavy (high parallel FLOPS/HBM memory) and lower CPU.
Inference varies, with GPUs dominant; CPUs increasingly prominent for orchestration, preprocessing, agentic, RL; and memory critical for KV cache (DRAM/HBM) and storage (NAND/SSD).
Agentic and enterprise workloads usually increase CPU:GPU ratios in favor of CPU.
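Why memory is critical for KV cache, as a minimal sketch: the cache stores one key and one value vector per layer, per KV head, per token. The model shape below (32 layers, 8 KV heads, head dimension 128, 16-bit values) is a made-up illustrative config, not any specific model.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch=1,
                   bytes_per_value=2):
    """Approximate KV cache size in bytes for a decoder-only transformer.

    The leading factor of 2 covers keys and values. Ignores paging
    overhead, quantization, and any cache compression scheme.
    """
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_value
    return per_token * seq_len * batch


if __name__ == "__main__":
    GIB = 2 ** 30
    # Hypothetical model shape, chosen only for illustration.
    for context in (8_192, 32_768, 131_072):
        size = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128,
                              seq_len=context)
        print(f"context {context:>7} tokens -> {size / GIB:.1f} GiB per sequence")
```

For this hypothetical shape, a single 131,072-token sequence needs 16 GiB of cache, and the requirement scales linearly with both context length and concurrent sequences.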
8️⃣ How important is latency for the majority of future enterprise workloads (low if agentic)?
Latency matters but is secondary for many enterprise/agentic cases. More here:
x.com/i/status/2054976988775….
1️⃣0️⃣ Are memory requirements for NAND and SSD also constant? How do they vary?
Training needs massive storage for datasets/checkpoints.
Inference varies – high for large context windows/KV cache (but often in DRAM); lower for cached/static models; RAG/vector DBs boost SSD/NAND demand; while enterprise on-prem may increase local storage needs vs cloud.
1️⃣1️⃣ What % of future enterprise open-source workloads will be on-prem vs cloud? How does that compare vs today?
Hybrid, with on-prem share 30~60% (e.g. sensitive/regulated workloads).
Underappreciated how people/businesses/enterprises will be running agentic AI workloads ambiently overnight – load balancing everything from compute to energy…
Somehow also underappreciated how much demand for AI tokens there is during business hours – latency and inference speed will always matter for a huge swath of the market…