"I don't really care what the total Token volume is, nor what the total revenue is." Huawei Cloud CEO Zhou Yuefeng's opening remark at the June 5 INSPIRE conference in 2026 was a deliberate break from the AI cloud narrative that has dominated the past six months.
Alibaba Cloud and ByteDance's Volcano Engine have anchored their AI cloud stories on daily Token call volume and MaaS revenue scale. Model labs including Moonshot AI, DeepSeek, and Zhipu have kept cutting inference prices. The industry conversation has orbited around one question: how many Tokens are moving through whose pipes.
Zhou offered a different answer and a different KPI. What matters, he said, is "whether each Token genuinely improves productivity."
Three structural choices make this more than rhetorical positioning.
The first is the compute substrate. Huawei Cloud runs on a fully domestic stack: Ascend, Kunpeng, CANN, EulerOS, all self-built. Zhou was direct about the constraint: Huawei cannot use anyone else's compute, so it must turn domestic chips into an industry-grade answer rather than compete on raw scale with the NVIDIA-plus-public-cloud plane. "I'm not willing to compare revenue or ranking with other cloud companies," he said. "It doesn't mean anything."
The second is the customer base. Where internet-platform clouds lean on consumer traffic and developer ecosystems, Huawei Cloud concentrates on government, finance, and state-owned enterprises. Its hybrid cloud has held the top market share in those verticals for years, serving over 5,500 global clients. Zhou's advice to such clients reflects Huawei's own architecture: do not build your own 100,000-card cluster. Keep data local, pull AI compute and models from the public cloud, and use confidential inference and training techniques to bridge the sovereignty-scale gap. The pitch is essentially a pipeline for delivering public-cloud iteration speed to organizations that cannot fully migrate.
The third is the ecosystem bet. Huawei open-sourced CANN, EulerOS, the CCE Volcano scheduler, and the ModelArts toolchain. The new agent platform AgentArts ships an open-source edition, openJiuwen, sharing over 90% of its core with the commercial version. At the conference, Huawei also announced a partnership program with more than 20 model providers, including Zhipu, DeepSeek, MiniMax, Kimi, StepFun, Baidu, Meituan's LongCat, and iFlytek Spark.
The logic is straightforward: when domestic compute remains capacity-constrained and capability-limited, the second compute plane can only establish itself by offering the widest possible model selection and the most open ecosystem.
That hardware and customer reality sets the boundaries. The positive bet is what Huawei calls "Agentic Infra," a four-part product stack that reframes AI cloud competition around whether agents can actually run inside enterprises rather than how many Tokens they consume.
The components target the engineering problems that block enterprise agent deployment. The AICS cluster compresses Token latency on a 100,000-card system to under 10 milliseconds. The AMS memory storage uses NPU-direct access to content management systems for petabyte-scale memory, addressing long-horizon task retention for agents. The CCE Volcano Next scheduler runs training and inference on shared pools, lifting resource utilization by over 30%. The AgentSphere security runtime uses lightweight sandboxes to achieve 100-millisecond startup and batch creation at 100,000 instances per minute.
ModelArts Next adds model routing across 15-plus SOTA models with cost-priority, performance-priority, and balanced modes, claiming over 95% routing accuracy and average cost reductions of 20%.
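The three routing modes can be pictured with a minimal sketch. This is not ModelArts Next's actual API or algorithm, which the announcement did not detail. The `ModelOption` type, the `route` function, the price and quality figures, and the balanced scoring rule (quality minus normalized cost) are all hypothetical, chosen only to show how the same candidate pool can yield different picks under each mode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    """A candidate model. All fields are illustrative assumptions."""
    name: str
    cost_per_1k_tokens: float  # hypothetical price, arbitrary currency unit
    quality: float             # hypothetical benchmark score in [0, 1]


def route(models: list[ModelOption], mode: str) -> ModelOption:
    """Pick one model according to a routing mode.

    Modes mirror the three named in the article; the selection rules
    themselves are assumptions made for this sketch.
    """
    if not models:
        raise ValueError("no candidate models")
    if mode == "cost":
        # Cost-priority: cheapest model wins.
        return min(models, key=lambda m: m.cost_per_1k_tokens)
    if mode == "performance":
        # Performance-priority: highest quality score wins.
        return max(models, key=lambda m: m.quality)
    if mode == "balanced":
        # Balanced: quality minus cost normalized by the priciest candidate.
        max_cost = max(m.cost_per_1k_tokens for m in models)

        def score(m: ModelOption) -> float:
            normalized = m.cost_per_1k_tokens / max_cost if max_cost > 0 else 0.0
            return m.quality - normalized

        return max(models, key=score)
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical candidate pool; names and numbers are made up.
CANDIDATES = [
    ModelOption("small", cost_per_1k_tokens=0.2, quality=0.60),
    ModelOption("medium", cost_per_1k_tokens=0.5, quality=0.80),
    ModelOption("large", cost_per_1k_tokens=2.0, quality=0.90),
]

for mode in ("cost", "performance", "balanced"):
    print(mode, "->", route(CANDIDATES, mode).name)
```

A production router would presumably also weigh the request itself, such as task type or context length, which is where a figure like routing accuracy would come in. This sketch ranks models on static attributes only.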
The clearest sign of where Huawei Cloud wants to differentiate showed up in the four industry-specific "AI Dream Factories" it launched: smart healthcare, embodied intelligence, smart manufacturing, and scientific computing.
On healthcare: RuiPath, a pathology model built with Shanghai's Ruijin Hospital, is already deployed in more than 20 hospitals at the Grade A, municipal, and county levels, from Handan to Qianxinan. Diagnostic expertise that was long locked inside the heads of senior pathologists is being distributed to county hospitals as a cloud service for the first time.
On embodied intelligence: CloudRobo, billed as the first full-stack embodied intelligence development platform, aims to serve the toolchain needs of over 300 Chinese embodied AI startups.
Zhou's industry thesis was explicit. Healthcare and finance are China's most digitized, most data-rich industries. If AI cannot work there, he argued, it is unlikely to work anywhere else. And in those industries, the right metric is not daily active users or Token volume. It is the financial risk prevention rate, the improvement in credit efficiency, the probability that a patient in a remote area gets an accurate diagnosis.
The argument made here before, that China's AI competition is less an OpenAI-style race than a contest among embedded business systems, applies especially clearly to what Huawei Cloud is attempting. The company is using domestic compute plus open-source infrastructure to cover government and enterprise, hybrid cloud plus confidential computing to serve regulated clients, and Agentic Infra plus industry verticals to shift the fight from selling Tokens to selling productivity.
That path is slower than chasing MaaS revenue growth and harder to dress up as a quarter-over-quarter number. But it sidesteps the most intense price competition in AI cloud today and places a different wager: that when agents genuinely enter industry, the infrastructure layer that underpins them will matter more than who sold the cheapest Token.
Zhou put it plainly: "I have no way to build a silicon black soil out of parts from every country." The question Huawei Cloud is now posing is whether a domestic compute system, built under constraint, can meet what Chinese industry will actually need from AI.
---