A little over a decade ago, I visited Facebook's Prineville datacenter. Facebook had brought together a small group of systems and networking researchers to discuss the future of datacenter infrastructure.
What struck me then was how much of the physical design was organized around energy efficiency. The metric everyone cared about was PUE, or Power Usage Effectiveness: total facility power divided by IT equipment power. Older industry averages could be around 2.0 or higher, meaning that a large amount of power was spent on cooling, power distribution, and other facility overhead. Prineville represented a very different design point, with a PUE around 1.07 annualized at full load.
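To make the PUE comparison concrete, here is a minimal sketch of the overhead each figure implies. The 1 MW IT load and the function name are illustrative assumptions, not numbers from Prineville.

```python
def facility_overhead_kw(it_load_kw: float, pue: float) -> float:
    """Non-IT power (cooling, power distribution, etc.) implied by a PUE.

    PUE = total facility power / IT equipment power,
    so overhead = total - IT = IT * (PUE - 1).
    """
    return it_load_kw * (pue - 1.0)


# Hypothetical 1 MW of IT load, chosen only to make the ratio easy to read.
it_load_kw = 1000.0

for pue in (2.0, 1.07):
    overhead = facility_overhead_kw(it_load_kw, pue)
    print(f"PUE {pue}: about {overhead:.0f} kW of overhead per {it_load_kw:.0f} kW of IT load")
```

Under those assumptions, the gap between the two PUE values is the gap between spending as much on overhead as on compute and spending a small fraction of it.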
The architecture made that plausible. It was fundamentally air-cooled, but carefully engineered: large slow fans to pull in outside air, filtration systems designed to handle air particulate (even from forest fires tens of miles away), evaporative "swamp" coolers on the second level, hot aisle containment, and large ductless supply paths where cool air naturally sank from the upper mechanical level into the data hall.
That was the design point: move outside air efficiently, while minimizing the amount of active mechanical work required to cool the data hall.
AI infrastructure changes the thermal design basis.
For practical purposes, every watt of electrical power delivered to IT equipment must be removed as heat. Facebook later described its overall datacenter design point as about 5.5 kW per rack, with compute-heavy web-server racks around 10 to 12 kW. Even if we take a much denser 20 kW CPU-era rack as the comparison point, that rack generates about 68,000 BTU/hour of heat (1 W is about 3.412 BTU/hour).
The airflow math around heat transfer is straightforward. If we allow a 20°F temperature rise across the rack, the 20 kW CPU-era rack needs roughly 3,000 CFM of airflow. For comparison, a typical central AC system in a 2,500 square foot home might move roughly 1,500 to 2,000 CFM of air. Even a 20 kW rack is effectively moving the airflow of one or two homes through a single cabinet.
A current rack of NVIDIA Blackwell GPUs, such as the GB300 NVL72, is roughly 140 kW. That is about 478,000 BTU/hour from one rack. If air had to remove the full heat load at the same 20°F rise, it would require more than 22,000 CFM, akin to pushing the airflow of a dozen homes through a single rack.
And that is today. NVIDIA's future Rubin Ultra Kyber rack has been reported around 600 kW. If air had to remove that full heat load, the required airflow would be close to 95,000 CFM, roughly the airflow of 50 homes' central AC systems.
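The numbers above can be reproduced with the usual sensible-heat rule of thumb for air, CFM = BTU/hour / (1.08 x temperature rise in °F). The sketch below assumes standard air near sea level (which is where the 1.08 factor comes from), the same 20°F rise for every rack, and a midpoint of 1,750 CFM for a home central AC system; the function and constant names are mine.

```python
BTU_PER_HR_PER_WATT = 3.412    # 1 W is about 3.412 BTU/hour
SENSIBLE_HEAT_FACTOR = 1.08    # BTU/hour per (CFM * degF); assumes standard air near sea level
HOME_AC_CFM = 1750.0           # assumed midpoint of the 1,500 to 2,000 CFM range


def heat_btu_per_hr(rack_kw: float) -> float:
    """Heat output of a rack, treating all electrical power as heat."""
    return rack_kw * 1000.0 * BTU_PER_HR_PER_WATT


def required_cfm(rack_kw: float, delta_t_f: float = 20.0) -> float:
    """Airflow needed to carry the full heat load at a given air temperature rise."""
    return heat_btu_per_hr(rack_kw) / (SENSIBLE_HEAT_FACTOR * delta_t_f)


racks = [
    ("20 kW CPU-era rack", 20.0),
    ("GB300 NVL72 (roughly 140 kW)", 140.0),
    ("Rubin Ultra Kyber (reported around 600 kW)", 600.0),
]

for name, kw in racks:
    cfm = required_cfm(kw)
    print(f"{name}: {heat_btu_per_hr(kw):,.0f} BTU/hour, "
          f"{cfm:,.0f} CFM, about {cfm / HOME_AC_CFM:.1f} homes of airflow")
```

Airflow scales linearly with rack power at a fixed temperature rise, so a rack drawing 30 times the power needs 30 times the air unless the allowed rise changes.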
More importantly, the challenge is not simply moving enough air through the rack. Modern AI accelerators concentrate enormous amounts of heat into a small physical area, so air alone becomes increasingly impractical as the primary path for removing heat from the chip package.
In other words, heat transfer at this scale becomes a different engineering problem. Once a rack moves from tens of kilowatts to 100 kilowatts and beyond, the cooling problem shifts from "move enough air through the room" to "capture heat at the components and transport it through a liquid cooling loop."
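A rough way to see why liquid wins is to run the same heat balance for a coolant: heat removed = mass flow x specific heat x temperature rise. The sketch below assumes plain water, a 10 K rise across the rack, and that the loop carries the full heat load; all three are my simplifications, since real loops often use water-glycol mixtures and leave some heat to air.

```python
WATER_CP_J_PER_KG_K = 4186.0      # specific heat of water, assumed coolant
WATER_DENSITY_KG_PER_L = 0.997    # near room temperature
LITERS_PER_CUBIC_FOOT = 28.317


def coolant_liters_per_min(rack_kw: float, delta_t_k: float = 10.0) -> float:
    """Water flow needed to carry the full rack heat load at a given coolant rise."""
    mass_flow_kg_per_s = rack_kw * 1000.0 / (WATER_CP_J_PER_KG_K * delta_t_k)
    return mass_flow_kg_per_s / WATER_DENSITY_KG_PER_L * 60.0


def air_liters_per_min(cfm: float) -> float:
    """Convert an airflow in CFM to liters per minute for a like-for-like comparison."""
    return cfm * LITERS_PER_CUBIC_FOOT


rack_kw = 140.0        # the GB300 NVL72 figure used above
air_cfm = 22000.0      # the air requirement quoted above for that rack

water_lpm = coolant_liters_per_min(rack_kw)
air_lpm = air_liters_per_min(air_cfm)
print(f"Water: about {water_lpm:.0f} L/min; air: about {air_lpm:,.0f} L/min "
      f"(roughly {air_lpm / water_lpm:,.0f}x the volume)")
```

Under these assumptions the coolant volume is thousands of times smaller than the air volume for the same heat, which is what makes a piped loop to the components practical where ducted air is not.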
Liquid cooling also shortens the window operators have to respond when something fails. A loss of coolant flow is no longer simply a maintenance issue. It can immediately affect compute availability. Operators now need visibility into coolant flow, pressure, inlet and outlet temperatures, pump state, cooling-system health, leak detection, rack thermal behavior, and workload placement, along with the history of each.
There is a joke that AI factories convert energy into tokens. But like advanced manufacturing facilities, they depend on complex physical infrastructure, continuous monitoring, and operational control systems.
Increasingly, they also depend on the data systems needed to understand and optimize them.
A decade ago, most of us thought of the datacenter as the substrate underneath the distributed system. Increasingly, the datacenter itself is becoming part of the distributed system.
More to write.