Another surprise...
@Apple announces the new M5 chip and launches preorders TODAY for 14-in MacBook Pro, iPad Pro, and even Vision Pro devices using it. Prices on the 14-in MBP start at $1599 with just 16GB of memory, which might be an issue for even moderately sized AI models, as we have seen. A 32GB/1TB model starts at $2199 (!!).
Some early thoughts:
🔍 Apple’s M5: A Subtle Shift Toward GPU-Driven AI
Apple’s new M5 chip puts almost all of its emphasis on AI performance, but not where some expected. Instead of promoting a dramatically more capable “Neural Engine,” Apple has re-architected its GPU around AI compute, adding what it calls a Neural Accelerator inside each GPU core.
⚙️ The Claims, and the Questions
Apple says M5 delivers up to 4x the peak GPU compute of M4 for AI workloads. The catch? They don’t specify how that’s measured. There’s no mention of precision (FP16, INT8, FP8?), workload type, or test methodology, just “select industry-standard benchmarks.” In other words, the gain could be part architectural, part software, or even the result of changing measurement conventions.
CPU performance sees a modest lift: 10 cores (4P + 6E) with up to 15% better multithreaded performance. The Neural Engine, meanwhile, is described only as “faster and more efficient.” Apple clearly wants the story centered on the GPU, not the CPU or NPU.
Memory bandwidth climbs to 153 GB/s, about 30% higher than M4. That’s meaningful for AI throughput, though still below the 228 GB/s figure Qualcomm quotes for its Snapdragon X2 Elite Extreme and on par with the recently announced Intel Panther Lake platform.
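To put those bandwidth numbers in context, here is a rough back-of-envelope sketch in Python. It assumes token generation is memory-bound, meaning every model weight is read once per generated token, so bandwidth divided by model size gives an upper bound on tokens per second. The 120 GB/s M4 figure is Apple’s published base M4 number rather than something quoted above, and the 8B-parameter, 4-bit model is a hypothetical example, not a benchmark.

```python
def max_decode_tokens_per_s(bandwidth_gb_s: float,
                            params_billion: float,
                            bytes_per_param: float) -> float:
    """Upper bound on tokens/s if each token requires reading all weights once."""
    # 1e9 params * bytes per param / 1e9 bytes per GB = model size in GB
    model_gb = params_billion * bytes_per_param
    return bandwidth_gb_s / model_gb


# Bandwidth figures in GB/s. M5 and Snapdragon X2 Elite Extreme come from the post;
# the M4 value is an assumption taken from Apple's published base M4 spec.
BANDWIDTH_GB_S = {
    "M4": 120.0,
    "M5": 153.0,
    "Snapdragon X2 Elite Extreme": 228.0,
}

# Hypothetical model: 8B parameters quantized to 4 bits (0.5 bytes per parameter).
PARAMS_BILLION = 8.0
BYTES_PER_PARAM = 0.5

uplift = BANDWIDTH_GB_S["M5"] / BANDWIDTH_GB_S["M4"] - 1.0
print(f"M5 bandwidth uplift over M4: {uplift:.1%}")

for chip, bw in BANDWIDTH_GB_S.items():
    ceiling = max_decode_tokens_per_s(bw, PARAMS_BILLION, BYTES_PER_PARAM)
    print(f"{chip}: at most {ceiling:.2f} tokens/s")
```

Under these assumptions the ceiling moves from about 30 to about 38 tokens/s going from M4 to M5, which tracks the bandwidth gain and not the headline compute number. Real throughput will land below the ceiling once KV-cache reads and other overheads are counted.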
🧩 Reading Between the Lines
The structural change, embedding AI accelerators in each GPU core and making them the focus, suggests Apple’s view is converging with what NVIDIA and Intel are already marketing. Modern GPUs are becoming AI processors first and graphics engines second.
But the vagueness of Apple’s performance claims raises questions. “4x peak compute” doesn’t always translate to 4x real throughput in mixed workloads. Without knowing the precision or thermal limits, it’s impossible to gauge how much of this performance will show up in production AI models.
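One way to see why peak compute and real throughput diverge is a simple roofline model, sketched below in Python: attainable performance is the lower of peak compute and bandwidth multiplied by arithmetic intensity (FLOPs per byte moved). The peak-compute values here are hypothetical placeholders chosen only to represent a 4x ratio, since Apple has not published absolute figures, and the two intensity values are illustrative, not measured.

```python
def attainable_gflops(peak_gflops: float, bandwidth_gb_s: float,
                      flops_per_byte: float) -> float:
    """Roofline model: performance is capped by compute or by memory traffic."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)


# Hypothetical peak compute (GFLOP/s); only the 4x ratio reflects Apple's claim.
OLD_PEAK, NEW_PEAK = 10_000.0, 40_000.0
# Bandwidth in GB/s: 153 is from the post, 120 is an assumed M4 baseline.
OLD_BW, NEW_BW = 120.0, 153.0


def speedup(flops_per_byte: float) -> float:
    """Ratio of new to old attainable performance at a given arithmetic intensity."""
    old = attainable_gflops(OLD_PEAK, OLD_BW, flops_per_byte)
    new = attainable_gflops(NEW_PEAK, NEW_BW, flops_per_byte)
    return new / old


# Low intensity (illustrative of single-stream token generation): memory-bound.
print(f"memory-bound speedup:  {speedup(2.0):.3f}x")
# High intensity (illustrative of large batched matrix multiplies): compute-bound.
print(f"compute-bound speedup: {speedup(1000.0):.3f}x")
```

With these placeholder numbers, the compute-bound case gets the full 4x while the memory-bound case gets only the bandwidth ratio, about 1.28x. Which side of that line a given model sits on is exactly what Apple’s claim leaves unspecified.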
Still, this is a meaningful architectural signal. Apple seems to agree with Intel’s Panther Lake approach: prioritize GPU performance for AI inference, rather than over-invest in a standalone NPU that’s limited to fixed-function workloads.
💡 Implications for Windows and the AI PC Ecosystem
This announcement puts subtle pressure on the Copilot PC narrative.
@Microsoft and @Qualcomm have both been framing the NPU as the center of local AI. Apple’s move suggests that, at least in the Apple ecosystem, the NPU may not remain the hero of AI compute, especially as model sizes grow and developers demand flexibility.
If Apple’s GPU Neural Accelerator design delivers real performance and efficiency gains, it could challenge the current Copilot PC stance, where NPU performance (in TOPS) is used as the primary marketing metric. The shift may also highlight a potential gap for Windows OEMs.
For Microsoft: The Copilot PC model assumes NPU compute as the main accelerator class. If GPU-based AI becomes more relevant, the Windows runtime and APIs (like WindowsML or DirectML) will need to keep evolving quickly to distribute AI workloads more intelligently.
For OEMs (Dell, HP, Lenovo, Surface, etc.): Systems designed for AI around lower-power NPUs might need to pivot toward a GPU story to keep pace if Apple’s approach proves more scalable. Future AI PCs may need to lean harder into integrated GPU performance or hybrid compute models rather than relying on isolated neural blocks. Still, Apple has been the laggard in this space, so it may not be the one setting the standard this time around.
For Qualcomm, @AMD and @intel: they already combine CPU, GPU, and NPU acceleration, but Apple’s design could still force a rethink on how GPU cores contribute to AI throughput. Intel’s Panther Lake appears to be following a similar path (based on the messaging from our time with the company earlier this month), but the execution speed will matter.
The opportunity for Windows vendors lies in software orchestration. If Microsoft can let the OS fluidly allocate AI workloads between CPU, GPU, and NPU, based on efficiency, latency, and precision, it could regain narrative control. But for now, Apple’s pitch reframes the discussion: it’s not just about how many TOPS your NPU has, it’s about where the useful AI compute actually lives.
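As a purely illustrative sketch of what that orchestration could look like, the Python below picks an engine for a workload by filtering on supported precision and a latency budget, then choosing the lowest-energy candidate. Everything in it is hypothetical: the `Engine` and `Workload` types, the `pick_engine` function, and all latency and energy numbers are made up for illustration and do not describe Windows ML, DirectML, or any shipping scheduler.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Engine:
    name: str
    precisions: frozenset   # numeric formats this engine can run
    latency_ms: float       # hypothetical latency for one inference
    energy_mj: float        # hypothetical energy for one inference


@dataclass(frozen=True)
class Workload:
    precision: str
    max_latency_ms: float


def pick_engine(workload: Workload, engines: Sequence[Engine]) -> Optional[Engine]:
    """Return the lowest-energy engine that meets precision and latency needs."""
    candidates = [
        e for e in engines
        if workload.precision in e.precisions
        and e.latency_ms <= workload.max_latency_ms
    ]
    if not candidates:
        return None  # nothing satisfies the constraints
    return min(candidates, key=lambda e: e.energy_mj)


# Made-up characteristics, chosen only to show the trade-off.
ENGINES = [
    Engine("CPU", frozenset({"fp32", "fp16", "int8"}), latency_ms=80.0, energy_mj=900.0),
    Engine("GPU", frozenset({"fp16", "int8"}), latency_ms=12.0, energy_mj=400.0),
    Engine("NPU", frozenset({"int8"}), latency_ms=20.0, energy_mj=60.0),
]

for wl in (Workload("int8", 50.0), Workload("fp16", 50.0), Workload("fp32", 50.0)):
    chosen = pick_engine(wl, ENGINES)
    print(wl.precision, "->", chosen.name if chosen else "no engine fits")
```

In this toy setup the quantized workload lands on the NPU for efficiency, the FP16 workload goes to the GPU because the NPU cannot run it, and the FP32 workload finds no engine inside the latency budget. The point is that the NPU wins only when the workload fits its constraints.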
🎯 Final Take
Apple’s M5 continues to challenge the Copilot PC formula. By pushing AI acceleration deeper into the GPU, Apple is implying the NPU may not be the long-term answer.
The “4x” GPU compute claim sounds huge, but without context, it’s hard to know how much of that will show up in real-world AI workloads.
For Windows and its partners, this is both a risk and an opportunity: a reminder that AI PCs will ultimately be defined not by one specialized block, but by the system’s ability to balance compute, power, and memory across all engines.