You can't have portability AND peak performance. That's the assumption people make.
They're right if you rely on portability by abstraction at runtime. But that assumes a cross-vendor system must be a "lowest-common-denominator" system.
Generality is free when your abstraction sits at the right level. Dive into Part 4 of my series, "Why hardware-agnostic isn't the same as lowest-common-denominator": scale-lang.com/posts/2026-06…
Have I not managed to convince you yet?
Come hang out in our Discord to keep the debate going. It's where we talk compilers, GPU programming, and other nerdy things with our dev team: scale-lang.com/s/discord?utm…
𝗦𝗽𝗲𝗰𝘁𝗿𝗮𝗹 𝗖𝗼𝗺𝗽𝘂𝘁𝗲 𝗶𝘀 𝗻𝗼𝘄 𝗽𝗮𝗿𝘁 𝗼𝗳 𝘁𝗵𝗲 𝗡𝗩𝗜𝗗𝗜𝗔 𝗜𝗻𝗰𝗲𝗽𝘁𝗶𝗼𝗻 𝗽𝗿𝗼𝗴𝗿𝗮𝗺. 🟩
Inception is @nvidia's program for AI startups - a membership that gives access to technical resources, preferred pricing on NVIDIA hardware and software, and exposure to a global network of investors and partners.
CUDA is the de-facto standard for AI developers, and we’re honored to play our part in growing the ecosystem.
From CUDA portability to real-world performance on @AMD GPUs, @SpectralMichael, CEO of @SpectralCom shares why open, portable AI infrastructure matters more than ever.
Watch the short interview from Beyond Summit 2026 on our YouTube channel: youtube.com/watch?v=cwmiGYas…
Everyone says CUDA can't target TPUs.
What they mean is nobody has written the compiler that raises CUDA code to something a systolic backend can consume.
Those are very different sentences.
Full post — Part 3 of why @SpectralCom exists: tinyurl.com/mtnxcjsw
𝗨𝗽 𝘁𝗼 𝟮𝟱.𝟳× 𝗳𝗮𝘀𝘁𝗲𝗿. Unmodified CUDA. AMD silicon.
Thanks to the @tensorwave team for benchmarking SCALE on MI355X and publishing the numbers.
Port to AMD used to mean a rewrite. Now it means a recompile.
Spectral Compute (@SpectralCom) used TensorWave’s AMD-native infrastructure to benchmark CUDA portability and performance on @AMD Instinct™ MI355X GPUs.
See how they did it - tensorwave.com/blog/spectral…
Obvious take: if agents can write native code for any GPU, who needs a portable CUDA toolchain? Just point the model at each GPU. I think that's exactly backwards — and wrote up why. tinyurl.com/5ac6y5jt
CUDA isn't a standard. No consortium, no ratified spec. It's whatever NVIDIA ships next. That's supposed to be a weakness. It's why CUDA wins. The cross-vendor layer can't be a consortium either. It has to be a company. That's what we're building at @SpectralCom. SCALE compiles unmodified CUDA on AMD and NVIDIA today. No PDF to argue about — a toolchain. Full piece: [bit.ly/3PBsYsU](bit.ly/3PBsYsU)
AI compute is bottlenecked. When the hardware mix inevitably diversifies, how will production clusters actually operate?
Our CEO @SpectralMichael is joining @AkashBajwa96's Gradient Descending roundtable this Wednesday to dig into the evolving hardware landscape.
What does the AI cluster of the future look like?
Hint: it won't just be a wall of B200s.
Our CEO Michael Søndergaard is joining @AkashBajwa96 for the next Gradient Descending roundtable to discuss ASICs, non-NVIDIA chips, and breaking the software lock-in.
🗓️ Feb 25 | 8:30 AM GMT 🎟️ Request invite: luma.com/f7s69jj2#AI#HPC#DeepTech#AMD#NVIDIA#ASIC