A Stanford PhD student spent five years on a niche corner of machine learning called state space models that almost no one in the AI industry took seriously. He kept publishing papers about it.
Then in December 2023 he and a friend from his Stanford lab released a paper called Mamba that became the first credible alternative to the Transformer architecture since its introduction in 2017.
Around the same time he became a Carnegie Mellon professor, co-founded a voice AI startup with his lab-mates, and helped ship one of the fastest speech models on the planet.
His name is Albert Gu.
Here is the story of the person who has been quietly and patiently challenging the Transformer monopoly inside AI for years.
Albert did his PhD at Stanford under Christopher Ré, one of the most consequential ML advisors of the past decade and the same advisor who trained Tri Dao, Beidi Chen, and Karan Goel. The Ré lab at Stanford was already an unusual place. While the rest of the AI world chased larger and larger Transformer models, this lab studied the underlying mathematics of sequence modeling, including state space models, structured matrices, and the question of how to do attention without quadratic cost.
In 2020 Albert published HiPPO, a paper introducing a mathematical framework for compressing long sequences into structured memory using optimal polynomial projections. The work was rooted in classical control theory and continuous-time mathematics, not in the deep learning trends of the moment. Most attendees at NeurIPS 2020 did not pay close attention.
He kept going. In 2021 he and his collaborators published the original S4 paper introducing Structured State Space Sequence models. The architecture handled extremely long sequences efficiently, achieved state of the art on the Long Range Arena benchmark, and outperformed Transformers on certain audio and time series tasks. The paper still did not break into the mainstream AI conversation.
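For readers who want to see what a state space model computes, here is a minimal sketch of a discretized linear state space layer. It is not S4 itself: it skips the HiPPO initialization and the structured parameterization that make S4 efficient, and the names ssm_scan, ssm_conv, A_bar, B_bar, and C are illustrative choices, not taken from the paper's code. It only shows the property these models rely on, namely that the same layer can be run step by step as a recurrence or all at once as a convolution.

```python
import numpy as np

def ssm_scan(A_bar, B_bar, C, u):
    """Recurrent view: x_k = A_bar x_{k-1} + B_bar u_k, y_k = C x_k."""
    x = np.zeros(A_bar.shape[0])
    ys = []
    for u_k in u:
        x = A_bar @ x + B_bar * u_k
        ys.append(C @ x)
    return np.array(ys)

def ssm_conv(A_bar, B_bar, C, u):
    """Convolutional view: y = u * K with kernel K_j = C A_bar^j B_bar."""
    length = len(u)
    K = np.empty(length)
    v = B_bar.copy()
    for j in range(length):
        K[j] = C @ v      # C A_bar^j B_bar
        v = A_bar @ v
    # Causal convolution: keep only the first `length` outputs.
    return np.convolve(u, K)[:length]

# Toy parameters (assumed for illustration; a stable diagonal state matrix).
rng = np.random.default_rng(0)
A_bar = np.diag([0.9, 0.5, -0.3])
B_bar = rng.standard_normal(3)
C = rng.standard_normal(3)
u = rng.standard_normal(32)

y_rec = ssm_scan(A_bar, B_bar, C, u)
y_conv = ssm_conv(A_bar, B_bar, C, u)
```

The recurrent form keeps a fixed-size state, which is what makes step-by-step inference cheap, while the convolutional form allows the whole sequence to be processed in parallel during training.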
He finished his PhD at Stanford in 2022 and spent time as a researcher at DeepMind. He then joined Carnegie Mellon as an Assistant Professor in the Machine Learning Department.
Then on December 1, 2023 he and Tri Dao published the Mamba paper.
Mamba was the breakthrough that made state space models competitive with Transformers on language modeling, the area where every previous alternative had failed. The key idea was a selectivity mechanism that makes the model's state space parameters depend on the input, so it can choose which inputs to keep and which to ignore, combined with a hardware-aware implementation in the spirit of FlashAttention. Compute scaled linearly with sequence length instead of quadratically. The paper reported about 5 times higher inference throughput than Transformers of similar size. The architecture worked on language, audio, and genomic (DNA) data.
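The selectivity idea can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the Mamba implementation: the real model uses a fused, hardware-aware parallel scan, plus gating and convolution layers that are omitted here. The function name selective_scan and the projection matrices W_B, W_C, and W_dt are hypothetical names chosen for this example.

```python
import numpy as np

def softplus(z):
    # Smooth map to positive values; keeps the step size dt > 0.
    return np.logaddexp(0.0, z)

def selective_scan(x, A, W_B, W_C, W_dt):
    """Sequential selective state space scan (illustrative sketch).

    x:    (seq_len, d) input sequence
    A:    (d, n) negative diagonal state matrix, one row per channel
    W_B:  (d, n) projection producing the input-dependent B_t
    W_C:  (d, n) projection producing the input-dependent C_t
    W_dt: (d, d) projection producing the input-dependent step size
    """
    seq_len, d = x.shape
    n = A.shape[1]
    h = np.zeros((d, n))
    y = np.empty((seq_len, d))
    for t in range(seq_len):
        B_t = x[t] @ W_B                     # (n,) depends on the current input
        C_t = x[t] @ W_C                     # (n,) depends on the current input
        dt = softplus(x[t] @ W_dt)           # (d,) depends on the current input
        A_bar = np.exp(dt[:, None] * A)      # (d, n) decay in (0, 1)
        B_bar = dt[:, None] * B_t[None, :]   # (d, n) simplified discretization
        h = A_bar * h + B_bar * x[t][:, None]
        y[t] = h @ C_t
    return y

# Toy shapes and random parameters, assumed for illustration only.
rng = np.random.default_rng(0)
seq_len, d, n = 12, 4, 8
x = rng.standard_normal((seq_len, d))
A = -np.exp(rng.standard_normal((d, n)))  # negative entries keep the state stable
W_B = 0.1 * rng.standard_normal((d, n))
W_C = 0.1 * rng.standard_normal((d, n))
W_dt = 0.1 * rng.standard_normal((d, d))
y = selective_scan(x, A, W_B, W_C, W_dt)
```

The loop visits each position once and carries a fixed-size state h, so the work grows linearly with sequence length. A small step size dt leaves the state almost unchanged and lets the model effectively skip an input, while a large one overwrites more of the state with the current input.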
The paper went viral inside the research community. Almost overnight, state space models went from a fringe research area to one of the most discussed architectures of 2024. Mistral AI released Codestral Mamba. Other labs released hybrid Mamba-Transformer models. The Mamba GitHub repository became a standard reference.
Albert and Tri followed up in 2024 with Mamba-2, published at ICML, which formalized the connection between Transformers and state space models through what they called Structured State Space Duality. In 2026 they released Mamba-3 with a more expressive recurrence, complex-valued state updates, and a multi-input multi-output formulation that improved accuracy by 1.8 points at the 1.5 billion parameter scale.
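The duality can be illustrated on a toy case. The sketch below assumes a scalar, input-dependent decay a_t and a single input channel, and it is only a small numerical check, not the Mamba-2 algorithm. The function names ssm_recurrent and ssm_matrix are made up for this example. It shows that the state space recurrence gives the same output as multiplying the input by a lower-triangular, attention-like matrix.

```python
import numpy as np

def ssm_recurrent(a, B, C, x):
    """h_t = a_t h_{t-1} + B_t x_t, y_t = C_t . h_t (scalar input per step)."""
    T, n = B.shape
    h = np.zeros(n)
    y = np.empty(T)
    for t in range(T):
        h = a[t] * h + B[t] * x[t]
        y[t] = C[t] @ h
    return y

def ssm_matrix(a, B, C, x):
    """y = (L * (C B^T)) x, where L[i, j] = a[j+1] * ... * a[i] for i >= j."""
    cum = np.cumsum(np.log(a))  # assumes every a_t > 0
    L = np.tril(np.exp(cum[:, None] - cum[None, :]))
    # C @ B.T plays the role of query-key scores; L is a decaying causal mask.
    return (L * (C @ B.T)) @ x

# Toy data, assumed for illustration only.
rng = np.random.default_rng(0)
T, n = 16, 4
a = rng.uniform(0.5, 1.0, size=T)
B = rng.standard_normal((T, n))
C = rng.standard_normal((T, n))
x = rng.standard_normal(T)

y_rec = ssm_recurrent(a, B, C, x)
y_mat = ssm_matrix(a, B, C, x)
```

The recurrent form costs time linear in T, while the matrix form is quadratic but maps onto the dense matrix multiplications that GPUs are fast at.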
In 2023 Albert co-founded Cartesia with his Stanford lab-mates Karan Goel, Arjun Desai, and Brandon Yang, and with their advisor Christopher Ré. The company commercializes state space models for real-time AI applications, particularly voice. Their Sonic model became known for one of the fastest time-to-first-audio numbers in the industry, around 40 milliseconds in voice agent use. Albert serves as Chief Scientist and is now based in San Francisco while keeping his CMU affiliation.
A researcher who spent five years on a fringe mathematical idea just became the architect of the first serious threat to the Transformer's dominance.
He did it by being patient enough to keep working on something the rest of the field had ignored.