A dozen finance-bros and consultants asked me how I keep up to date with AI.
We may be reaching the peak of inflated expectations before the trough of disillusionment.
But I still think it's a great time to capture that initial excitement and give enough of a jumpstart that one is motivated to cross the trough.
LEVERAGE CURIOSITY
Learning is a form of leverage for making better decisions in the future, but the long-term motivator is learning for the sake of learning.
So the first question to ask is whether you are truly curious about AI. I'm interested both in AI research for its own sake and in its business implications.
The second question to ask is how much time is reasonable to invest in Learning vs Doing (with the former you learn quickly but not deeply; with the latter you learn more slowly but deeply, since you learn from first-hand experience and mistakes).
FOUNDATIONS HISTORY
If you are indeed curious enough to dedicate several hours a week to a new topic, then you should start by learning the fundamentals.
A friendly way to start is by watching the @3blue1brown series on Neural Networks. If you don't have a Linear Algebra background, it is worth first watching the 3b1b series on Linear Algebra.
If after that you are even more motivated to learn, you should read about the history of Neural Networks (NNs):
1970s: NNs dismissal and AI Winter
1998: CNNs (Computer Vision architecture)
2012: AlexNet (scaling CNNs produced great results)
2017: Transformers (Language Model architecture)
2020: GPT-3 (scaling Transformers produced great results)
2023: Distillation (scaling training data with GPT-4 outputs produced great results)
2024: Reasoning (scaling inference time token generation produced great results)
2025: Reinforcement Learning (scaling training in verifiable domains will produce great results)
You can see the pattern here: the Bitter Lesson is that simple architectures that scale well outperform complex ones that don't scale as well.
So the primary vector for progress is increasing computational and energy capacity to scale models even further.
Which means that Moore's law and the chip manufacturing value chain (NVIDIA -> TSMC -> ASML) play a crucial role.
But one should also beware of the limitations of the current Transformer architecture and prepare for eventually hitting a wall.
So research cannot put all its eggs in one basket, and serious effort is going into alternative architectures and approaches.
The reason research in AI moves at such a fast pace is a property of Computer Science that distinguishes it from other sciences.
New developments are trivially reproducible when the software is open-source.
This property allows information to spread rapidly, with much less need for peer review and journal publications.
Lately this property no longer fully applies, since the major AI labs don't do a lot of open research and training state-of-the-art (SOTA) models costs millions or billions in compute.
SOTA
Now, with greater contextual awareness, it's worth moving from general news outlets to more in-depth coverage of AI developments.
The quickest update is the @Smol_AI newsletter, a less-than-one-minute read a day, with updates from the major AI labs.
To hear more from researchers, follow the @dwarkeshpodcast.
To dive deep into SOTA research, you need to actually take the time to read the papers on arXiv.
Maybe read some of the classics while you learn about the history of NNs, then do a random walk through the main conferences (NeurIPS, EMNLP), and finally follow your curiosity through the tree of citations.
BUSINESS IMPLICATIONS
The ChatGPT moment was about productizing a technology so general that OpenAI didn't know how to productize it at first, so they launched an API to let others figure out the monetization.
OpenAI only became the accidental consumer AI company when they trained GPT-3 on human feedback and launched GPT-3.5 (in the user-friendly interface of ChatGPT).
To better understand the business dynamics involved, start by learning how the internet disrupted consumer markets.
Aggregation Theory explains that profits accrue to whoever owns the relationship with users, commoditizing the rest of the value chain.
Subscribing to @stratechery will then give you a view of the tech news through this Aggregation Theory lens.
Then take into account that, in the age of AI, marginal costs are not zero. To stay on top of the infrastructure implications, read some of the @SemiAnalysis_ articles.
To learn about the history of great businesses and entrepreneurs you should listen to @AcquiredFM and @FoundersPodcast.
To stay on top of internet culture, you are in the right place here on X: follow the @tbpn show and look at some of the people I follow.
All these suggestions form a highly curated list, but still an overload of content.
So keep in mind the trade-off of Learning vs Doing and invest time learning how to do.
Learn to code and to sell, in order to build.
The best way to predict your future is to create it ~ attributed to Abraham Lincoln