I have a lot of issues with the term "AGI". I would redefine it.
People say that we're heading towards artificial general intelligence (AGI), but by that most people actually mean machine human-level intelligence (MHI): a machine that performs human digital and/or physical tasks as well as humans do. And by artificial superintelligence (ASI), people mean machine superhuman intelligence (MSHI), one that is even better than humans at human tasks.
I think a lot of research goes towards machine narrow intelligences (MNI), which are very specialized and often superhuman at specific tasks, such as playing games (AlphaZero) or protein folding (AlphaFold). A lot of research also goes towards machine general intelligence (MGI), which will be much more general than human intelligence (HI), because humans are IMO very specialized biological systems: specialized to our evolutionary niche, our everyday tasks, and our mathematical abilities. Other organisms are specialized differently, even though we still share a lot. Plus, biological and machine intelligence overlap only partially.
And I wonder whether the emerging reasoning systems like o3 are actually becoming more similar to humans or more alien. They might adapt to novelty better and be more general than previous AI systems, which might bring them closer to humans, but in slightly different ways than humans. They may be able to run a self-correcting chain-of-thought search endlessly, which is better for a lot of tasks, and I think something like this is also a big part of human cognition, but humans still work differently.
I think that the generality of an intelligent system is a spectrum: each system's capabilities generalize to a different degree over different families of tasks, which we can see in all the current machine and biological intelligences. That's why "AGI" feels much more continuous than discrete to me, and I think it also matters which families of tasks you generalize over.
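To make the "spectrum" point concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the system names, the task families, and the scores are made up and do not measure any real system. It just shows that when each system has a profile of scores over task families, two systems are often incomparable, so there is no single line where "general" starts.

```python
# Hypothetical capability profiles: made-up scores in [0, 1] per task family.
# These numbers are illustrative only and are not measurements of anything.
PROFILES = {
    "system_a": {"board_games": 0.99, "language": 0.0, "physical": 0.0},
    "system_b": {"board_games": 0.4, "language": 0.8, "physical": 0.0},
    "system_c": {"board_games": 0.6, "language": 0.9, "physical": 0.9},
}


def dominates(a, b):
    """True if profile a is at least as good as b on every task family
    and strictly better on at least one."""
    families = a.keys() | b.keys()
    at_least = all(a.get(f, 0.0) >= b.get(f, 0.0) for f in families)
    strictly = any(a.get(f, 0.0) > b.get(f, 0.0) for f in families)
    return at_least and strictly


def breadth(profile, threshold=0.5):
    """Count the task families where the system clears a competence threshold."""
    return sum(1 for score in profile.values() if score >= threshold)


if __name__ == "__main__":
    a, c = PROFILES["system_a"], PROFILES["system_c"]
    # Neither dominates the other: a wins on board games, c wins elsewhere.
    print(dominates(a, c), dominates(c, a))
    print({name: breadth(p) for name, p in PROFILES.items()})
```

The breadth count depends on the chosen threshold and on which task families you list at all, which is the same point as above: how general a system looks depends on what you measure it over.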
I think Chollet's definition of intelligence as the efficiency with which you operationalize past information in order to deal with the future, which can be interpreted as a conversion ratio, is really good. So is his ARC-AGI benchmark, which tries to test for some degree of generality: the ability to abstract over and recombine some atomic core knowledge priors, so that naive pattern memorization and retrieval don't succeed.
And I really wonder if scoring well on ARC-AGI actually generalizes outside the ARC domain to all sorts of tasks: ones where humans are superior, ones where humans are terrible but machines are superior, ones where other biological systems are superior, and ones where everyone is terrible for now. I would suspect so, but maybe not? In software engineering, o1 seems to be better just sometimes? What's happening there? I want more benchmarks!
Pre-o1 LLMs are technically super surface-level knowledge generalists: they lack technical depth, but they have a bigger overview of the whole internet than any human and know its high-level correlations, even though their representations are more brittle than the human brain's. But we're much better at agency and, in some cases, at generality, we can still do more abstract math, etc. We're better in our evolutionary niche. But AlphaZero, for example, destroyed us at chess.
Also, according to some old definitions of AGI, existing AI systems have been AGI for a long time, because they can hold a general discussion about almost anything (setting aside their lack of narrow, niche, field-specific knowledge and skills, lack of agency, lack of adaptation to novelty, etc.).
Or if we take the AIXI definition of AGI, then a fully general AGI is impossible in practice: AIXI is not computable and can only be approximated, since it considers all possible explanations (programs) for its observations and past actions and chooses actions that maximize expected future rewards across all these explanations, weighted by their simplicity (shorter programs get more weight, a formal Occam's razor).
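As a rough illustration of the simplicity weighting described above, here is a toy Python sketch. It is not AIXI: real AIXI ranges over all programs on a universal Turing machine and plans over whole future action sequences, which is exactly why it is incomputable. This sketch assumes a tiny hand-picked set of hypotheses, a made-up `length_bits` standing in for program length, and a one-step "guess the next bit" reward. All names here are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class Hypothesis:
    name: str
    length_bits: int                # made-up stand-in for program length
    predict: Callable[[int], int]   # time step -> predicted observation


def posterior(hypotheses: List[Hypothesis], observations: Sequence[int]) -> Dict[str, float]:
    """Give each hypothesis prior weight 2^-length, zero out the ones that
    contradict the observations so far, then renormalize."""
    weights = {}
    for h in hypotheses:
        consistent = all(h.predict(t) == obs for t, obs in enumerate(observations))
        weights[h.name] = 2.0 ** -h.length_bits if consistent else 0.0
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no hypothesis is consistent with the observations")
    return {name: w / total for name, w in weights.items()}


def best_guess(hypotheses: List[Hypothesis], observations: Sequence[int], guesses=(0, 1)) -> int:
    """Pick the guess for the next observation with the highest expected
    reward (reward 1 for a correct guess) under the weighted mixture."""
    post = posterior(hypotheses, observations)
    t_next = len(observations)

    def expected_reward(guess: int) -> float:
        return sum(post[h.name] for h in hypotheses if h.predict(t_next) == guess)

    return max(guesses, key=expected_reward)


# Three hand-picked toy hypotheses; shorter ones get more prior weight.
HYPOTHESES = [
    Hypothesis("all_zeros", 2, lambda t: 0),
    Hypothesis("alternating", 4, lambda t: t % 2),
    Hypothesis("zero_then_ones", 5, lambda t: 0 if t == 0 else 1),
]

if __name__ == "__main__":
    print(posterior(HYPOTHESES, [0]))
    print(best_guess(HYPOTHESES, [0]), best_guess(HYPOTHESES, [0, 1, 1]))
```

After seeing only `[0]`, the shortest hypothesis carries most of the weight, so the mixture guesses 0. After `[0, 1, 1]`, only the longest hypothesis is still consistent with the data, so the guess flips to 1.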
And AIXI people argue that humans and AI systems try to approximate AIXI in their more narrow domains and take all sorts of cognitive shortcuts to be actually practical and not take infinite time and resources to decide.
And soon we might create some machine-biology hybrids as well. Then maybe we should start calling them carbon-based intelligence (CI), silicon-based intelligence (SI), and carbon-and-silicon-based intelligence (CSI).