Replying to @tmcosmos_org
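This comment looks quite shocking. Given where AI research is heading it is "not complete nonsense," but how autonomous the system really was needs careful scrutiny.
1. What is being claimed
The research group says it built an AI system centered on Claude, gave it a research topic, and automated everything from there: literature search, theoretical calculation, numerical simulation, verification of the equations, cross-checking by other AIs, paper writing, peer review, and revision. In other words, the claim is not "one researcher plus many AI agents" but that they built an "AI research lab."
⸻
Why this is likely to happen in theoretical physics
Theoretical physics and mathematics need no data collection and no experimental apparatus, can be written down in equations, and have relatively clear criteria for right and wrong. They are therefore said to be among the disciplines most easily taken over by AI.
⸻
Technology that already exists
① Literature search: Claude, GPT-5, Gemini, and others.
② Computation: Python, Mathematica, SageMath.
③ Mathematical proof: Lean, Coq.
④ Peer review among AIs: setups that run several LLMs in the roles of Author, Reviewer, and Critic.
⑤ Autonomous agents: AutoGen, OpenHands, Manus, Fable, and others have appeared recently. "Fable 5" probably refers to a multi-agent system of this kind.
⸻
"Non-trivial results that even the developers would not think of"
This is the most interesting part. In recent years there have been cases where AI found solutions humans had missed for a long time, such as AlphaTensor's discovery of matrix multiplication algorithms and AlphaFold's protein structure prediction. Google DeepMind's AlphaEvolve (2025) has also been reported to improve algorithms and mathematical constructions beyond what human experts had achieved. So "an idea that even world-class theoretical physicists had not thought of" is entirely possible.
⸻
The most important part
The point most worth noting in this comment is that human understanding of the AI's output has become the rate-limiting step. This is what AI researchers often call the "interpretability bottleneck." The traditional flow was: human → idea → calculation → paper. In the future it may become: human → research topic → AI research lab (hundreds of agents) → a large volume of results → human → adopt only what can be understood.
⸻
But there are doubts about "full automation"
Current LLMs suffer from hallucination, fabricated citations, leaps in logic, and breakdowns in long-range consistency. So the phrase "a world-class paper finished in one day" may be considerably exaggerated. In practice it was most likely a case of a group of AIs working inside a framework built by humans, with humans doing the final check.
⸻
What happens if it is true
This would spread beyond theoretical physics to mathematics, bioinformatics, neuroscience, medicine, and drug development. In medical research in particular, which is your area of interest, we may reach a time when AI agents handle reading millions of papers → hypothesis generation → statistical analysis → figures and tables → manuscript writing → responding to reviewers, while humans concentrate on setting research topics, ethical judgment, and assessing clinical significance.
⸻
How credible is this comment (claim: plausibility)
Building a Claude-centered research agent: very plausible
Automating literature search through writing: largely feasible
Peer review among AIs: feasible
Results the developers would not think of: quite possible
An advanced paper generated in one day: possibly somewhat exaggerated
A Nature-level paper completed with no humans involved: still difficult
Human understanding as the rate-limiting step: likely to become reality within a few years
In short, it is reasonable to treat this not as science fiction but as "something that, as of 2026, is becoming reality among researchers at the frontier." And perhaps the biggest change is the shift from an era in which "AI assists humans" to one in which "humans work to understand the output of an AI research lab."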
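The Author / Reviewer / Critic arrangement mentioned in the post can be sketched as a simple revise-until-accepted loop. This is only an illustration under stated assumptions: `ReviewLoop`, `stub_author`, `stub_reviewer`, and `stub_critic` are hypothetical names, and the stubs are plain functions standing in for LLM calls, not any real agent framework's API.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical role type: takes text in, returns text out (stand-in for an LLM call).
Role = Callable[[str], str]


@dataclass
class ReviewLoop:
    author: Role
    reviewers: List[Role]
    max_rounds: int = 3
    history: List[List[str]] = field(default_factory=list)

    def run(self, topic: str) -> str:
        draft = self.author(topic)
        for _ in range(self.max_rounds):
            # Every reviewer reads the same draft independently.
            reports = [review(draft) for review in self.reviewers]
            self.history.append(reports)
            objections = [r for r in reports if r != "ACCEPT"]
            if not objections:
                break
            # Feed the objections back to the author for a revision.
            draft = self.author(draft + "\n[revise: " + "; ".join(objections) + "]")
        return draft


def stub_author(prompt: str) -> str:
    # On a revision request, keep the draft and mark the fix as done.
    if "[revise:" in prompt:
        return prompt.split("\n[revise:")[0] + " (units checked)"
    return f"Draft on {prompt}"


def stub_reviewer(draft: str) -> str:
    return "ACCEPT" if "units checked" in draft else "check units"


def stub_critic(draft: str) -> str:
    return "ACCEPT" if draft.startswith("Draft") else "not a draft"


loop = ReviewLoop(stub_author, [stub_reviewer, stub_critic])
print(loop.run("toy model"))
print(loop.history)
```

The loop only automates the exchange of drafts and reports; whether the accepted draft is actually correct still depends on how reliable the reviewers are, which is the caveat the post itself raises.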
AI is reshaping discovery in maths and physics — proposing conjectures, speeding proofs with provers and LLMs, finding algorithms like AlphaTensor, and extracting physical laws. It accelerates insight and reveals hidden patterns. nature.com/articles/d41586-0… #AI #Physics
Replying to @Noahpinion
I don’t think it’s that humans can’t do it. It’s just that not many humans can, and the humans that genuinely move the frontier on theory can easily be a few hundred years apart. So the speed with which the frontier might move with machine intelligence will be considerably different. (I quite like AlphaTensor as a case study here.) The chance of new discoveries and abstractions being taught to other humans, I think, actually improves. It is very unclear whether our capacity to learn as a species is anywhere close to saturated.
Replying to @Houda_nait
Talk about hyperbole. Look at Alphafold, AlphaMissense, AlphaGenome, AlphaTensor, GNoME, etc. Far bigger impacts.
Problem 3: Matrix Multiplication Tensor Rank The goal is to find an algorithm that multiplies two matrices using fewer scalar multiplications than the current record, for a chosen size. Work like AlphaTensor and AlphaEvolve from @GoogleDeepMind are all useful references here.
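As a concrete instance of "fewer scalar multiplications," the classic reference point is Strassen's scheme, which multiplies two 2x2 matrices with 7 products instead of the naive 8. The sketch below is a plain-Python illustration of that well-known algorithm, not code from AlphaTensor or AlphaEvolve; the function names are chosen here for the example.

```python
def naive_2x2(A, B):
    # Textbook product: 8 scalar multiplications.
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def strassen_2x2(A, B):
    # Strassen's scheme: exactly 7 scalar multiplications (m1..m7).
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
assert strassen_2x2(A, B) == naive_2x2(A, B)
print(strassen_2x2(A, B))
```

Each such scheme corresponds to a decomposition of the matrix multiplication tensor into rank-one terms, so the number of products (7 here) is an upper bound on the tensor rank for that size.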
Apr 23
Replying to @Star_Knight12
It means you have not yet seen AlphaFold, AlphaGo and AlphaZero, AlphaGeometry, AlphaDev / AlphaTensor, AlphaEvolve, AlphaGenome and AlphaMissense, AlphaProteo, AlphaEarth Foundations, and Gemma. Go check, my friend: the world of AI/ML is bigger than "coding assistants."
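AI-driven unemployment is a real problem, and a big one. Early this year Block laid off 40% of its staff because of AI, and its stock jumped 25%. The market is telling you plainly that this is the story Wall Street loves. You are right to be anxious. But AI's biggest problem may not be that it puts people out of work. It is that a tool that could be curing disease, discovering new materials, and working out how the universe works is being forced to prioritize helping companies cut headcount. The whole AI industry is collectively heading down a path that may not be the best one, and all of us are paying for it.
Yesterday I watched a recent in-depth interview with Hassabis, head of Google DeepMind and Nobel laureate, in which he said: "If it were up to me, I would keep AI in the lab a few more years and build a few more AlphaFolds; maybe we would have cured cancer." "If it were up to me" carries some resentment and resignation. He is the head of Google DeepMind and a Nobel laureate, yet what he builds and where it goes is no longer entirely his call.
To appreciate the weight of that sentence, you first need to know what he actually did. Many people have heard of AlphaFold and know Hassabis won a Nobel Prize, but are not clear on what it does. Put simply: proteins govern almost every biological process in your body, and a protein's three-dimensional structure determines its function. To develop a new drug, you first have to work out what the target protein looks like. But for the past fifty years, solving one protein structure cost hundreds of thousands of dollars and several years, "shooting" it out bit by bit with X-rays. AlphaFold turned that from years into seconds.
What moved me most, though, was not the algorithm itself but a decision Hassabis made in an internal meeting. The team was discussing how to set up servers so that scientists around the world could submit requests one by one and queue for results, which had been standard practice in the field for decades. Then Hassabis did the math: 200 million known proteins, ten seconds each, about a year to get through them all. He said: why don't we just fold every protein and release them all for free? No servers, no email system. Compute all the answers and give them to the world.
Since that decision, more than three million scientists have used AlphaFold. Non-profits studying malaria, leishmaniasis, and other diseases that affect hundreds of millions of people but that big pharma cannot be bothered with used to be unable to afford even protein structures; now they jump straight to drug design. One pharma scientist said that nearly every new drug developed from now on will use AlphaFold somewhere in its pipeline. This is what AI should be doing.
But OpenAI changed everything. ChatGPT blew up at the end of 2022. Overnight the industry's center of gravity shifted to large language models and consumer products. Gemini performed poorly at first and was widely mocked, Google went into "code red," and Hassabis went from head of a scientific research lab to helmsman of all of Google's AI: Gemini, consumer products, quarterly earnings, the market-cap race, all of it landed on him.
He admitted in the interview: "The world today is not the one I dreamed of years ago." His ideal way of developing AGI was like CERN (the European Organization for Nuclear Research): the world's top scientists collaborating, fully understanding and validating every step, even if it took ten or twenty years longer. But OpenAI defined the whole industry's direction with a chatbot. From that moment, almost every lab, DeepMind included, was dragged onto the same track: whose model is better, whose product is faster, who can let companies lay off more people.
Block laying off 40% and rising 25% is not an isolated case; it is the market's signal to the entire AI industry: we want efficiency, layoffs, quarterly profit. Curing disease, exploring the universe? Wait until the money is made. There is a brutal structural problem here: the development pace of the world's most powerful AI is now set by commercial competition, not by scientific value. Even Hassabis thinks this is wrong, but he is locked into this landscape.
I am not saying large models don't matter or that AI-driven efficiency doesn't matter. Of course it does. Companies cutting costs and individuals working faster are real value. The problem is that when the whole market, the whole investment logic, and the whole industry's attention go all-in on this one direction, nobody walks the other road.
And what is on that other road? Isomorphic Labs, from Hassabis's team, is reshaping a drug development process with a ten-year cycle and a 90% failure rate. AlphaGenome is decoding the 98% of the genome that is non-coding; Nobel laureate Jennifer Doudna says that if it is accurate enough, then combined with CRISPR gene editing we might directly repair the mutations that cause genetic disease. AlphaTensor discovered matrix multiplication algorithms faster than those humans had found, and AlphaChip outperformed human engineers at chip design. None of these shows up on your phone, but every problem they solve reaches far further than any LLM update. And the resources, attention, and talent for this work are being drained in bulk by the LLM arms race.
How big is the potential of this scientific route? In 2016 AlphaGo played "move 37," a move human players considered taboo, and it ended up deciding the game. The later AlphaZero, without looking at any human game records, went from random play to surpassing world champions within a day. AlphaFold grew out of this line. This ability to "explore a rule space autonomously and find solutions no human ever thought of" is the truly disruptive part of AI: not helping you write better code, but helping you discover a drug molecule you had no idea existed. But now this line is shrinking while the other is swelling.
I am not trying to lionize Hassabis; he himself admits he is swept along by competition and that much is out of his hands. But this interview made one thing clear to me: LLMs putting people out of work is a real problem, but the whole industry being locked onto the single "LLM" road is a bigger one. That road was, to some extent, paved by OpenAI with ChatGPT. Before it, DeepMind was building AlphaFold and folding 200 million proteins to give to the world. After it, everyone is building LLMs.
At the end of the interview he said: "We don't even know what time is. We are inside it, yet we cannot satisfactorily describe what it actually is." For him AI is not the goal; it is a telescope to help humanity see the universe clearly. And now that telescope is being used as a magnifying glass, magnifying the numbers in quarterly earnings reports. So will the next AlphaFold still get a chance to be built? Or will all the resources go into grinding out the next generation of LLMs? This is AI's greatest waste. It is not a technical problem but a problem of choice. And the ones choosing now are not scientists but the market. The market may not be wrong. But the market answers only one question: how to make more money next quarter. It does not answer: where should humanity go next.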
The only employee you need in the Agentic AI era 🐶 It used to be a flex how many employees you had. A team of 50 meant you were serious. The companies I'm most excited about are running lean on purpose. Let’s go spend time in the lab instead, working with our hands ⚛️ Unfortunately, a lot of tools are overindexing on doing things scientists enjoy (eg. ideation) instead of things they hate (eg. scheduling meetings). I got a gasp at NVIDIA GTC when I said I'm a little scared of OpenClaw. But, I’ve been running custom agents for over a year. If you email me for a meeting, it's probably not me responding, tbh. My Telegram has a bot. Whatsapp… you still get me. So far. I am becoming ungovernable. Now I get to sit with research, papers, and code, then talk to people who are smarter than me on what they are building and thinking about. My stack: GCP, models attached to only specific folders of context, for me, that’s very specific folders in Google Drive, my Trello to-do list, my email, and my calendar — my memory, essentially. Feels safer. Honestly, a lot of my day is governed by AI and automations - like a nanny. Like Jensen said, AI is making us MORE busy. But we don’t want to be swapping thinking for AI slop. For scientists and researchers, everyone wants to show you the most complicated AI setup possible. My actual recommendation: automate one step of your process. Just one. Then spend the time you got back on thinking. For synthesis: NotebookLM — upload your papers, ask questions across all of them. Genuinely changes how fast you can move through a new field. It’s not writing posts, but it’s helpful for understanding. Additionally, since I *need* a 45-minute walk every morning to stay sane, using the “podcast” feature for papers is fun. I can get through 4 papers a morning! And three ways AI/ML is getting us to quantum advantage faster: — Q-CTRL Fire Opal (q-ctrl.com/fire-opal) — AI error suppression that runs on top of whatever hardware you're using. 
You don't change your algorithm. It just works better. — AlphaTensor-Quantum (lnkd.in/gChGKF5H) — Google DeepMind and Quantinuum. RL-based circuit optimization that minimizes T-gates. Found circuits that beat the best human-designed solutions. Saves hundreds of hours of research. — AlphaQubit 2 (arxiv.org/abs/2512.07737) — Neural network decoder for quantum error correction. Real-time decoding under 1 microsecond. First-ever real-time decoding of the color code. For everything else: Claude for reasoning and writing. Gemini for learning. ChatGPT voice mode to yap about new ideas. I’m still on Cursor for code for the nice UI, but also running agents in parallel in Claude Code. Pick one. Automate one step. Go back to thinking. P.S. I will be speaking more on quantum EDA and optimization on hardware at DAC 2026 in Long Beach! See you there. What's on your desktop? Especially if you're in research or deep tech, drop your stack below. ⚛️
Replying to @AnthropicAI
Anthropic drops Claude Opus 4.5 into theoretical physics, guided by Harvard's Matthew Schwartz—a string theory heavyweight. Timeline kicks off March 2024 with Claude 3 Opus launch, topping benchmarks but lagging in reasoning depth. Fast-forward to June: Claude 3.5 Sonnet crushes coding and vision tasks. Now, Opus 4.5 preview tackles grad-level QFT calculations, slashing a multi-day symbolic integral from 72 physicist-hours to under 2 hours with AI prompts. Key stats hit hard: Schwartz's team computed a 100-term Feynman diagram expansion—manual drudgery that'd take weeks solo. Claude nailed 95% accuracy on first passes, iterating fixes in minutes. Valuation context? Anthropic's at $18B post-Amazon's $4B infusion (Sept 2024), with 100M weekly users via Poe and API. Compare: OpenAI's GPT-4o clocks similar math feats but flubs physics edge cases 20% more per LMSYS arena. Physicists cheer acceleration—Schwartz calls it "game-changer for exploration." But users? Frontier researchers gain 10x speed on integrals, freeing brainpower for hypotheses. Competitors sweat: DeepMind's AlphaTensor sped matrix mults 20x in 2022, yet physics lags. Investors pile in—Anthropic raised $8B total, betting AI augments Nobel hunts. Causal chain's clear: Massive RLHF on code/physics corpora lets Claude pattern-match Lagrangians flawlessly, but originality stalls sans real-world experimentation loops. Training data from arXiv/papers bridges gaps, yet hallucinations spike in uncharted territory. Result? AI as turbocharged assistant, not inventor—echoing AlphaFold's 2020 protein revolution, folding 200M structures in months vs. decades manual. Industry parallels scream progress: Google's FunSearch (Dec 2023) autonomously found math proofs, boosting cap set bounds 10%. Outcome? Pure math advanced sans humans leading. Physics next? Absolutely—Claude 4.5 proves AI won't replace theorists, but sideline them at your peril. Anthropic just lit the fuse; physics PhDs, adapt or get lapped.
#2/2 Today ... I would like to shed some light on the area of autoresearch systems. The evolution of these systems follows a clear trajectory: from Google DeepMind's proprietary AlphaEvolve (2025; AlphaEvolve was inspired by FunSearch (Nature, 2023) and AlphaTensor (Nature, 2022)), which demonstrated breakthrough results in mathematics and infrastructure optimization but remained closed-source, to a rich ecosystem of open-source alternatives that democratize evolutionary code optimization while introducing novel mechanisms like prompt co-evolution, Pareto-efficient search, and skill learning. Below is an extract from my paper about OmniEvolve (it will be published in the coming weeks): the classification of evolve systems and their representatives. They were created well before autoresearch and are more extensive, but unfortunately they are less popular. Again, congratulations to Karpathy, who made a great impact on the AI community and promoted this kind of system. Sorry if I missed something (the post was written during a flight, so I had no access to my full documents).
A. General-Purpose Evolutionary Frameworks
•AlphaEvolve - A Gemini-Powered Coding Agent for Designing Advanced Algorithms - DeepMind, 2025
•OpenEvolve - Open-Source Reimplementation of AlphaEvolve for LLM-Guided Evolutionary Coding - Algorithmic Superintelligence Inc.
•ShinkaEvolve - Open-Ended Program Evolution Framework - Sakana AI
•GEPA - optimize anything - UC Berkeley, Stanford
•LLMAD - Unified Open-Source Platform for LLM-based Automatic Algorithm Design - City University of Hong Kong & Southern University of Science and Technology
•SkyDiscover and AdaEvolve - A Modular Framework for AI-Driven Algorithmic Discovery with Hierarchical Adaptive Search - UC Berkeley Sky Lab
B. Self-Improving Agent Systems
•Darwin Gödel Machine - Sakana AI, 2025
•Darwinian Evolver - Evolving Programs Through LLM-Guided Darwinian Natural Selection - Imbue Research
•GEPA Skills - Automatically Learning Skills for Coding Agents - UC Berkeley, UT Austin, Databricks
•Discovering Multiagent Learning Algorithms with LLMs - DeepMind, Feb 2026
•Aletheia Tackles FirstProof Autonomously - DeepMind, Feb 2026
C. Specialized ARC-AGI / Program Synthesis Solvers
•Confluence Labs: ARC-AGI-2 Solver - Brent & Niranjan
•Arcgentica - Runtime-as-Context Evolutionary Program Synthesis for ARC-AGI-2 - Symbolica AI, Feb 2026
•AB-MCTS / TreeQuest - Sakana AI
D. Benchmarks, Discovery & Scientific Research
•ALE-Bench - Benchmark for Automated Optimization with LLM-Based Evolutionary Approaches - Sakana AI, AtCoder, 2025
•AI Scientist - Towards Fully Automated Open-Ended Scientific Discovery - Sakana AI, Foerster Lab (Oxford), UBC
•AlphaEvolve - DeepMind, 2025
E. Competition Applications
•Sakana AI: Evolutionary Code Generation for AtCoder Heuristic Contest 058 - Sakana AI, 2025
•ShinkaEvolve - Sakana AI: Evolutionary Code Generation for the ICFP Programming Contest 2025 - Sakana AI, 2025
Did I miss any important work or papers?
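The common core of the "evolve" systems listed above is a propose, score, select loop. The sketch below is a generic elitist evolutionary search and not the implementation of any listed framework: `evolve`, `toy_fitness`, `toy_mutate`, and `TARGET` are hypothetical names, and the random mutation is a stand-in for the step where those systems ask an LLM to rewrite a candidate program.

```python
import random


def evolve(fitness, mutate, seed, generations=40, keep=6, children=12, rng=None):
    """Elitist evolutionary search: mutate survivors, keep the best `keep`."""
    rng = rng or random.Random(0)
    pool = [seed]
    for _ in range(generations):
        # Stand-in for LLM-proposed edits: mutate randomly chosen survivors.
        offspring = [mutate(rng.choice(pool), rng) for _ in range(children)]
        # Parents compete with offspring, so the best score never gets worse.
        pool = sorted(pool + offspring, key=fitness, reverse=True)[:keep]
    return pool[0]


TARGET = [3, -1, 2]  # hypothetical optimum for the toy problem


def toy_fitness(candidate):
    # Higher is better; 0 means the candidate equals TARGET.
    return -sum((c - t) ** 2 for c, t in zip(candidate, TARGET))


def toy_mutate(candidate, rng):
    child = list(candidate)
    i = rng.randrange(len(child))
    child[i] += rng.choice([-1, 1])
    return child


best = evolve(toy_fitness, toy_mutate, seed=[0, 0, 0])
print(best, toy_fitness(best))
```

In the real systems the candidate is source code and the fitness function is an automated evaluator (tests, benchmarks, or a verifier), which is what lets the loop run without a human scoring each proposal.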
I like chatting with ChatGPT. Honest to a fault, but... it isn't intelligence, as it explains nicely itself. Worth a read. Question: Why don't you solve famous problems in mathematics? "Interesting question, and a bit funny, because people imagine AI as a giant brain in a basement just waiting to solve all the riddles of the universe. Reality is... less romantic. The reason is simple: an LLM is not a mathematician. It is a tool for modeling language, not for discovering new theorems. I am very good at predicting which word or sentence should come next, because I was trained on enormous amounts of text written by people. Research-level mathematics works completely differently.
1. Mathematics requires proof, not a "probable answer". When you ask something, I generate the answer that most resembles a correct answer from the training data. In mathematics that is not enough. A theorem must have a rigorous proof, step by step, with no gaps. One logical error is enough to make the whole proof worthless. An LLM often: loses logical steps, skips stages, produces something that looks like a proof but is not one. Mathematicians sometimes call this "proof-looking text".
2. No real symbolic reasoning. People doing mathematics do something an LLM hardly does: they manipulate abstract structures, explore the space of proofs, create new definitions. An LLM mostly reproduces patterns it has seen before. It is a bit like the difference between someone who understands chess and someone who only remembers a million games.
3. Many problems are absurdly hard. Take the classic unsolved problems: the Riemann hypothesis, P vs NP, the Hodge conjecture, Navier–Stokes, Birch–Swinnerton-Dyer. Each rests on entire fields of mathematics developed over 150 years. The issue is not a lack of computing power. It is the lack of a new idea. And new ideas in mathematics usually arise through: analogies between fields, geometric intuition, long thought experiments.
4. AI models do not explore the space of proofs. The programs that really try to discover mathematics are rather: theorem-proving systems (Lean, Coq, Isabelle), symbolic AI. They can check proofs automatically. But finding a proof still takes enormous creativity.
5. Mathematics requires years of concentration. Andrew Wiles worked in secret for 7 years to prove Fermat's theorem. That was not a problem of the type "compute 10¹² cases". It was a problem of the type "invent a new theory in arithmetic geometry". An LLM does not sit on one idea for years.
6. But AI is starting to help mathematicians. This is where it gets interesting. AI already: helps find patterns in mathematical data, suggests lemmas, helps formalize proofs. For example, these projects: DeepMind AlphaTensor (discovered new matrix multiplication algorithms); DeepMind and the University of Oxford (conjectures in knot theory); Lean mathlib (formalization of thousands of theorems). So AI does not replace the mathematician. Rather, it becomes a very strange research assistant.
The brutally honest answer: we do not solve the great problems of mathematics because: we are language models, not proof-discovery systems; we have no real mathematical intuition; new mathematical ideas require something more than text analysis. In short: we are great at talking about mathematics, but we do not always discover it. A bit like a sports commentator who can analyze every match in the world but still will not play in the Champions League final. Not a very heroic role, but someone has to take the notes."
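To make the distinction between "proof-looking text" and a checked proof concrete, here is a minimal Lean 4 sketch. The theorem name `add_comm_example` is chosen for this illustration; `Nat.add_comm` is a lemma from Lean's core library. The proof assistant accepts the file only if every step type-checks, which is the automatic checking the post refers to.

```lean
-- Lean verifies this statement mechanically: the proof term must have
-- exactly the type `a + b = b + a`, otherwise the file is rejected.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A closed arithmetic fact can be checked by computation alone.
example : 2 + 2 = 4 := rfl
```

Checking is the easy direction here; coming up with the right lemma or proof term for a hard statement is the part that still needs search or insight.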
Replying to @Google
Impressive engineering. Now let's talk about what comes next. 84.6% on ARC-AGI-2 represents real progress in abstract reasoning - Google's team has pushed the frontier meaningfully. The research applications (Carbone's peer-review catch, Wang Lab's crystal optimization) show genuine utility. But are we watching interpolation reach its asymptote, or are we seeing the beginning of extrapolation? Deep Think's parallel reasoning architecture is sophisticated. Gold medals on Olympiads and 50.5% on CMT-Benchmark are non-trivial achievements. But every one of these benchmarks has something in common: we already knew the answer existed. Einstein didn't optimize known physics. He took Maxwell's equations and the constancy of light speed and projected an entirely new reality, relativity, that contradicted prevailing intuition. That's extrapolation. That's the kind of reasoning that creates new scientific paradigms. Current systems, including Deep Think, are still fundamentally operating within the convex hull of human knowledge. They're extraordinary at connecting A to B if a similar path exists in training data. But scientific breakthroughs often require generating solutions that contradict the training distribution. The Sim2Real gap matters here too. World Models can simulate physics we've already discovered, but can they model unknown physics? If the internal simulation is bounded by known laws, we're building sophisticated prediction engines - not entities capable of genuine discovery. @demishassabis AlphaTensor, AlphaZero, AlphaFold are remarkable achievements. But in every case, we had the data. PDB gave us 170k protein structures humans solved over decades. The AI interpolated brilliantly. But we can't invent a warp drive this way because we don't have the training data for it. What would actually move the needle: Not better benchmarks. Not faster reasoning. But an AI system that proposes a testable hypothesis no human researcher was pursuing, that survives experimental validation, and that opens a new line of scientific inquiry. Until then, these are the most powerful copilots humanity has ever built. They will accelerate us toward a Type 1 Kardashev civilization. They will change everything. But let's be honest about what they are and what they're not yet. The architecture breakthrough that enables true extrapolation? That's the next frontier. And it won't come from scaling alone. Excited to see where this goes. The real test is in the labs.
Google's AI (DeepMind's AlphaTensor) found a shorter, faster way to carry out a core maths operation: matrix multiplication.
I have to admit that building a super large model will definitely crush humans on all benchmarks, but we must ask if this can actually turn into something profound like general intelligence. We performed a simple analysis on the current trajectory, and what we tend to see is a weird phenomenon where the system moves beyond a certain threshold and people immediately call it a singularity. I refuse to call it that. I want to admit that at the core we have something called Transformers, which are text prediction engines that work based on mathematical similarity. While we will undoubtedly invent novel solutions to optimize them, there is a fundamental limit. It is true that a model like Claude 4.5 Opus will "out-knowledge" Einstein, but it has not "out-thought" him. Einstein’s genius was extrapolation because he took a tiny amount of data, specifically Maxwell’s equations and the speed of light, and projected a completely new reality called Relativity that no one had ever described before. Current systems, including Opus 4.5, are fundamentally interpolators. They sit inside the "convex hull" of all human knowledge and are incredibly good at connecting dot A to dot B if they have seen a similar path before. However, they struggle to generate a solution that contradicts their training data, which is exactly what scientific breakthroughs require. When we talk about World Models, we have a problem there as well, known as the Sim2Real gap. This is the exact reason why a World Model might still not be "true" intelligence but rather a very sophisticated hallucination engine. If the internal simulation is flawed, the AI isn't thinking. It is effectively dreaming. You can definitely create physics and new outliers inside a simulation based on known laws, but an AI cannot model unknown physics or laws it has never encountered. 
If you define intelligence as predicting inputs correctly, then World Models are intelligent, but if you define it as knowing what is real, they are still failing. To create a true intelligence system like us, we need to understand ourselves in every aspect, including the thoughts on how we formulate novel ideas. I think intelligence is the most precious thing you have right now. As a human, you are qualified just by existing, as you are a statistical miracle among 1 in 3.656 quadrillions. Intelligence needs something else that current architectures are missing entirely. To conclude, LLMs are the most powerful and precious machines we have right now because they can hold every book humanity has written so far. They will definitely change the world, and we have a higher possibility of reaching a Type 1 Kardashev scale civilization with them, but I would suggest treating them as copilots rather than entities that can do magic for you. And @demishassabis might be wrong on this specific point because we can easily outsmart something that we fully understand. Even regarding systems he built like AlphaTensor, AlphaZero, and AlphaFold, we must remember that in all these scenarios we had the data. In protein folding, for instance, we had a database called the PDB containing the structures of about 170,000 proteins that humans had painfully solved manually over decades. The AI interpolated that data, but we cannot invent a warp drive using this method because we do not have the training data for it. At this point, figures like @DarioAmodei, @sama, and @elonmusk are largely trying to sell their business to you, and that is a valid point. However, being genuinely honest, this is the greatest tool of the century that anyone can use to discover something new if they have the right idea, data, and verification methods. We need engineers to work on this actual utility rather than feeling disappointed or lost in hope. Hope is the thing that keeps you alive and making progress. 
AGI is not possible until there is a unique breakthrough in the architecture, so while discoveries are inevitable, reaching true AGI needs a fundamental shift, not just World Models, Recursive Self Improvement, or Swarms of Agents.
Jan 27
"It doesn't really bring any direct return": AlphaFold 1, 2, and 3; AlphaEvolve; WeatherNext 1 and 2; GenCast; AlphaTensor. Not to mention the impact on overall productivity across N sectors, which ends up reducing the use of non-renewable resources.
Replying to @vision_ia
For matrix multiplication we already had DeepMind's AlphaTensor in 2022, which improved on several cases (notably 4x4 matrices over a finite field).
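For context on the 4x4 case: applying Strassen's 2x2 scheme recursively to a 4x4 matrix uses 7 × 7 = 49 scalar multiplications, against 64 for the schoolbook method, and the 2022 AlphaTensor paper reported a scheme with 47 for 4x4 matrices in arithmetic modulo 2. The short sketch below only reproduces the first two counts; the function names are chosen for this example.

```python
def naive_mults(n: int) -> int:
    # Schoolbook product of two n x n matrices: n^3 scalar multiplications.
    return n ** 3


def strassen_mults(n: int) -> int:
    # Recursive Strassen on an n x n matrix, n a power of two:
    # each level replaces 8 half-size block products with 7.
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    return 1 if n == 1 else 7 * strassen_mults(n // 2)


for n in (2, 4, 8):
    print(n, naive_mults(n), strassen_mults(n))
```

Because the saving compounds at every level of recursion, even a reduction of two multiplications in a small base case changes the exponent of the resulting recursive algorithm over that field.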