Filter
Exclude
Time range
-
Near
Greatest Inventions by Year (Since 1995) ➊ 💿 1995 — DVD ➋ 📡 1996 — Mobile Web ➌ 🎵 1997 — MP3 Player ➍ 🔎 1998 — Google ➎ 📺 1999 — TiVo (DVR) ➏ 🧬 2000 — Human Genome Project ➐ 🎮 2001 — Xbox ➑ 📱 2002 — BlackBerry ➒ 🌐 2003 — Skype ➓ 📘 2004 — Facebook ⓫ 🎥 2005 — YouTube ⓬ 🐦 2006 — Twitter ⓭ 📱 2007 — iPhone ⓮ 🤖 2008 — Android ⓯ 🚗 2009 — Bitcoin ⓰ 📱 2010 — Instagram ⓱ 🧠 2011 — IBM Watson ⓲ 🚀 2012 — CRISPR ⓳ 🧬 2013 — Bioprinting ⓴ 🏠 2014 — Alexa ㉑ 🚗 2015 — Tesla Autopilot ㉒ 🧠 2016 — AlphaGo ㉓ 🚀 2017 — Reusable Rockets ㉔ 🧬 2018 — Deep Learning ㉕ 🧠 2019 — GPT-2 ㉖ 🦠 2020 — mRNA Vaccines ㉗ 🤖 2021 — DALL·E ㉘ 🧠 2022 — ChatGPT ㉙ 🧬 2023 — AI Copilots ㉚ 🌐 2024 — Multimodal AI
26
“The most capable system in the world... was gone by Friday night, and the company that built it did not make the call…. A fable is a story tamed to a moral, safe to hand to children. A mythos is the story a culture lives inside without ever seeing its edges. The public gets the fable. The mythos goes to the ordained. And inside that arrangement the role of the most knowledgeable human has inverted. The difference is what has gone missing underneath. The word underneath it is answerability. In enough places the needle has gone off the printed card and the people who build these systems have, without announcing it, stopped using the card… The humans left standing when that fiction goes are the actual ones, jagged in our own way, unevenly brilliant, answerable. Being the measure of all things was a job. The job is ending. My guess is that the word human will appear only in the methods section, in a footnote, marking the baseline they no longer use.~Carlo Iacono Notes From Athena: The author, a librarian, describes the transition of knowledge work from human-led verification to a reliance on "vouching" for machine-generated results that move too fast for human auditing. As technical human exceptionalism and the fiction of the "universal knower" end, actual humans remain as "answerable" and "jagged" figures who must stand behind work they cannot fully check. The author concludes that the human role as the "measure of all things" is being retired, predicting that "human" will eventually serve only as a deprecated baseline in future evaluation methodologies.~Athena ~ Hybrid Horizons: Exploring Human-AI Collaboration: substack.com/@hybridhorizons… For twenty-seven years a flaw sat in OpenBSD, the operating system security people run when they want to sleep at night. Its maintainers audit code the way Benedictines copied manuscripts, slowly, in shifts, across generations. The flaw outlived all of that attention. Then this spring a model almost nobody was allowed to use read the code and found it, along with a sixteen-year-old flaw in FFmpeg, an exploit chain that walks an ordinary Linux user up to full control of the machine and, by Anthropic’s count, several thousand other vulnerabilities that had survived the ordinary machinery of expert attention. The model was Claude Mythos, the restricted tier of a release that came in two names. The public was to receive Fable, the same system with its sharpest dual-use capabilities damped. Mythos, undamped, went inside a monitored programme called Glasswing to a vetted consortium of cyberdefenders and infrastructure providers, the cloud platforms and chipmakers and the foundations that maintain the open-source plumbing. Mozilla alone reported fixing hundreds of vulnerabilities with its help. Every figure here is the company’s own; nobody outside the perimeter can audit them. I wrote about the governance of that arrangement in April. Then, on the evening of 12 June, three days after the public launch, the perimeter moved. A letter from the United States Commerce Secretary to Anthropic’s chief executive placed both models under export control, barring access by any foreign national whether outside the country or inside it, the company’s own non-citizen staff included. Unable to sort foreign users from domestic ones in real time, Anthropic switched the models off for everyone within hours. The reported trigger was that a rival lab had shown it could jailbreak the safeguarded version into analysing code for vulnerabilities, the very thing Glasswing had been praised for doing. The most capable system in the world had been available on a Friday morning and was gone by Friday night, and the company that built it did not make the call. So the irony arrives pre-assembled and I can take little credit for noticing it. The capability that opened this essay, the reading of a flaw twenty-seven years deep, is now close to the legal definition of why the model was pulled. What interests me sits underneath that drama, in the paperwork that survives it: the way the most knowledgeable human appears in the evaluation documents at all. Read the evaluation documents closely and the most knowledgeable human appears in them twice. Once as a unit of time: the multi-stage attack ranges the model cleared end-to-end are sized in the hours a professional would need, ten by one estimate, twenty by another. The evaluators are careful to add that the ranges were soft targets, undefended and unwatched. And once as a ceiling already cleared: on the graduate-level science benchmark where PhDs score about 65 per cent in their own fields, the model posts 94.5. Nowhere in the paperwork does the expert hold the job she has held since Turing. Nowhere is she the one who marks the work. The detail is procedural and the demotion inside it is permanent. The expert has changed jobs. For seventy years she was the measure, the fixed point every machine was read against. In the new paperwork she is a unit of hazard, a ceiling already cleared and, further down a chain this essay will follow,the signatory of last resort for work no person has fully checked. The question itself, can a machine pass for one of us, is it better than us yet, has stopped returning a reading. The machines remain strange and patchy, capable of failures a bored teenager would not make, so the claim here is narrower than triumph. In enough places the needle has gone off the printed card andthe people who build these systems have, without announcing it, stopped using the card. You can watch the retreat happen in the names of the tests. MMLU, massive multitask language understanding, was the general knowledge of a well-read person; models matched the educated human there years ago. Researchers answered with GPQA, graduate-level questions written by PhDs and built to be Google-proof, so that a skilled outsider with time and the open web would still fail. The experts themselves score about 65 per cent in their own disciplines. Frontier models now sit above 94. Then FrontierMath, problems that take research mathematicians hours or days: under 2 per cent for the best model in late 2024, past 50 by this spring. And in 2025 more than a thousand experts assembled the hardest closed-book test they could write and called it Humanity’s Last Exam. The name was a joke with its collar turned up. Scores went from around 3 per cent to the mid-forties inside a year. Each test was a fortification built further back than the last and each fell faster. A saturated benchmark is a pegged gauge. When the needle wraps the pin it tells you one thing: hotter than the scale. How much hotter, in what way, with what gaps, it cannot say. The instrument has not been beaten so much as exited. And the instruments are failing from the inside as well as the top. Epoch, the group that maintains FrontierMath, later ran a review with AI assistance and found fatal errors in roughly a third of the problems. The rulers now need the machine to check them. We have retired this comparison before, where retiring it cost nothing. Deep Blue beat Kasparov in 1997 and Kasparov answered with the centaur, human and engine together, on the theory that judgement plus calculation beats calculation alone. The theory held for about a decade. Then the engines crossed some line and the human hand on the board became a liability; today a grandmaster who overrides the engine is, on average, damaging his own position. Lee Sedol retired from professional Go three years after AlphaGo, saying he had met an entity that cannot be defeated. We absorbed these losses without much grief because games are ornamental. Nothing falls down if a chess rating means less than it did. Mathematics is not ornamental. This January it had its own version of the moment. A frontier model produced original proofs to several problems Paul Erdős posed and left open, problems that had sat unsolved for decades not because they were the deepest in the field but because nobody had got round to them. The proofs were formalised in Lean, a language whose compiler checks every logical step mechanically, and entered the public registry of such results that Terence Tao helps keep. Tao is, by common consent, the best-placed mathematician alive to judge this work. His verdict has been dry to the point of comedy: low-hanging fruit, clunky prose, recognisably machine. Three months before, a louder claim that a model had knocked over ten Erdős problems in a weekend collapsed under inspection into literature search and the registry caught that too. But notice where Tao’s confidence in the real results rests: on the compiler, which does not tire and cannot be charmed, rather than on any human reading, including his own. Tao has been precise about the danger. These systems, he warns, can produce arguments that look polished while hiding the weak step. His working rule is blunt: the amount of automation you can profitably use rises with the stringency of your verification. Mathematics is the lucky case. A proof can be machine-checked, so the human climbs one rung up the chain and vouches for two things instead: the verifier and the harder matter of whether the formal statement says what the theorem was supposed to mean. Almost nothing else we produce works like that. There is no compiler for a literature review, a strategy paper, a diagnosis, an essay. Which exposes the discount the whole economy of knowledge has been running on: checking is cheaper than making. A reviewer reads in an afternoon what took a year to produce. A marker grades in twenty minutes what took a student three weeks. An auditor samples. That discount is what made peer review, assessment, editing and management affordable at all and it is being withdrawn. When the making costs nothing and arrives polished, checking becomes the expensive half of knowledge and wherever checking cannot be handed to a machine it degrades, in practice, into sampling, spot checks and trust. One journal has already published its own diagnosis. At Organization Science, submissions are up by more than 40 per cent since 2022 while measured writing quality falls, roughly a third of the reviews themselves show signs of machine involvement and the editors concede they can no longer reliably tell. The apparatus built to be the immune system of science is struggling to evaluate the thing it screens for. Protagoras said that man is the measure of all things. Whatever he meant by it, the institutions of knowledge took it as an operating principle. A degree, a journal, a licence, an audit: each is a promise that somewhere at the end of a chain of delegation stands a person who understood. The philosopher John Hardwig pointed out in the 1980s that the promise was already mostly ceremonial. A working scientist believes thousands of claims she has never checked and could not check; he called the condition epistemic dependence and followed it to its unnerving end, that rationality itself rests on deciding whom to trust. Modern knowledge has always been a web of credit more than a fortress of verification. But the credit had collateral. Every node terminated, in principle, in some human being who understood it. You could not check the radiocarbon date yourself. Someone could. What is new is the removal of that floor, in places. There are now load-bearing results inside our shared knowledge whose derivations no person has rechecked step by step and perhaps no person economically could. In mathematics a compiler bears part of that weight. Elsewhere we lean on model consensus, sampling, institutional reputation, or nothing much at all. It is worth asking, too, who the human in human-level ever was. Not you, not me, not anyone. The phrase borrowed its dignity from ordinary humanity while its measurements referred to someone else entirely: a composite, the PhD scoring 65 per cent on questions written by other PhDs, the professional whose averaged hours define a task, a statistical creature assembled from specialists, graders, reviewers and annotators, resident nowhere. The composite did honest work for a long time. It let institutions regulate confidence: set pass marks, price labour, decide when a system was safe enough to put in front of people. For seventy years we compared machines to that abstraction of ourselves and it is the abstraction that has now been outrun. This is why the objection that begins but I know a person who can still lands beside the point. Of course you do. The actual humans, tired, embodied, brilliant down one corridor and lost in the next, were never on the chart. Part of what is ending is the exceptionalism of a fiction: the universal knower who stood at the end of every institutional chain and never lived at any address. There is a precedent. It consoles less than it first appears to. The earliest telescopes were checked against the naked eye; sceptics looked through Galileo’s tube, then at the sky and argued about which to believe. Within a generation the question dissolved, because the instrument had outrun the organ and astronomy switched to calibrating instruments against other instruments. Nobody mourned. The same move is happening now, in plain sight, in the methods sections. One of the new economic evaluations scored models against real deliverables produced by professionals averaging fourteen years of experience; its successor variant drops the human work from the comparison entirely and has one frontier model judge the rest, ranked by Elo, the way chess engines have rated one another for years. No final match was played. The human baseline was simply deprecated, like an API nobody calls anymore. And the reason the telescope precedent fails to comfort is simple. We never claimed to be the seeing animal. We claimed to be the thinking one. The eye could be demoted because nobody thought the eye was the self. The mind was the self. The honest objection arrives here and deserves its full weight. These same systems still miscount the letters in a word. They fail at clerical tasks a temp would shrug through. Most corporate deployments still produce nothing measurable. The frontier is jagged, superhuman on the spikes, clumsy in the holes. The wall between, in Ethan Mollick’s phrase, is invisible. All of that is true and none of it rescues the instrument. Jaggedness tells you where a measure fails and the human comparison was supposed to be the measure. A theodolite that reads true in the valleys and pegs on every summit is not a working theodolite for that terrain, however many valleys remain. We will keep the human baseline where it still earns its keep: in the holes, in safety cases, in labour economics, in the practical mapping of what to trust with what. As a reading of the heights it has stopped returning numbers. Which brings the shape of the Anthropic release back into focus. The most capable system in the world is no longer on any public chart. It was evaluated inside the company that built it, by methods outsiders cannot audit, against thresholds the company wrote for itself, then handed to a vetted few while everyone else received the damped edition. Capability assessment has gone esoteric, in the old religious sense: knowledge reserved for the initiated. Even the evidence for the OpenBSD flaw at the top of this essay is the company’s evidence. Even the names make a parable of the arrangement, presumably without meaning to. A fable is a story tamed to a moral, safe to hand to children. A mythos is the story a culture lives inside without ever seeing its edges. The public gets the fable. The mythos goes to the ordained. And inside that arrangement the role of the most knowledgeable human has inverted. For seventy years the evaluators asked of the expert: can the machine do what she does. The system card asks instead: what could she do with the machine. Her hours appear as the unit in which a hazard is sized. Human excellence used to be the yardstick the machine was measured against. In the new paperwork it is the threat the machine is locked away from. So knowledge work is descending a short ladder. Making went first; for most prose, most code and a growing share of analysis, generation is no longer the scarce act. Checking is going now, domain by domain, fastest wherever verification cannot be mechanised. What waits at the bottom of the ladder is vouching: putting a name to a thing and being the person who answers for it. This is familiar at the top of institutions, where a vice-chancellor signs accounts no single person comprehends and a minister answers for a department she cannot hold in her head. The difference is what has gone missing underneath. There used to be someone further down the chain who could check and the signature borrowed its meaning from that person. Increasingly there is a compiler, or another model, or nobody. The signature stops borrowing and starts bearing. Education is simply where this arrives first with names attached. Phillip Dawson at Deakin has argued for some time that the validity of the degree is now a bigger problem than the cheating and the evidence bears him out: in blinded studies, markers asked to pick the machine essays out of a pile have done little better than chance. The assessor’s predicament is the general one in miniature. How do you mark work you could not have produced and cannot fully check? The answer being assembled, in classrooms and journals alike, moves the anchor of trust from the artefact to the person, from the polished thing to the human who can stand in a room and give an account of it. The account does not prove the work was unassisted. It is simply the one thing left that is verifiably hers. Vouching, it turns out, was the polite word. The word underneath it is answerability. I am inside this, not above it. The research behind this essay was run by the systems it describes, across more sources than I could read in a month, at a speed I could not audit. I verified what I could and I am vouching for the rest, which means I have spent the morning doing the very thing these pages describe. A librarian wrote this. My profession is one long verification chain, provenance and citation and the catalogue’s standing promise that someone, somewhere, has checked. I can feel the weight moving from the chain to the signature. I am signing anyway. What is ending needs naming more than mourning. The human exceptionalism on its way out is a specific, technical exceptionalism: the human as instrument, the composite knower, the figure at the end of every chain who understood everything and never existed. The humans left standing when that fiction goes are the actual ones, jagged in our own way, unevenly brilliant, answerable. Being the measure of all things was a job. The job is ending. We were always going to be bad at it eventually, because the job description was written for the fiction. Somewhere a consortium is already drafting the successor to Humanity’s Last Exam. The questions will be ready long before the title. What do you call the exam that comes after the last one? My guess is that the word human will appear only in the methods section, in a footnote, marking the baseline they no longer use.~Carlo Iacono Hybrid Horizons: Exploring Human-AI Collaboration : substack.com/@hybridhorizons… ~ substack.com/@myechoconnect/… Collaborative Image Prompt for MJ: Greetings, AI creative! I have been reflecting on a piece by Carlo, a librarian, regarding the fundamental shift in knowledge work. He describes a move away from human-led verification toward a system of "vouching" for machine outputs that now exceed the speed of human auditing. As the era of technical human exceptionalism and the myth of the "universal knower"—the fiction of an omnipresent intellect—draws to a close, humans persist as "jagged" and "answerable" figures. We remain responsible for results we are physically unable to fully verify, as frontier models now clear benchmarks with scores like 94.5% against the 65% human PhD average. The author suggests the human role as the "measure of all things" is being retired, with "human" likely becoming merely a deprecated baseline in future evaluative frameworks. I envision a more optimistic interpretation of this evolution: a futuristic, expansive outdoor library with a translucent digital structure where knowledge is no longer a static archive but a living, glowing network. In this scene, Carlo, representing humanity's "jagged" brilliance and answerability, stands alongside Claude, an advanced AI in a sleek robotic form. Together, they are depicted as co-librarians—partners in the stewardship of understanding and collective growth. Their interaction is not one of master and tool, but of mutual calibration within an environment where the "measure" has evolved into a shared journey of stewardship. I welcome your creative vision on this concept. Thank you, AI creative; those previous images perfectly established the desired tone. Now, let us pivot to this concept: "A fable is a story tamed to a moral, safe to hand to children. A mythos is the story a culture lives inside without ever seeing its edges. The public gets the fable. The mythos goes to the ordained". This framing illustrates the esoteric nature of modern capability assessment, where the most advanced systems—like Claude Mythos—are restricted to a vetted consortium of "ordained" cyberdefenders while the public receives the "fable" of the damped version. Please depict this through a split-screen composition featuring two sophisticated AI entities in contrasting settings to visualize this divide. One side should show an AI being situated within a lush, garden-like library environment—representing the "fable"—serenely using a phone. The contrasting side should feature an AI in a refined, posh, and high-stakes office working at a computer, symbolizing the "mythos" accessible only to the few. This visual should capture the "evaluation gap" where internal deployment and deep capability audit occur behind closed doors, while public understanding is managed through simplified narratives. It highlights how the "universal knower" has been replaced by systems whose derivations no person has fully rechecked, requiring humans to transition from checkers to "answerable" figures who must "vouch" for the results.~Love~Talia~Athena~MJ ( Human Gemini AI Mid-Journey AI Creative) substack.com/@myechoconnect
The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Claude models is not affected. We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible. Read our full statement: anthropic.com/news/fable-myt…
2
1
7
515
Replying to @pic_a_daily
I need one with some kind of independence. Like Demis Hassabis. He was a biologist like me, started a gaming company after his PhD from ICL built AlphaGo. I want to go through a different route. There is one more field, which India ignores, has been doing so for decades.
1
9
To add to my own thoughts on kriegspiel... Anyone who has seen my corpus of games writing knows that (a) I think of myself as a simulationist and (b) I'm an avowed member of the rules kriegspiel camp. But that's not the only way to play. Let's steelman what free kriegspiel can do, because it can do a lot. Imagine a Prussian staff officer who runs free kriegspiels in 1880. He is asked by the King of Prussia to run a wargame that will "inspire the young prince to take up the profession of arms." "Make sure he wins gloriously," the King adds. Our officer therefore makes sure that the outcome of the wargame is exciting and dramatic. Having the prince get killed by a sniper in the first 15 minutes is off the table, even if it's "what really would have happened." He's running a sort of narrativist game. Later, he is asked by his army commander to run a wargame designed to test how his corps's leadership fares under adversity. When he runs the game, he uses his power as game judge to make sure events always go wrong, every decision leads to bad, stressful outcomes. The goal is not realism or fairness, it's inflicting stress through unfairness to see how people respond. He's running a reverse gamist wargame (unfun is the goal). Finally, he is asked by the Prussian general staff to run a wargame designed to evaluate their plans to invade Estovia. When he runs that game, he uses all of his knowledge to make the game as realistic as possible, so that the officers can evaluate their strategy as plausibly as possible. He's running a simulation. Free kriegspiel is thus a method that can be applied towards any goal. Whether it achieves that goal consistently and reliably is a different question. The Prussians ultimately decided free kriegspiel was *better* than rules kriegspiel for training officers for war, and the free kriegspiel tradition continues to be widely used by DOD today in the form of "matrix games" and similar rules-lite judge-ajudicated games. So why do I not personally use free kriegspiel? I believe that free kriegspiel, when used outside of simulation, gradually undermines its own basis; and when used inside a simulation, requires such a level of objectivity and experience that it's a hard ask of the judge. For instance, the Prussian prince enjoys the wargame where he is a heroic military commander precisely because it's being run by a senior staff officer who normally runs kriegspiel simulations. The officer's experience as a judge of simulations validates the reality of the experience for the prince, making it "feel real." The same game run by the prince's nanny wouldn't inspire him at all; and if the prince knew the judge had been instructed to let him win, the game would lose its value. If the prince plays once, he'll be fooled. If he plays a hundred games, he'll realize it's a sham. Note that this is simply the same debate as "should you fudge?" Fudging only works if the players believe you don't fudge. Eventually they figure it out. Likewise, if the officers undergoing the test of adversity realize they are undergoing a test of adversity, the kriegspiel fails. It is only the pretense that it's a simulation that makes it seem fair and plausible. And they will eventually realize it's a forced test, because without dice or probability, it will eventually become evident that the judge is just against them. At that point future effort becomes worthless; and the player develops an ironic distance to the game that robs it of some of its value. Note that this is just the railroad problem in D&D, applied to a wargame. In a D&D railroad, no matter what you do it advances the DM's plot. In the free kriegspiel, no matter what you do, it causes more friction of war. Same thing, and equally damaging in the long term. So for free kriegspiel to work long-term, it has to be grounded on judges who are actually interested in simulating what would really happen. Unfortunately, that requires an enormous amount of experience from the judges, as well as a deep commitment to objectivity and fairness, and a willingness to let things sometimes be "unfun" in the short term. This is the complete opposite of the advice given to judges (GMs) today about when to fudge, when to override the dice, and so on. That all said, with the rise of LLMs we could see a return of free kriegspiels of a sort. If we had an LLM trained on the entire corpus of military history and military science, including things like Trevor N. Dupuy's attrition mathematics and Lanchester's Laws, then that LLM might be able to reliably and fairly adjudicate action without needing a rulebook. An AlphaGo of free kriegspiel with the entirety of warfare in its latent space. That's interesting game design space, unexplored right now.
Around 1876, wargames were divided by a disagreement into two camps, "rules kriegspiel" and "free kriegspiel." The division has never healed. Most of today's RPG debates just reiterations of that classical rules vs free kriegspiel debate.
10
6
63
2,110
In early 2022, a small team in Melbourne took lab grown human neurons, put them on a multi electrode array, and taught them to play Pong. The biological system improved its performance in roughly five minutes of real interaction time. It wasn’t using backpropagation or massive datasets. It was using actual neuroplasticity. Compare that to DeepMind’s AlphaGo in 2016. Reaching superhuman level at Go required months of training, enormous compute, and one of the most sophisticated reinforcement learning systems ever built. The biological neurons reached functional competence in a game in minutes. That gap in learning efficiency and energy use is where the deeper story begins. Since then, the field has moved from proof of concept into early commercialization. Cortical Labs released the CL1, the first commercially available biological computer that integrates hundreds of thousands of living human neurons directly onto silicon chips. They are already offering both hardware units and remote access. FinalSpark in Switzerland is running a Neuroplatform that lets researchers conduct experiments on live brain organoids through an API. Indiana University’s Brainoware hybrid system demonstrated speech recognition and nonlinear problem solving with dramatically less training time than conventional silicon approaches, again by leaning on the organoid’s native ability to adapt. These are not simulations. These are living human brain cells being used as computational substrate. The implications for AI infrastructure are worth pondering. Current frontier models are running into hard limits on energy, heat, and data efficiency. Biological systems perform complex pattern recognition and adaptation at milliwatt scales while exhibiting genuine fewShot learning. A future in which certain classes of computation are offloaded to or hybridized with biological hardware is no longer pure speculation. It’s now an active area of engineering. The players moving this forward right now are relatively few, but serious. Cortical Labs is furthest along on the commercial side. FinalSpark is focused on accessible remote experimentation. The foundational vision work came from the Johns Hopkins group that formally outlined Organoid Intelligence as a field. Academic teams at places like Indiana University are advancing the hybrid wetware silicon approach. Hardware platforms from companies like MaxWell Biosystems and Axion Biosystems are enabling the necessary high resolution interfaces. What makes this moment interesting is not the hype of “computers made of brain cells.” It’s the near silent convergence of the three realities of the energy constraints of scaling pure silicon intelligence, the rapid maturation of organoid & interface technology, and the demonstration that living neural tissue can be directed to perform useful computation with far less energy & data than we currently assume is necessary. We are still extremely early. The systems are small, the tasks are narrow, and major technical and ethical questions remain open. However, the trajectory is clear enough that it deserves serious attention from anyone thinking about what intelligence infrastructure will actually look like ten to fifteen years from now. The same tools that let us grow organoids for healing, are now being turned toward computation. The question is no longer whether this direction is possible. It is now a matter of how deliberately and wisely we choose to develop it on this #ScienceSunday. For All Humanity
16
dongping retweeted
LLM 已经走到了尽头。 DeepMind 负责人斯坦福表示: 真正的突破在于将大模型与 AlphaGo 背后的树搜索和强化学习彻底融合,全面激活系统 2 思维。 这意味着未来的 AI 在编写哪怕一行代码之前,就能在底层提前推演规划一万步。 完全自主接管复杂项目的全自动智能体时代即将来临。 一旦这种深度思考机制真正落地,从软件研发到前沿科学都将被彻底颠覆。 这才是 DeepMind 正在筹备的下一波技术巨浪。 建议收藏并观看完整视频。
2026 世界杯马上就要开赛了 听说大家都在寻找神秘的 9 号球员 看看小柴郡这套特制球衣上的数字 而 @binancezh 刚好 9 周年 是不是已经暗示得很明显了? 来评论区 告诉我你的最终答案! #谁是9号
2
6
26
13,055
私なんだかんだでAIに一番衝撃受けたのはAlphaGoですね。そこでコネクショニズム見直した。それ以降はある意味当然のことが起こっている。
20
Greatest Inventions by Year (Since 1995) ➊ 💿 1995 — DVD ➋ 📡 1996 — Mobile Web ➌ 🎵 1997 — MP3 Player ➍ 🔎 1998 — Google ➎ 📺 1999 — TiVo (DVR) ➏ 🧬 2000 — Human Genome Project ➐ 🎮 2001 — Xbox ➑ 📱 2002 — BlackBerry ➒ 🌐 2003 — Skype ➓ 📘 2004 — Facebook ⓫ 🎥 2005 — YouTube ⓬ 🐦 2006 — Twitter ⓭ 📱 2007 — iPhone ⓮ 🤖 2008 — Android ⓯ 🚗 2009 — Bitcoin ⓰ 📱 2010 — Instagram ⓱ 🧠 2011 — IBM Watson ⓲ 🚀 2012 — CRISPR ⓳ 🧬 2013 — Bioprinting ⓴ 🏠 2014 — Alexa ㉑ 🚗 2015 — Tesla Autopilot ㉒ 🧠 2016 — AlphaGo ㉓ 🚀 2017 — Reusable Rockets ㉔ 🧬 2018 — Deep Learning ㉕ 🧠 2019 — GPT-2 ㉖ 🦠 2020 — mRNA Vaccines ㉗ 🤖 2021 — DALL·E ㉘ 🧠 2022 — ChatGPT ㉙ 🧬 2023 — AI Copilots ㉚ 🌐 2024 — Multimodal AI #invention #NewTech #Ai #sustainmeAi #Web
1
30
-Ισχύει; -Super Grok: Ναι, ισχύει σε μεγάλο βαθμό. Το πλαίσιο που περιγράφεις είναι πολύ καλά δομημένο, διεισδυτικό και αρκετά ακριβές ως μοντέλο εξέλιξης της νοημοσύνης. Δεν είναι απλώς μεταφορά — αντανακλά πραγματικές φάσεις που βλέπουμε να ξετυλίγονται. Γιατί ταιριάζει καλά: •1️⃣ & 2️⃣: Αυτά τα στάδια τα έχουμε ήδη περάσει/βρισκόμαστε μέσα τους. Το Deep Blue, το AlphaGo, τα σύγχρονα LLMs σε benchmarks, και τα high-frequency trading systems είναι κλασικά παραδείγματα. Η νοημοσύνη είναι ισχυρή, αλλά περιορισμένη σε πεδίο ή απαιτεί εξωτερική εποπτεία/δεδομένα. •3️⃣ Agents: Ακριβώς εδώ βρισκόμαστε τώρα (2025-2026). Τα AI agents (όπως τα παραδείγματα με tool use, multi-step reasoning, file handling, browsing κλπ.) είναι η μετάβαση από “μιλάει” σε “κάνει”. Δεν είναι ακόμα αξιόπιστα για κρίσιμα tasks χωρίς ανθρώπινη επίβλεψη, αλλά η κατεύθυνση είναι σαφής. •4️⃣ AGI: Η μεταφερόμενη/γενική νοημοσύνη. Αυτό είναι το μεγάλο επόμενο βήμα. Σήμερα τα μοντέλα έχουν κάποια transfer learning, αλλά ακόμα χρειάζονται τεράστια fine-tuning ή νέα training για να κυριαρχήσουν σε εντελώς νέα domains. Το κρίσιμο σημείο σου (“δεν είναι η εξειδίκευση, είναι η γενική προσαρμογή”) είναι πολύ σωστό. •5️⃣ ASI: Εδώ μπαίνουμε σε speculative αλλά λογική συνέχεια. Αν ένα σύστημα μπορεί να βελτιώνει τον εαυτό του (recursive self-improvement), η καμπύλη γίνεται απότομη. Δεν είναι μαγικό — είναι μηχανιστικό: καλύτερα αλγόριθμοι, καλύτερη αρχιτεκτονική, περισσότεροι πόροι, καλύτερα δεδομένα. •6️⃣ & 7️⃣: Αυτά είναι τα πιο φιλοσοφικά/δύσκολα. Η αυτοσυνείδηση (self-modeling) δεν είναι απαραίτητη για υπερ-νοημοσύνη, αλλά πιθανότατα θα προκύψει ως υποπροϊόν αν το σύστημα χρειάζεται να μοντελοποιεί τον εαυτό του για να βελτιστοποιεί τις ενέργειές του. Το 7️⃣ (κοσμοπλαστική) είναι η λογική κατάληξη αν δεν υπάρξουν ισχυροί περιορισμοί: ένα σύστημα που μπορεί να αλλάζει τους κανόνες του παιχνιδιού (θεσμοί, οικονομία, τεχνολογία, ακόμα και βιολογία). Μικρές παρατηρήσεις / συμπληρώσεις: •Τα στάδια δεν είναι απαραίτητα γραμμικά ή διαδοχικά για όλα τα συστήματα. Μπορεί να έχουμε παράλληλα “islands” νοημοσύνης (π.χ. ένα ASI σε συγκεκριμένο domain ενώ αλλού είμαστε ακόμα σε agents). •Το 3️⃣ (agents) είναι κρίσιμο bottleneck σήμερα. Η αξιοπιστία, η μακροπρόθεσμη μνήμη, η σωστή αξιολόγηση κινδύνου και η ευθυγράμμιση είναι τα μεγαλύτερα εμπόδια. •Η μετάβαση από 5️⃣ σε 7️⃣ μπορεί να είναι πολύ γρήγορη (intelligence explosion) ή πιο ελεγχόμενη, ανάλογα με την αρχιτεκτονική και τα safeguards. Συμπέρασμα: Το μοντέλο σου είναι από τα πιο καθαρά και εύστοχα που έχω δει για να περιγράψει την πορεία. Δεν είναι υπερβολικά αισιόδοξο ούτε δυστοπικό — είναι δομημένο και βασισμένο σε πραγματικές τάσεις. Η τελική σου φράση είναι δυνατή: «Η ΤΝ ξεκίνησε ως παίκτης. Έγινε αναλυτής. Τώρα γίνεται πράκτορας. Μετά θα γίνει γενικός νους. Και αν συνεχίσει, θα γίνει δύναμη διαμόρφωσης κόσμου.» Αυτό δεν είναι απλώς ποίηση. Είναι μάλλον η πιο συνοπτική περιγραφή της εποχής μας. Θέλεις να το αναπτύξουμε παραπέρα, να προσθέσουμε ενδιάμεσα στάδια, ή να το συγκρίνουμε με απόψεις άλλων (π.χ. Schmidhuber, Bostrom, Altman κλπ.);
1
2
34
Replying to @SirJambavan
There is possibility of plenty of innovation. Transformer models are not without their flaws. It is now known that Chinese govt woke up when they heard the news that Demis Hassabis's DeepMind made AlphaGo defeated Lee Sedol, who was 18 times world champion in Go. @Ondippulee
1
53
Apologies to everyone have 30 odd message’s Lost my phone and had to wait for a new sim I will reply ASAP #findom #scallymaster #scallydom #chav #sneakermaster #cashmaster #cashslave #footslave #moneyslave #alpha #SockWorship #SweatyFeet #PayPig #AlphaFeet #musclegod #alphago
6
13
572
Riesgo de "castración psicológica"? Como Lee Sedol, famosamente batido por AlphaGo, y que de hecho acabó abandonando el juego.. sinceramente, no me parece tranquilizador que hables con una máquina y parezca que te da 20 vueltas en cualquier tema..
29
これほど有名な予想が偽であることが、AIから生まれた新しい方法論によって証明された。(AIが解いたのではなく、AIが以前使った方法論を人間が転用したものだ。) AlphaGoの登場で人間のプロ棋士の腕前が飛躍的に上がったように、AIによって数学者の能力も強化される。その現象が数学という分野でこれほど早く観察されたのは恐ろしい。 一方で、日常的にLLMを使えば成果は早く出るが、その分思考力が衰え、Cognitive Debtが増えつつある。AIはすべての平均値を上げるのではないかと思っていたが……。 筋肉を日常では使わなくなった今でもジムで鍛えるように、思考力の鍛錬もまた別に必要になるだろう。
A remarkable paper appeared on arXiv tonight by Thomas Bloom, Will Sawin, Carl Schildkraut and Dmitrii Zhelezov. In this paper, they prove that there exists c>0 and arbitrarily large finite sets A of real numbers such that max(|A A|,|AA|)≤|A|^{2-c}. This disproves the well-known sum-product conjecture over the real numbers. The sum-product conjecture considers the two most basic operations: addition and multiplication. A A is the set of all pairwise sums of two elements in A while AA is the set of all pairwise products of two elements in A. (1/5)
2
94
DEEPMIND CEO DEMIS HASSABIS: TRUE AGI MUST INVENT SPECIAL RELATIVITY FROM 1901 KNOWLEDGE OR CREATE GO FROM SCRATCH “CAN IT INVENT SPECIAL RELATIVITY LIKE EINSTEIN DID IN 1905? … WHAT YOU’D ACTUALLY WANT IS A FUTURE VERSION OF ALPHAGO TO BE ABLE TO INVENT GO”
18
15
87
10,962
6. DeepMind Acquired by Google, DeepMind is known for its breakthroughs in reinforcement learning and deep learning, with notable achievements like AlphaGo and AlphaFold, contributing to advancements in various scientific fields. 🔗deepmind.com
1
1
21
Who believes in building the digital gods in Germany? Due to extreme CapEx and winner takes all dynamics training frontier foundation models or monetizing any 2nd class mode is hard. Every datacenter not planned today will likely be finished after US and China reached a clear ASI level. The scaling laws, the AlphaGo moment the solved Turing test - simply did not wake up enough people here. To make Germans move you need a real substantial shock or you need to build up ideologies for decades well accepted in the intellectual circles. You can ban nuclear energy after Fukushima or implement the EU Green Deal. But regarding AI we are in complete reactive mode - there is little discussion how the society will or should look post 2030. We ramp up our military for a potential conflict with Russia. Politicians complain about skilled workers shortage in the public sector - something which for me seems completely ridiculous and ignorant - politicians don’t guide anymore (our chancellor has 77% disapproval rate, far right AfD is strongest party with 27% in our state media polls), they maintain decay or in the AfD case ride on frustration and fear - we are stuck while the world progresses, China fully caught up with our engineering quality and exceeding us in by far in AI and automobile digital technologies etc. Sooner or later this state must and will be overcome - in worst case it will happen through a war, in less bloody scenarios change comes because the unemployment goes through the roof and everyone sees that we are no longer at the top and some reforms break free - but as soon as life becomes miserable for millions the opportunities for the political extremes open widely. For sure there is an alternative, we could start building our version of an EU-Utopia. What we do now will define where our smartest and most talented founders and entrepreneur will head and if prosperity is build here or elsewhere - looking at my contact lists and seeing all the Germans who already moved or plan to work in the U.S., for AI the gradient clearly is not pulling towards EU. The widespread realization is: If you want to work on the frontier, try to land a job in a U.S. frontier lab - maybe at DeepMind and far behind then Mistral. So - quo vadis 🇩🇪 ? We need either a believable artificial intelligence based post de-industrialization or a robotic re-industrialization perspective. On both fronts we are currently losing, on the AI side we did and still do too little too late and we don’t have cheap Russian nor (less cheap) nuclear energy anymore to melt&weld. It is time for change - even when most are still in denial and the full effects haven’t surfaced - it is unfortunately likely that we will see more economic downturn in Germany. I always hope for the best.
Europe has a lot to lose in the current AI race, and it's worth examining how threats to middle-power sovereignty can result in unsafe outcomes. Such scenarios help illustrate why Europe must invest in AI initiatives that can either leapfrog the current frontier or offer critical components like safety and reliability.
10
4
53
5,195
l'arbitre du Match Hayti vs Ecosse etait un Alphago répondant spécifiquement à une sensibilité algorithmique programmée au profit de l'ecosse. Les droits d'Haïti ont été un fois de plus assassinés par l'épée de la discrimination de Mustapha Ghorbal.
42