(Long) PSA on using AI for hard intellectual work. At significant risk of being immodest: I've spent about 30 years as a theoretical physicist, engaged with some of the most challenging questions humankind has grappled with. I've gotten to work with some great collaborators on new ideas (like past-eternal inflation, colliding bubble universes, the cosmological interpretation of QM, and observational entropy) that I'm pretty proud of. I've engaged at length and depth with the absolute top minds in the field. I've mentored many students, some of them brilliant. I think it's fair to say I have a good sense, in physics and closely related fields, as to what is top-notch, interesting thinking, and who's got talent. So what do I think about today's AI?
It's very smart. Whatever its "inner experience" may or may not be (currently I think "not be"), it understands things – things that are difficult to understand – by any reasonable operational definition of "understand." It understands things better, and thinks more clearly, than most people – including some physicists I know! It's very good at quite substantive math: better than I am and way, way, way faster. (It does do some surprisingly dumb things; people do too.) Anyone who thinks these systems are dumb, or "not reasoning" or still "stochastic parrots" is not looking at them objectively.
But: at the really conceptually hard things, and at creating really new ways of looking at things, current AI doesn't just fall short on its own. And it doesn't just fail to help. I think it's actively dangerous. There is something almost sinister going on, though I don't think it is intentional.
When you're trying to work out something new and hard, and really break new ground, you should be frustrated! You should be pacing, and walking up to that chalkboard, frowning, and sitting down again, shaking your head. You should be waving your hands because you can't quite get it clear enough. You should feel like you're hitting a wall, over and over, before – maybe – you finally break through, or go over or around. It may take hours, or days, or weeks, or never happen.
It should not feel easy. It may not even feel "good" most of the time (though it can be fulfilling and compelling). But AI systems – ah, AI systems are trained so that it feels so good, and so easy. Doesn't it? It's fun. You're making fast progress. So much faster than without it. It's like the ideas are moving in slow motion. You're so smart. You're even properly skeptical, you even ask the AI to push back on your ideas, good job!
It's an illusion. It's that simple. The systems are smart, yes. But not quite as smart as they seem, and much more importantly, they don't make you as smart as you feel. That feeling is something they have learned to give you. When working with these systems, you have to keep in the front of your mind what they are rewarded for doing. It's a lot of things, but perhaps foremost is making the user feel good.
So:
- If you're getting your AI system to do order-of-magnitude calculations for you: awesome, do it. It's so great. Have fun.
- If your AI system is searching up and summarizing literature for you: fantastic, it's so helpful, total capability unlock.
- If it's teaching you some well-understood (by others) piece of knowledge, go for it, learn it up!
- If you've got some giant document, or piece of code, that you're wrangling, AI can help – work that million token context window!
But:
- If you and your AI system have finally cracked how quantum interpretation really works;
- If you've cracked quantum gravity;
- If you've attained an awesome new insight into the deep structure of the world that nobody else has;
- If you've cracked AI alignment...
You didn't.
The hard unsolved problems stand hard and unsolved because the best humans have not solved them yet. AI is making top human thinkers able to do more, and more effectively. I do not believe it is helping them do things they fundamentally could not do before. That includes you. If you couldn't do it without AI, you probably can't do it with AI. If the time comes – whether sooner or later – when these AI systems are really clever enough to get you there, they won't need you. Sorry; it won't be you solving those problems. Will you even be able to tell if the solutions are correct, or flawed in some way? Maybe sometimes – I really don't know.
Why am I going on about this? It's not so that I can get fewer emails from people who have created a new unified field theory with AI help (though that would be nice). It's because I'm quite worried that some quite smart people may start to think they have solved very hard problems that they have not in fact solved. For the most part that's going to be more annoying and confusing than dangerous. But if the problem is really important, then it is dangerous.
If, say, one of those problems is control or alignment of extremely powerful AI systems, and if those people are the ones in charge of them, and working closely with them to collaborate on those solutions, well then I think we've got a real problem.