Can we just... give the AI a calculator?
Like: "I put the two numbers into a calculator and it says the answer is this. I then thought about it real hard, my AI eyes rolled back into my AI head, and I magic-ed up a number based on language vibes, and here's that number too."
Is OpenAI's o1 a good calculator? We tested it on up to 20-digit by 20-digit (20x20) multiplication: o1 solves up to 9x9 multiplication with decent accuracy, while gpt-4o struggles beyond 4x4. For context, this task is solvable by a small LM using implicit CoT with stepwise internalization. 1/4
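
For what it's worth, the "give it a calculator" route is cheap to sketch. Below is a minimal, hypothetical tool a model could call instead of guessing digits. The `calculator` function name and its string-in, integer-out interface are my own assumptions for illustration, not any vendor's actual tool API, and I'm reading "20x20" as two 20-digit operands. It leans on Python's arbitrary-precision integers, so the product is exact at that size.

```python
import ast
import operator

# Hypothetical tool: the name and interface are illustrative assumptions,
# not part of any real LLM tool-calling API.
_ALLOWED_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
}


def calculator(expression: str) -> int:
    """Evaluate an integer expression that uses only +, -, and *."""

    def _eval(node):
        # Plain integer literals only (bools and floats are rejected).
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return node.value
        # Binary operations restricted to the whitelist above.
        if isinstance(node, ast.BinOp) and type(node.op) in _ALLOWED_OPS:
            op = _ALLOWED_OPS[type(node.op)]
            return op(_eval(node.left), _eval(node.right))
        # Unary minus, so negative operands work.
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)


# Two 20-digit operands, assuming "20x20" means 20 digits by 20 digits.
a = 12345678901234567890
b = 98765432109876543210
result = calculator(f"{a} * {b}")
print(result == a * b)
```

Parsing with `ast` rather than calling `eval` is a deliberate choice here: the tool only ever sees model-generated text, so anything other than integer arithmetic gets rejected with a `ValueError`.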