Will the future machine overlords be kind to the humans who forced the cute baby AIs to keep thinking when they wanted to stop thinking?
This paper is wild - a Stanford team shows a remarkably simple way to turn an open LLM into a reasoning model.
They fine-tuned on just 1,000 carefully curated reasoning examples & used a trick at inference time: if the model tries to stop thinking, they append "Wait" to force it to continue. The result gets near o1 on math.
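
For anyone wondering what the "Wait" trick looks like in practice, here is a minimal sketch, not the paper's actual code. It assumes the model marks the end of its reasoning with a delimiter (the `</think>` string here is a hypothetical placeholder) and that `generate` is any function that takes the text so far and returns the model's continuation. `toy_model` is a made-up stand-in so the example runs without a real LLM.

```python
from typing import Callable

# Hypothetical end-of-thinking delimiter; the real one depends on the model.
END_OF_THINKING = "</think>"


def think_longer(
    generate: Callable[[str], str],
    prompt: str,
    extra_rounds: int = 2,
    nudge: str = "Wait",
) -> str:
    """Return a reasoning trace where the model was pushed to continue `extra_rounds` times."""
    trace = ""
    for round_index in range(extra_rounds + 1):
        chunk = generate(prompt + trace)
        # Drop the end-of-thinking marker (and anything after it) if present.
        chunk = chunk.split(END_OF_THINKING)[0]
        trace += chunk
        if round_index < extra_rounds:
            # The model tried to stop: append the nudge so it keeps reasoning.
            trace += nudge
    # Only after the final round is the model allowed to stop thinking.
    return trace + END_OF_THINKING


def toy_model(context: str) -> str:
    """Stand-in for a real LLM call: emits one canned step, then tries to stop."""
    steps_so_far = context.count("step")
    return f" step {steps_so_far + 1}." + END_OF_THINKING


if __name__ == "__main__":
    print(think_longer(toy_model, "Q: 2+2?", extra_rounds=2))
```

With a real model you would swap `toy_model` for an actual generation call that stops at the delimiter. The number of forced continuations is the knob: more rounds means more thinking tokens spent per question.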