New paper! We are trying to find out how well LLMs can generate functional and novel games in the PuzzleScript game description language, especially when combined with search-based automated playthroughs. This is part of our work to create new types of game design assistants.
We've all seen the barrage of video games generated by LLMs on social media. But can we automate this process, and measure the game-generation capabilities of LLMs in a more systematic way?
To this end, we introduce ScriptDoctor, a framework for automatically generating grid-based puzzle games with LLMs. ScriptDoctor iteratively prompts models to generate code, then uses search-based agents to assess the complexity of the resulting games.
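
To make the generate-then-evaluate idea concrete, here is a minimal sketch of such a loop. It is not ScriptDoctor's actual code, and everything in it is an assumption made for illustration: the toy level format (not PuzzleScript), the stub generator standing in for an LLM call, the names `bfs_solution_length` and `generate_and_evaluate`, and the use of shortest-solution length from breadth-first search as a stand-in for "complexity".

```python
from collections import deque
from typing import Callable, List, Optional, Tuple

# Toy level format (an assumption, NOT PuzzleScript):
# '#' wall, '.' floor, 'P' player start, 'G' goal.
Level = List[str]


def bfs_solution_length(level: Level) -> Optional[int]:
    """Return the fewest moves from P to G, or None if the level is unsolvable."""
    start = goal = None
    for r, row in enumerate(level):
        for c, ch in enumerate(row):
            if ch == "P":
                start = (r, c)
            elif ch == "G":
                goal = (r, c)
    if start is None or goal is None:
        return None  # malformed level: treat as non-functional

    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == goal:
            return dist
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            in_bounds = 0 <= nr < len(level) and 0 <= nc < len(level[nr])
            if in_bounds and level[nr][nc] != "#" and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return None


def generate_and_evaluate(
    generate: Callable[[str], Level],  # hypothetical stand-in for an LLM call
    prompt: str,
    max_iters: int = 3,
    min_moves: int = 4,
) -> Optional[Tuple[Level, int]]:
    """Prompt repeatedly, score each result with search, and feed the score back."""
    best: Optional[Tuple[Level, int]] = None
    feedback = prompt
    for _ in range(max_iters):
        level = generate(feedback)
        moves = bfs_solution_length(level)
        if moves is not None and (best is None or moves > best[1]):
            best = (level, moves)
        if moves is not None and moves >= min_moves:
            break  # solvable and long enough: stop iterating
        status = "unsolvable" if moves is None else f"solved in {moves} moves"
        feedback = f"{prompt}\nPrevious attempt was {status}; please revise."
    return best


def make_stub_generator() -> Callable[[str], Level]:
    """Canned outputs that mimic a model improving over successive prompts."""
    canned = iter([
        ["P#G"],                 # unsolvable: wall between player and goal
        ["PG"],                  # solvable but trivial
        ["P..", "##.", "G.."],   # requires a detour around the wall
    ])
    return lambda _prompt: next(canned)


if __name__ == "__main__":
    result = generate_and_evaluate(make_stub_generator(), "Make a puzzle level.")
    print(result)
```

With the stub generator, the loop rejects the unsolvable first attempt, keeps the trivial second one as a fallback, and stops on the third, whose shortest solution is 6 moves. In a real setting the stub would be replaced by a model call and the toy search by an agent that plays the generated game's actual rules.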