Since the Humanity's Last Hackathon from @huggingface didn’t happen, I set up my own mini version using Kernelbot and Popcorn from @gpu_mode.
> The goal was to test how well LLMs can generate code for difficult tasks, like writing faster kernels for Apple’s MPS with @PyTorch.
> My strategy was to let the LLM submit a kernel, get feedback from the benchmark, and then iterate based on the lessons learned.
> The hardest part was not the code generation itself, but coordinating all the systems. Kernelbot, Popcorn, submissions, feedback, orchestration...
> The benchmark eats almost all my RAM, so running many submissions in parallel is hard. My machine starts crashing if I push it too much.
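For anyone who wants to try something similar, here is a minimal sketch of the submit, feedback, iterate loop with a cap on parallel runs. It is not the actual setup: `generate_kernel` and `run_benchmark` are hypothetical stand-ins for the LLM call and the Kernelbot/Popcorn submission, and `MAX_PARALLEL` is an assumed limit chosen to keep memory use down.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

# Assumption: keep this small because each benchmark run is memory-hungry.
MAX_PARALLEL = 2


@dataclass
class Attempt:
    code: str
    runtime_ms: float
    feedback: str


def generate_kernel(task, lessons):
    """Hypothetical stand-in for the LLM call; a real one would prompt with the lessons."""
    return f"# kernel for {task}, revision {len(lessons)}"


def run_benchmark(code):
    """Hypothetical stand-in for submitting and benchmarking a kernel.

    It pretends every revision is faster than the previous one.
    """
    revision = int(code.rsplit(" ", 1)[-1])
    runtime_ms = 100.0 / (revision + 1)
    return runtime_ms, f"revision {revision} ran in {runtime_ms:.1f} ms"


def iterate(task, rounds=3):
    """Submit, read the feedback, and feed it into the next round; keep the fastest attempt."""
    lessons = []
    best = None
    for _ in range(rounds):
        code = generate_kernel(task, lessons)
        runtime_ms, feedback = run_benchmark(code)
        lessons.append(feedback)
        attempt = Attempt(code, runtime_ms, feedback)
        if best is None or attempt.runtime_ms < best.runtime_ms:
            best = attempt
    return best


def run_all(tasks, rounds=3):
    """Run one loop per task, with at most MAX_PARALLEL loops active at a time."""
    with ThreadPoolExecutor(max_workers=MAX_PARALLEL) as pool:
        results = pool.map(lambda task: iterate(task, rounds), tasks)
        return dict(zip(tasks, results))
```

Calling `run_all(["matmul", "softmax"])` returns the best `Attempt` per task. With real benchmarks, lowering `MAX_PARALLEL` to 1 is the simplest way to trade throughput for stability.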
Overall, I need more time to tune the prompts, experiment with better feedback loops, and maybe try some RL-style iteration. There are still lots of techniques worth exploring here.
In the video:
Left: task orchestrator
Right: live dashboard tracking submissions, code, and lessons learned