Our picks for October’s Papers of the Month are here. Out of 49 shortlisted papers, we spotlight 4 that stand out for their clever ideas on making #LLMs faster, smarter, and more efficient!
📊 First up, Grouped Lattice Vector Quantisation introduces a novel technique for fine-grained post-training quantisation of LLMs, retaining good performance even at low bit widths.
🌫️ In Planned Diffusion, @danielmisrael and colleagues combine autoregressive and diffusion models. While the autoregressive model creates a scaffold and plan, the diffusion model fills the gaps, achieving extremely low-latency text generation.
🤔 Is your LLM overthinking it? Rethinking Thinking addresses the problem of lengthy reasoning chains by bounding the model's thinking space and gradually distilling its thoughts, speeding up reasoning without losing depth.
🕸️ Finally, When Structure Doesn’t Help compares techniques for how LLMs read text-attributed graphs. The results are rather surprising: sometimes, too much structure can hurt.
Check out our summaries 👇