🚨 What’s the best way to select data for fine-tuning LLMs?
📢 Introducing ZIP-FIT, a compression-based data selection framework that outperforms leading baselines: it converges up to 85% faster in cross-entropy loss and selects data up to 65% faster.
🧵1/8