Robert McMenemy 🏴󠁧󠁢󠁳󠁣󠁴󠁿👾

@mcmenemy_robert

3 Oct 2024

🚀 Excited to share the second part of my series on deep learning model compression, diving deeper into the optimization of GPT-2 and other large-scale models. Following up on my previous work with hex quantization, this time I explore even more powerful techniques: pruning, Singular Value Decomposition (SVD), Discrete Cosine Transform (DCT), and graph-based compression.

📉 In this article, I discuss how combining these methods not only yields an 80.7% reduction in model size but also preserves nearly all of the model’s original performance.

This hybrid approach is crucial for:
- Edge Device Deployment: Run large models on mobile, IoT, or embedded devices.
- Real-Time Applications: Speed up inference for applications like voice assistants or language translation.
- Cloud Cost Reduction: Lower compute and storage costs for AI-driven services.
- Low-Bandwidth Applications: Deploy models in remote or bandwidth-limited environments.

🔗 Check out the full blog post here: rabmcmenemy.medium.com/advan…

#DeepLearning #AI #MachineLearning #ModelCompression #EdgeAI #AIOptimization #SVD #DCT #Pruning #GraphCompression #HexQuantization
