Filter
Exclude
Time range
-
Near
Dr. Bijoy Chand Chatterjee, Associate Professor in the Department of Computer Science and Engineering at South Asian University, New Delhi, has been awarded the prestigious Fulbright-Nehru Academic and Professional Excellence Fellowship. Under this fellowship, he will conduct research at @UofCalifornia, Davis, USA. His research, titled “Machine Learning-Driven Resource Optimization for Next-Generation Optical Networks Across Large Geographies: USA and India,” focuses on developing intelligent, scalable optical networking frameworks using machine learning. The work aims to enhance resource allocation, improve network efficiency, and support future communication infrastructures across large-scale deployments. Dr. Chatterjee’s achievement highlights SAU’s growing research excellence and its contributions to cutting-edge advancements in communication systems. #SouthAsianUniversity #FulbrightNehru #MachineLearningResearch #OpticalNetworks #AIInnovation #NextGenNetworks #AcademicExcellence #TechResearchIndia #GlobalResearchCollaboration
4
212
🔥 Meta just released a hard-hitting reality check on scaling LLM training and it’s not the story we’ve been telling ourselves. If you’ve been assuming that “just add more GPUs” is the golden path to faster, cheaper training… this new study from FAIR turns that idea upside down. In this paper, Meta dissects what really happens when you scale LLM training across thousands of accelerators and the findings are surprisingly counter-intuitive: 💡 Key Insights: - Diminishing returns kick in fast. Beyond a certain scale (≈128 H100s), training becomes communication-bound, not compute-bound — meaning GPUs sit idle waiting for parameters to sync. - FSDP isn’t magic at massive scale. Fully Sharded Data Parallelism introduces heavy AllGather / ReduceScatter operations that scale poorly, causing performance slowdowns even as hardware grows. - Model parallelism comes back into the spotlight. Contrary to old assumptions, adding tensor or pipeline parallelism can improve throughput under FSDP by reducing communication groups. - More power consumed, fewer tokens processed. Power draw scales linearly, but throughput doesn’t — meaning energy efficiency drops as the cluster gets bigger. - Hardware progress isn’t solving the bottleneck. H100s offer 3× compute over A100s… but NVLink and interconnect bandwidth haven’t kept up. So communication overhead only gets worse. - Larger models = proportionally larger communication tax. Scaling from 7B → 70B expands compute and communication, shrinking hardware utilization even further. To summarize, this paper is a complete guide on: • Why communication, not compute, is now the real bottleneck • How model parallelism can counteract FSDP overhead • Why training efficiency collapses at massive scale • What future hardware software need to fix • Practical takeaways for anyone building LLM training stacks This study is an important reminder: Scaling isn’t just about FLOPs, it’s about balancing compute, memory, networking, and communication efficiency. If we don’t rethink parallelism strategies now, bigger clusters will only give us smaller returns. #MetaAI #LLMTraining #FSDP #ParallelComputing #AIInfrastructure #DeepLearnin #MachineLearningResearch #AIEngineering #ScalingLLMs #TechInsights
1
1
1
858
Planning with AI usually means brute-force trial and error. But what if models could reason about the world the way humans do? This new paper from Meta FAIR introduces the Vision Language World Model (VLWM) — a foundation model trained on natural videos that predicts future states and actions in language space instead of raw pixels. 👉 Why this matters: VLWM doesn’t just react (System-1), it can also reflect and optimize plans (System-2). For example, given a goal like “cook tomato and eggs”, the model generates step-by-step plans — “Preheat skillet → Whisk eggs → Season → Mix with tomatoes” while internally reasoning about how each action changes the world state. That duality (fast reactive reflective reasoning) helps it set new benchmarks in planning tasks and even beat larger multimodal LLMs in human preference evaluations. #ArtificialIntelligence #AIPlanning #WorldModels #VisionLanguage #ReinforcementLearning #MetaAI #GenerativeAI #MultimodalAI #FutureOfAI #MachineLearningResearch
3
1
7
1,462
🤔 Can agents really learn from mistakes or just repeat patterns? Humans fail → reflect → improve. Agents? Mostly stuck. Today’s methods: ⚙️ Pretraining 🎯 RLHF 🧩 Reward models But all offline. No real-time self-correction. New research (Reflexion, STeP, Agent-R) is changing that. 🚀 👉 Should we prevent mistakes, or build AI that learns from them? 📅 Want to learn more about Agentic AI? Register now for the upcoming Agentic AI Bootcamp happening on 30th Sept & 9th Oct. Don’t miss your chance to build, test, and evaluate intelligent agents! hubs.la/Q03H4kcs0 #AgenticAI #AIresearch #AgenticAIResearch #ReinforcementLearning #RLHF #MachineLearningResearch #AIReflection #AIMemory #AITrends #AI
3
1,237
24 Mar 2019
We are looking for several #DeepLearningResearchers capable of taking the lead on researching and delivering new wide-ranging methods in #MachineLearning . For more information: buff.ly/2YgQGuO #MachineLearningResearch #DeepLearning #AI #ArtificialIntelligence
8
7