We develop perception, control, & planning algorithms for robot autonomy | @CMU_Robotics | instagram.com/airlabcmu | youtube.com/airlab

Joined June 2014
Proud to see Co-Me accepted to CVPR 2026 🎉 Now supporting MapAnything 1.1, Depth Anything 3, and Pi3 - and 2× faster than the original, up to 21.5× speedup on long VGGT sequences. Congrats to the team!
All your favorite 3D models — now faster with Co-Me. 🎉 Accepted to CVPR 2026, Co-Me now supports more 3D foundation models: MapAnything 1.1, Depth Anything 3, and Pi3. Same simple confidence-guided token merging idea — now accelerating even more 3D reasoning models. 👇
AirLab retweeted
I’m excited to share that I successfully defended my Ph.D. thesis on Specification-Driven Planning for Safe Autonomy! I’m deeply grateful to my committee Sebastian Scherer, Changliu Liu, Karen Leung and Eunsuk Kang for their time and guidance throughout this journey. More in 🧵
AirLab retweeted
Meet KinDER, a stress test for robot physical reasoning. All 13 methods failed 😈
🌎 25 environments
♾️ Infinite tasks
🏋️ Gymnasium API
⚒️ Over 20 parameterized skills
🪧 Human demonstrations
📊 13 baselines (planning and learning)
From @Princeton @CMU_Robotics @ICatGT @CambridgeMLG @nvidia @MIT_CSAIL 🧵 1/n
AirLab retweeted
I'm excited to share that RAVEN was accepted to ICRA 2026! Paper: arxiv.org/abs/2509.23563 Website: raven-semantic.github.io Collaboration with @OmarAlama, Dmytro Kurdydyk, John Keller, @Nik__V__ , Wenshan Wang, @ybisk , @smash0190 See you in Vienna!
We introduce RAVEN, a 3D open-set memory-based behavior tree framework for aerial outdoor semantic navigation. RAVEN not only navigates reliably toward detected targets, but also performs long-range semantic reasoning and LVLM-guided informed search
AirLab retweeted
#IROS2026 will convene in Pittsburgh from Sept 27 – Oct 1! As one of the largest & most dynamic robotics conferences, IROS brings together world-leading researchers, educators, govt. leaders, startups, industry innovators, practitioners, & investors👇 2026.ieee-iros.org/
Fast & light 2D and 3D zero-shot open-vocabulary semantic segmentation is here 🚀🪶! Meet RADSeg:
- 6-30% mIoU improvement while being 3.95x faster and using 2.5x fewer parameters.
- Outperforms combinations of huge vision models (850-1350M) with just 105M!
💡The key is building on the agglomerative model, RADIO, and improving spatial consistency.
AirLab retweeted
31 Dec 2025
We are honored to share that Super Odometry is now published in @ScienceRobotics and featured as a highlight article! 🚀 This work rethinks the SLAM paradigm: true resilience should not rely solely on external perception—it should begin from within. science.org/stoken/author-to… #SLAM
AirLab retweeted
Human peripheral vision reduces detail in out-of-focus areas. This “annoying” feature saves massive computation while preserving spatial cues. And for the most human-like artifact we build, the ROBOT, that efficiency matters. Check out our recent work:👉🔗 co-me-tokens.github.io
AirLab retweeted
🧵[3/n] Co-Me distills a tiny confidence predictor that identifies low-confidence regions before most layers even run, letting us merge those tokens and cut redundant compute. ✨That’s it — simple and effective.
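The merging step described in this post can be sketched in a few lines. This is only an illustration under stated assumptions, not the Co-Me implementation: the function name `merge_low_confidence_tokens`, the `threshold` and `group_size` parameters, and the choice to average consecutive low-confidence tokens are all hypothetical, and the confidence scores are assumed to come from some separate predictor.

```python
from typing import List, Sequence


def merge_low_confidence_tokens(
    tokens: Sequence[Sequence[float]],
    confidence: Sequence[float],
    threshold: float = 0.5,    # hypothetical cutoff, not from the paper
    group_size: int = 4,       # hypothetical number of tokens per merge
) -> List[List[float]]:
    """Keep confident tokens as-is; average the rest in small groups."""
    if len(tokens) != len(confidence):
        raise ValueError("tokens and confidence must have the same length")
    if group_size < 1:
        raise ValueError("group_size must be at least 1")

    kept: List[List[float]] = []
    low: List[List[float]] = []
    for token, score in zip(tokens, confidence):
        # Tokens at or above the threshold pass through unchanged.
        (kept if score >= threshold else low).append(list(token))

    merged: List[List[float]] = []
    for start in range(0, len(low), group_size):
        group = low[start:start + group_size]
        # Element-wise mean over the tokens in this group.
        merged.append([sum(column) / len(group) for column in zip(*group)])

    # Confident tokens first, then the merged low-confidence tokens.
    return kept + merged
```

For example, with eight tokens of which three clear the threshold and `group_size=2`, the five low-confidence tokens collapse into three, so later layers would see six tokens instead of eight.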
AirLab retweeted
🧵[2/n] We noticed the model burns most of its compute on uncertain regions that are later discarded by downstream tasks. Can we avoid wasting this computation?
AirLab retweeted
More and more visual-geometric transformers are coming out, like VGGT and MapAnything, but deploying them on real robots is still challenging. What if we could make them 10× faster? 👉🔗co-me-tokens.github.io ⚡Co-Me speeds up VGGT and MapAnything by up to 11.3x and 7.2x, respectively. How? 👇🧵
AirLab retweeted
Robots can plan, but rarely improvise. How do we move beyond pick-and-place to multi-object, improvisational manipulation without giving up completeness guarantees? We introduce Shortcut Learning for Abstract Planning (SLAP), a new method that uses reinforcement learning (RL) to discover shortcuts in the planning graphs induced by task and motion planning (TAMP) skill libraries. It is a plug-and-play module that can be trained on top of existing planners to speed up execution through learned shortcuts. (1/5)
AirLab retweeted
⛔️Stop throwing away far-range semantics; encode them as rays instead! 🔥Excited to present RayFronts at #IROS2025 in Hangzhou, China! 🎥Catch us in the live presentation next Tuesday, 16:45-16:50, Track 9.
Want to push the online 🌎 understanding & search capabilities of robots? Introducing RayFronts 🌟→ 💡 Semantics within & beyond depth sensing 🏃‍♂️ Online & real-time mapping 🔍 Querying with images & text ⚙️ Operating in any environment rayfronts.github.io The trick →🧵👇
AirLab retweeted
Last year, I came across the idea of constrained decoding (I know, late to the party) and was fascinated. The ability to enforce constraints for LLMs at inference time without fine tuning is a powerful idea. It got me thinking, can we do this for robot foundation models? 1/n🧵
AirLab retweeted
17 Sep 2025
Meet MapAnything: a transformer that directly regresses factored metric 3D scene geometry (from images, calibration, poses, or depth) in an end-to-end way. No pipelines, no extra stages. Just 3D geometry & cameras, straight from any type of input, delivering new state-of-the-art results 🚀
One universal model enables SoTA for:
🔥 Mono Depth Estimation
🔥 Multi-View SfM
🔥 Multi-View Stereo
🔥 Depth Completion
🔥 Registration
… and many more possibilities! Plus, everything is metric 🎯
We release code for data processing, training, benchmarking & ablations, all under Apache 2.0! Details & Links 👇
AirLab retweeted
🚨CMU Vision-Language-Autonomy update: The team released a video for the instruction "find the refrigerator in the lounge". They are looking for new PhD & Master's students to work on long-horizon navigation & instruction! Contact Ji Zhang for more information: bit.ly/3Kgvm5a
AirLab retweeted
Thrilled that @NVIDIA_Robotics selected us among the first to test the new NV platform! 🙌 Huge thanks to NVIDIA and Jensen Huang for the generous gift of a #JetsonThor Dev Kit to @CMUAirLab. We’ve already run #MACVO on Thor at high resolution while keeping real-time performance
AirLab retweeted
21 Jun 2025
Want to learn how to empower 🤖 with real-time scene understanding and exploration capabilities? Catch me, @hocherie1 & @QiuYuhengQiu presenting RayFronts at the #RSS2025 SemRob Workshop (OHE 122) & Epstein Plaza at 10:00 am PST today! x.com/OmarAlama/status/19101…
AirLab retweeted
20 Jun 2025
Catch our team @Parvkpr @PatrikarJay @AirLabCMU presenting and demoing ViSafe at #RSS2025 tomorrow! We'll be showing our payload demo & high speed aerial collision avoidance results 🚀
