Joined August 2024
Pinned Tweet
Jun 3
The Rising Star Award has been announced! Congratulations to Yining Hong @yining_hong, Rising Star Awardee for Spatial Intelligence, and to finalists Zhiyang Dou, Jiafei Duan @DJiafei, Hezhen Hu, Tiange Xiang @xxtiange, and Junyi Zhang @junyi42.

The awards are supported by 2077AI, with up to USD 30,000 in research gift funding to the awardee's institution and USD 2,000 in API credits for each finalist, helping early-career researchers further develop promising ideas in spatial intelligence.

Full Rising Star list: e2e3d.github.io/rising_star.…
Join the E2E3D Workshop today: 13:00–18:00 · Room 501
E2E3D Workshop: e2e3d.github.io/index.html

Jun 6
Come meet 2077AI at #CVPR2026 Booth #716. Stop by to talk about benchmarks, multimodal data, evaluation, and open research collaboration. Researchers, engineers, students, and friends are all welcome. @ZihanWang123 @Rubyzx67 @H7803954325458 @Neutrino_l
Jun 5
Wrapped up the DataMFM Workshop at #CVPR2026! The workshop brought together discussions on the data foundations of multimodal AI, aiming to help define the foundation of next-generation multimodal data ecosystems. Congratulations to the DataMFM Challenge winners! Many thanks to our speakers @RanjayKrishna @liuziwei7 @du_yilun @aagrawalAA, workshop organizers Pengyuan Li from @MITIBMLab @ZexueHe @ZihanWang123 @Rubyzx67 @WenhuChen @ManlingLi_ @RogerioFeris, the challenge organizers @aa_blueshark @HanSineng @H7803954325458, and everyone who joined us! DataMFM Workshop: datamfm.github.io/ #CVPR2026 #DataMFM #2077AI
Jun 3
Great to see this level of engagement at the E2E3D Workshop @ CVPR in Denver! The room is at full capacity, with many attendees standing and many more gathered outside the room. Huge thanks to our speakers @lucacarlone1 @jiajunwu_cs @pesarlin @geopavlakos @drmapavone and all the organizers @zhiwen_fan_ @QianqianWang5 @CongWenyan0320 @ManlingLi_ @_vztu @Neutrino_l @ZihanWang123. This workshop is sponsored by 2077AI, and we remain committed to advancing frontier research in spatial intelligence and embodied AI. Workshop: e2e3d.github.io/index.html #CVPR2026 #2077AI
Jun 1
Current VLM pipelines have a major blind spot in long-horizon household tasks: noisy auto-labels and weak spatial grounding. These errors compound fast, leading to object hallucinations, skipped steps, weak physical reasoning, and loss of task intent over time.

Our #CVPR2026 paper, EgoTL, addresses this data bottleneck.
Paper: arxiv.org/abs/2604.09535
Jun 1
EgoTL-Bench evaluates 6 egocentric capabilities across 3 reasoning layers:
- memory-conditioned planning
- scene-aware action reasoning
- next-action prediction
- action recognition
- direction recognition
- distance estimation

It also evaluates world models on 60-second egocentric rollouts.
Jun 1
Results: EgoTL improves both VLM reasoning and world-model rollouts.

With LoRA fine-tuning on ~1.2k curated Q&A pairs, Qwen2.5-VL-7B surpasses the strongest tested baselines across the reported EgoTL-Bench metrics. Distance estimation is the clearest gain: 39.45% MRA, nearly 2x the best pre-finetuning setup.

Fine-tuned COSMOS-Predict2 also improves long-horizon rollouts, better following think-aloud instructions while preserving object identity and scene layout.

Project: ego-tl.github.io/
Dataset: huggingface.co/datasets/luuu…
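The thread does not define MRA. As a rough illustration only, the sketch below assumes it stands for mean relative accuracy as used in some spatial-reasoning benchmarks: a numeric prediction counts as correct at confidence threshold t when its relative error is below 1 - t, and the score is averaged over thresholds 0.50, 0.55, ..., 0.95. The function name, the threshold grid, and the sample distances are assumptions, not taken from the EgoTL paper.

```python
def mean_relative_accuracy(pred, target, thresholds=None):
    """Score one numeric answer (e.g. a distance in meters) in [0, 1].

    Assumed definition: fraction of thresholds t for which
    |pred - target| / |target| < 1 - t.
    """
    if thresholds is None:
        # Assumed grid: 0.50, 0.55, ..., 0.95 (10 thresholds).
        thresholds = [(50 + 5 * i) / 100 for i in range(10)]
    if target == 0:
        raise ValueError("relative error is undefined for a zero target")
    rel_err = abs(pred - target) / abs(target)
    hits = sum(1 for t in thresholds if rel_err < 1 - t)
    return hits / len(thresholds)


def dataset_mra(pairs):
    """Average the per-sample score over (prediction, ground truth) pairs."""
    scores = [mean_relative_accuracy(p, t) for p, t in pairs]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical distance estimates in meters: (prediction, ground truth).
    samples = [(1.12, 1.0), (2.0, 2.0), (3.0, 1.0)]
    print(f"MRA: {100 * dataset_mra(samples):.2f}%")
```

Under this assumed definition, a score rewards estimates that stay close to the ground truth at progressively stricter tolerances rather than requiring an exact match.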
May 31
ChartNet, our paper developed with @IBM and @MIT, has been accepted at #CVPR2026. It introduces 1.5M multimodal chart samples across 24 chart types and 6 plotting libraries. Core question: how do we train models to connect a chart image with the numbers, code, text, and reasoning behind it?🧵
May 31
On public benchmarks, Granite-Vision-2B ChartNet improves:
- ChartCap: 1.6 → 12.4 BLEU
- ChartMimic-v2: 30.84 → 58.42

These benchmarks cover chart summarization and chart-to-code translation.
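To put the two before/after pairs on a common footing, here is a small arithmetic sketch that computes the absolute and relative change for each. The scores are the ones quoted above; the helper name `gain` and the dictionary layout are illustrative only.

```python
def gain(before, after):
    """Return (absolute change, ratio after/before) for a score pair."""
    return after - before, after / before


# Scores as quoted in the post: (before, after).
results = {
    "ChartCap (BLEU)": (1.6, 12.4),
    "ChartMimic-v2": (30.84, 58.42),
}

for name, (before, after) in results.items():
    delta, ratio = gain(before, after)
    print(f"{name}: +{delta:.2f} absolute, {ratio:.2f}x relative")
```

The ChartCap score rises by 10.8 BLEU (about 7.75x), while ChartMimic-v2 rises by 27.58 points (about 1.89x), so the larger absolute jump is not the larger relative one.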