Harsh

Pinned Tweet

Harsh

@HSlifelearner

Mar 26

Check out the full suite for full mocap-to-robotics pretraining. SOMA has anatomically correct joint definitions and much more detailed mesh keypoints than MHR/SMPL. It is foundational for all downstream body-pose tasks. More on its capabilities soon.

Umar Iqbal

@UmarIqb

Mar 17

#NVIDIA just released a whole ecosystem for human(oid) motion and robot learning from human data. 🚀🦾

Data, as we all know, is the key to scaling AI models. To accelerate the field of Embodied AI, we have open-sourced a full stack of models and tools to capture, generate, retarget, and simulate human(oid) motion data at scale, along with a massive high-quality dataset and a standard human skeletal representation, SOMA, to make them all seamlessly communicate with each other. The entire suite is available under the Apache 2.0 license.

1️⃣ SOMA: A universal interface to unify all parametric human body models (SOMA-shape, SMPL, MHR, etc.) into a standard skeletal representation, eliminating the need for custom adapters or model-specific retargeting. 🔗 lnkd.in/gsxhiJnn

2️⃣ Kimodo: High-fidelity, controllable text-to-motion generation for both humans and humanoid robots. 🔗 lnkd.in/gCc84XnX

3️⃣ GEM: A global human pose estimation method from in-the-wild videos, natively compatible with SOMA. 🔗 lnkd.in/g_QAvRjn

4️⃣ Bones-SEED: A massive dataset of 150k motions in SOMA format, including data already retargeted for the Unitree G1, created with our partners at Bones Studio. 🔗 lnkd.in/gfx-QD-w 🔗 lnkd.in/gyNdTwQx

5️⃣ SOMA Retargeter: A dedicated tool for seamless motion retargeting from the SOMA skeleton to the Unitree G1. 🔗 lnkd.in/gqz9Na-H

6️⃣ ProtoMotions: Our high-performance simulation framework for training digital human(oid)s via RL, now with native SOMA support. 🔗 lnkd.in/gmvMikMU

This is just the beginning, and we have much more in the pipeline. Excited to see what the community builds next!

#NVIDIA #GTC #GTC2026 #Robotics #EmbodiedAI #PhysicalAI @NVIDIAAI

Harsh retweeted

Elon Musk

@elonmusk

Jun 13

Replying to @PalmerLuckey @paranoidream

That is what they told me

Harsh retweeted

Elon Musk

@elonmusk

Jun 12

Looking forward to taking our exciting partnership with Nvidia to the next-level

NVIDIA

@nvidia

Jun 12

Huge congratulations to the @SpaceX team on a historic IPO debut. Fueling the next frontier of space and AI. 🌌 NVIDIA's partnership with SpaceX spans nearly a decade, from hand-delivering the world's first #NVIDIADGX-1 supercomputer in 2016 to the custom DGX Spark handoff at Starbase. Together, we've been pushing the boundaries of accelerated computing to help power the future of space exploration.

Harsh retweeted

Claude

@claudeai

Jun 9

Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use. Its capabilities exceed those of any model we’ve ever made generally available.

Harsh retweeted

Ruilong Li

@ruilong_li

Jun 9

Actually, both of these two NVIDIA live demos at CVPR are powered by flashdreams!

Ruilong Li

@ruilong_li

Jun 3

World models are moving beyond offline generation towards interactive, real-time experiences.

Introducing ⚡FlashDreams⚡: an open-source high-performance inference and serving library built for autoregressive world models:

🔥 Up to 3.10× faster LingBot-World inference
🔥 Up to 2.12× faster Self-Forcing inference
🔥 Up to 1.40× faster Wan2.1 inference
🔥 8 integrated models
🔥 Multi-GPU, streaming, low-latency serving
🔥 Agentic skills that teach you how to use it

FlashDreams is designed for a new generation of AI systems that continuously evolve over time while responding to user interactions. It powers applications across robotics, autonomous vehicle simulation, gaming, and virtual worlds.

Github: github.com/NVIDIA/flashdream…
Docs: nvidia.github.io/flashdreams
Research page: research.nvidia.com/labs/sil…

Join the #flashdreams Discord channel at discord.gg/yTdHDqFP

FlashDreams is also the runtime backbone behind NVIDIA OmniDreams (github.com/nv-tlabs/omni-dre…)

1/n

#AI #WorldModels #FastInference #PhysicalAI #OpenSource #NVIDIA

Harsh retweeted

NVIDIA AI Infrastructure

@NVIDIAAIInfra

Jun 7

📣 @SKhynix and @NVIDIA announce a multiyear technology partnership to codevelop next-generation memory for the global AI factory buildout. SK hynix will codevelop memory for NVIDIA's platforms — from NVIDIA Vera Rubin to Jetson Thor — while advancing fab digital twins using @NVIDIAOmniverse libraries and applying NVIDIA CUDA-X and PhysicsNeMo to accelerate semiconductor design and manufacturing. Read the press release: nvda.ws/4e43e0p

Harsh

@HSlifelearner

Jun 8

How it started vs. how it ended. You can easily see who plays on a pixel screen vs. OTB.

Harsh retweeted

The San Francisco Standard

@sfstandard

Jun 5

After going undefeated Tuesday, the mayor has an unequivocal mandate — and the makings of a new political machine. 📝: Adam Lashinsky sfstandard.com/opinion/2026/…

Opinion: Daniel Lurie for governor? Election wins open the door to the discussion

sfstandard.com

Harsh retweeted

0xSero

@0xSero

Jun 7

American Open Source is so back. 9 / 30 of the models on page 1 of Huggingface are published by Nvidia.

Harsh retweeted

NVIDIA

@nvidia

Jun 3

This week at #CVPR2026, NVIDIA Research is presenting three papers across physical AI that offer groundbreaking solutions for training at scale across diverse applications:

→ GraspGen-X: the first foundation model for zero-shot grasping, trained on billions of simulated grasps
→ LCDrive: a model that replaces expensive text-based reasoning with compact latent representations
→ NitroGen: a generalized gameplay AI foundation model that harnesses NVIDIA Isaac GR00T to help train embodied agents

Learn more: nvda.ws/4ubwjgk

Harsh retweeted

Jim Fan

@DrJimFan

Jun 5

NitroGen just won CVPR Best Paper Honorable Mention!! We are making strides towards general-purpose embodied agents that master not only real-world physics, but also all possible physics across a multiverse of simulations. It’s been 4 years since MineDojo, our first embodied agent in Minecraft, won NeurIPS Best Paper. Congrats to everyone on the team!!

Harsh retweeted

Umar Iqbal

@UmarIqb

Jun 4

GRAIL addresses the holy grail of robotics: humanoid-object interaction data! Releasing a large-scale humanoid-object interaction dataset (22k motions), code to generate more, and all the models. #NVIDIA #HumanoidRobotics #EmbodiedAI

Ye Yuan

@_ye_yuan

Jun 4

Humanoid robotics is hitting a data wall. Teleop and mocap took us far, but they don’t scale to every object, terrain, and behavior. We’re releasing GRAIL: research.nvidia.com/labs/dai… — a fully digital pipeline for generating loco-manipulation data before the robot moves. 🧵(1/8)

Harsh retweeted

Bernt Bornich

@BerntBornich

Jun 4

We’re going all in on World Models. Today we’re launching the 1X World Model Lab.

The bet is simple: You can’t fine-tune your way to AGI. And you definitely can’t fine-tune your way to robots that can operate in the physical world. General-purpose humanoids need models that understand space, motion, objects, causality, affordances, physics, and action before they ever see a specific task.

The frontier is not better VLA wrappers. The frontier is embodied world models.

The 1X World Model Lab will focus on large-scale embodied world model pretraining: building the most generalizable foundation model for humanoid robots from the ground up.

The next frontier in AI requires scaling: web-scale media egocentric human videos sim dexterous remote operated robot data on-policy NEO data
→ real-world deployment for robot data collection and RL
→ abundance of data
→ physical AI

The robot collects data. The model gets better. The robot gets better. Repeat.

To lead this, we brought in one of the best for the mission: @_sam_sinha_ , as Head of World Models. Sam was a founding research scientist at Luma AI and has been at the frontier of scaling multimodal generative video models his whole career.

If you’re the best in the world at large-scale pretraining, video models, robotics, RL, infra, or data, and you want your models to move atoms, not just pixels, join us. Send background evidence of exceptional ability to: wmlab@1x.tech

We’re building the model that makes autonomous labor real.

Harsh retweeted

Pika

@pika_labs

May 28

Attention, all you geniuses with products, but no marketing skills. Today we’re launching the Founder Starter Kit: 4 skills that will help you look and sound like a legit company, including:

> Build-a-Brand
> App Screens
> Product Sizzle
> Founder Video

Available for Claude via the Pika MCP.

Harsh retweeted

Min-Hung (Steve) Chen

@CMHungSteven

Jun 2

🚀 4D-RGPT is a #CVPR2026 Highlight from @NVIDIA! 🌌 Amid #Cosmos3 #PhysicalAI momentum, we tackle: 🎥 region-level 4D video understanding 🎯 regions 📏 depth 🌀 motion ⏱️ time 🖼️ Main poster 5 workshops in Denver 📍Jun 7, 11:45–1:45, ExHall F #225 📦 Code, Model weights & R4D-Bench are out 👇 @CVPR @NVIDIAAI

Harsh retweeted

Luke Metro

@luke_metro

Jun 1

From a young age, I have always wanted to be the exit liquidity for shareholders of artificial intelligence companies

Harsh retweeted

Tenobrus (→vibecamp)

@tenobrus

Jun 1

buying into the anthropic IPO at $1T valuation would obviously be an incredible deal, 22x multiple on ARR, huge room to grow, countless markets untapped, mythos as of yet unmonetized. kind of thing people dump whole retirement portfolios into. which is why it'll be $3T

Anthropic

@AnthropicAI

Jun 1

Anthropic has confidentially submitted a draft S-1 registration statement to the Securities and Exchange Commission. Pending completion of SEC review, this gives us the option to pursue an initial public offering. Read more: anthropic.com/news/confident…

Harsh retweeted

Xuning Yang @xuningy

Jun 1

🎉 We added 2 SOTA WAMs to the RoboLab Leaderboard 🎉

Current leaders on RoboLab-120 (specific instr.):
🥇 Cosmos3-Nano-Policy (39.7%)
🥈 π0.5 (28.1%)
🥉 DreamZero (28.1%)

→ See full results at: research.nvidia.com/labs/srl…
→ All policy clients available at: github.com/NVlabs/RoboLab/

Harsh retweeted

Ashkan Mirzaei @ashmrz10

Jun 1

I’m excited to share what our team has been building at @NVIDIAAI since I joined: Cosmos 3, an omnimodal world model for Physical AI.

Project: research.nvidia.com/labs/cos…
HF: huggingface.co/collections/n…
Code: github.com/NVIDIA/cosmos

Harsh retweeted

Ming-Yu Liu

@liu_mingyu

Jun 1

Introducing NVIDIA Cosmos 3

We released NVIDIA Cosmos 3 last night. And today, seeing it take the top spots across 8 open model leaderboards feels surreal. We spent months working towards this moment. Here’s the breakdown:

The Leaderboard Wins

World Reasoning
🏆 #1 open model on VANTAGE-Bench for vision AI
🏆 #1 overall on Traffic Anomaly Reasoning (TAR)

World Generation
🏆 #1 open model on Artificial Analysis Image-to-Video leaderboard
🏆 #1 open model on Artificial Analysis Text-to-Image leaderboard
🏆 #1 open model on PAI-Bench for physical AI synthetic data generation
🏆 #1 open model on Physics-IQ, which measures accuracy on physical laws
🏆 #1 open model on R-Bench for world generation quality

World Action
🏆 #1 on RoboArena for specialized policy
🏆 #1 on RoboLab for action generation

But the leaderboards are only part of the story. The real story is why we built Cosmos 3 in the first place.

The Problem

Training robots and autonomous systems in the real world is painfully hard. Robots need to try the same thing numerous times before they succeed reliably. Self-driving cars need rare edge cases that may never happen naturally. Smart machines need to understand physics, motion, contact, failure, and surprise. And real-world data is slow, expensive, and sometimes dangerous to collect.

At some point, the answer cannot just be “collect more data.” You can’t collect your way out of an infinite physical world. You have to generate it.

That… was the question behind Cosmos: Can one model understand the physical world deeply enough to reason about it, simulate it, and generate actions inside it?

What We Built

Cosmos 3 is the first omni-model for physical AI. It can understand and generate across: language · images · video · audio · action sequences

It is not just a VLM. Not just a video generator. Not just a robot policy model. It is all of them, in one single model.

That matters because physical AI has been fragmented for a long time. Cosmos 3 is our attempt to collapse that fragmentation. Depending on how you configure the inputs and outputs, the same model can act as a vision-language model, a video/world generator, a world simulator, or a world-action model. No separate architecture required.

The Architecture

Under the hood, Cosmos 3 uses a dual-tower Mixture-of-Transformers architecture.

One tower is autoregressive, for reasoning. It handles next-token prediction for language and discrete understanding.

The other tower is diffusion-based, for generation. It denoises images, video, audio, and action trajectories.

Two towers. Dual-stream joint attention. One shared world representation.

Each modality gets its own tools: visual encoders, video VAEs, audio VAEs, and action projectors that can map different embodiments into a unified action space.

Action is a first-class modality in Cosmos 3. That’s what makes it more than a video model. It doesn’t just predict and generate what the world might look like. It can connect reasoning and world modeling to physically grounded action.

Why This Matters

One of the most interesting findings from the ablation work is that training action domains together creates positive transfer. That means adding more embodiments does not just add more use cases. It can actually make the model better.

This is the heart of why omnimodal training matters. A shared world representation is not just convenient. It can make each individual task stronger. That’s the part that feels like the beginning of something much bigger.

The part I’m most excited about is that Cosmos 3 is fully open. Developers get the models, scripts, optimization, inference endpoints, post-training recipes, datasets, and benchmarks. Everything is available under the Linux Foundation’s OpenMDW 1.1 License.

You can use Cosmos 3 out of the box. You can use the VLM, world model, or world-action pieces separately. You can post-train it for your own domain, embodiment, or accuracy target. That’s what makes this feel different.

Cosmos 3 is not just a model release. It is the foundation for building intelligence for autonomous machines. For me, Cosmos 3 feels like a step toward a world where physical AI development becomes much more scalable and accessible - to a new age of developers and agents.

That’s what we built Cosmos 3 for. I cannot wait to see what you build with it.

Download Models on Hugging Face: huggingface.co/collections/n…
Customize Models on GitHub: github.com/NVIDIA/cosmos
Read the Tech Blog to Learn More: developer.nvidia.com/blog/de…

Harsh retweeted

NVIDIA GeForce

@NVIDIAGeForce

Jun 1

It all starts with the @NVIDIARTXSpark Superchip. RTX Spark reinvents the personal computer for agents, creating and gaming. Learn more → nvidia.com/en-us/products/rt…
