Joined March 2018
14 Photos and videos
Amlan Kar retweeted
I am looking for highly motivated students to join me in tackling challenges in spatial generative models and simulators. Please check my website! seung-kim.github.io/seungkim…

📢 Three incoming faculty members at KAIST AI, starting in August 2026✨ Dr. Sehoon Kim from xAI (@sehoonkim418), Dr. Hyunwoo Kim (@hyunw_kim), and Dr. Seung Wook Kim (@seungkim0123), both from NVIDIA, will be joining KAIST AI as Assistant Professors Check their websites below🧵
5
5
142
33,645
Amlan Kar retweeted
There's an interesting procedural generation -> generative AI -> procedural generation ouroboros closing in on itself right now. GenAI is basically data driven procedural generation, and now the LLMs are leveraging perlin noise height maps and fractal tree models.
Fable 5. No external assets. Three.js. dc5fzrbo8ssfx.cloudfront.net…
2
7
90
10,536
Amlan Kar retweeted
World models are moving beyond offline generation towards interactive, real-time experiences. Introducing ⚡FlashDreams⚡: an open-source high-performance inference and serving library built for autoregressive world models: 🔥 Up to 3.10× faster LingBot-World inference 🔥 Up to 2.12× faster Self-Forcing inference 🔥 Up to 1.40× faster Wan2.1 inference 🔥 8 integrated models 🔥 Multi-GPU, streaming, low-latency serving 🔥 Agentic skills that teach you how to use it FlashDreams is designed for a new generation of AI systems that continuously evolve over time while responding to user interactions. It powers applications across robotics, autonomous vehicle simulation, gaming, and virtual worlds. Github: github.com/NVIDIA/flashdream… Docs: nvidia.github.io/flashdreams Research page: research.nvidia.com/labs/sil… Join the #flashdreams Discord channel at discord.gg/yTdHDqFP FlashDreams is also the runtime backbone behind NVIDIA OmniDreams (github.com/nv-tlabs/omni-dre…) 1/n #AI #WorldModels #FastInference #PhysicalAI #OpenSource #NVIDIA
11
78
368
87,625
Amlan Kar retweeted
Following recent World-Action Model results in robotics, the same ~2B OmniDreams single-view backbone can be fine-tuned into a driving policy. In preliminary closed-loop results, it reduces collision from 6.9% to 4.2% when compared with Alpamayo 1.5, while having roughly 5x fewer parameters.
1
5
27
2,588
Amlan Kar retweeted
Real time world model NVIDIA OmniDreams now open sourced! If you are at CVPR, we invite you to also check out a live demo you can try out at the NVIDIA booth.
🚀 What if physical AI policies could interact with generated worlds in real time? Introducing OmniDreams, a generative world model for closed-loop autonomous vehicle simulation. Tech report, code, models, and data samples are available now. Project: research.nvidia.com/labs/sil… Code: github.com/nv-tlabs/omni-dre… Model: huggingface.co/collections/n… Join the #omnidreams discord channel: discord.gg/bsjzh4uZ
2
31
248
36,645
Amlan Kar retweeted
🚀 What if physical AI policies could interact with generated worlds in real time? Introducing OmniDreams, a generative world model for closed-loop autonomous vehicle simulation. Tech report, code, models, and data samples are available now. Project: research.nvidia.com/labs/sil… Code: github.com/nv-tlabs/omni-dre… Model: huggingface.co/collections/n… Join the #omnidreams discord channel: discord.gg/bsjzh4uZ
5
72
261
80,745
Amlan Kar retweeted
It’s been a while since I posted here, but I’m very excited to share what our team at @nvidia has been building over the past year! After a year of active development, we’re getting ready to release SIL-Wheel to the world: a one-stop shop platform for data-centric workflows in large-scale video model training. Built by researchers, for researchers, SIL-Wheel brings together search, curation, annotation, evaluation, and analysis for large video datasets in one centralized framework. Want a sneak peek before the official release? Come by the NeXD26 Workshop @CVPR tomorrow at 10:30!🚀
2
24
64
10,703
Amlan Kar retweeted
The latent-vs-pixel debate misses the point. GPT Image 2 shows what users notice: pixel-level fidelity. Latent models show what scales: compact semantic structure. We connect them by replacing VAE/RAE decoders with a Pixel Diffusion Decoder. Code and Model available: research.nvidia.com/labs/sil… 🧵(1/N)
16
67
411
668,498
Amlan Kar retweeted
A hill that I will die on: with today's AI models, intelligence is a function of inference compute. Comparing models by a single number hasn't made sense since 2024. What matters is intelligence per token or per $. This is especially true when using it in a product like Codex.
The GPT-5.5 model family completely dominates the cost-performance frontier on the Artificial Analysis Index
46
97
1,340
127,977
Amlan Kar retweeted
For a decade, we've made models wider and deeper—but we've barely changed how layers *talk* to each other. Since ResNet's `x F(x)` in 2015, the depth residual has been the only highway for inter-layer communication. It's time to upgrade the staircase. 🧵
18
238
1,938
188,471
Amlan Kar retweeted
Feed-forward 3D reconstruction should not be limited to predicting one Gaussian per pixel. We introduce TokenGS, which uses learnable tokens to decouple the 3D Gaussian prediction from the image resolution and the number of input views. #CVPR2026Highlight [1/6]
6
44
249
45,195
Amlan Kar retweeted
🚀 Excited to share ViPRA: Video Prediction for Robot Actions 📍 Accepted to #ICLR2026 @iclr_conf 🏆 Best Paper — #NeurIPS2025 Embodied World Models Workshop Robot learning today still needs millions of action labeled videos. Yet videos are abundant — from humans and the web — but lack action labels. Meanwhile, pretrained video models already learn rich dynamics. ViPRA is a recipe for turning pretrained video models into robot policies while enabling robot learning to scale with actionless videos. 🧵 Thread ↓
2
40
268
25,537
Amlan Kar retweeted
Special moment to see something I’ve worked on so closely come to life! Today we announce Alpadreams — a world model that lets you explore ♾endlessly♾️in ⚡real time⚡. Video: me (left) and Alpamayo policy (right) driving in Alpadreams at #GTC26. research.nvidia.com/labs/sil…
2
18
97
10,156
Amlan Kar retweeted
A new generation in AV simulation is here! We are announcing AlpaDreams, a real time interactive generative world model for AV simualtion! Just a year ago it took minutes to generate a few seconds of video, today it is real time and interactive! research.nvidia.com/labs/sil…
5
26
106
18,677
Amlan Kar retweeted
I packaged up the "autoresearch" project into a new self-contained minimal repo if people would like to play over the weekend. It's basically nanochat LLM training core stripped down to a single-GPU, one file version of ~630 lines of code, then: - the human iterates on the prompt (.md) - the AI agent iterates on the training code (.py) The goal is to engineer your agents to make the fastest research progress indefinitely and without any of your own involvement. In the image, every dot is a complete LLM training run that lasts exactly 5 minutes. The agent works in an autonomous loop on a git feature branch and accumulates git commits to the training script as it finds better settings (of lower validation loss by the end) of the neural network architecture, the optimizer, all the hyperparameters, etc. You can imagine comparing the research progress of different prompts, different agents, etc. github.com/karpathy/autorese… Part code, part sci-fi, and a pinch of psychosis :)
1,054
3,627
28,327
11,076,853
Amlan Kar retweeted
Replying to @gkopanas
I love that review. I do genuinely think a great way to evaluate research contributions would be to add the new paper to an agent's context window and see what delta the agent can get on some OSS codebase's performance.
1
14
1,921
Amlan Kar retweeted
🚀 Exciting news! We’re introducing VGG-T³: a scalable model for offline feed-forward 3D reconstruction that finally tackles the "quadratic bottleneck." Ever wanted to have VGGT reconstruct a 1,000-image scene in seconds instead of 10 minutes and use it for visual localization?
7
79
551
84,124
Amlan Kar retweeted
New #NVIDIA Paper We introduce Motive, a motion-centric, gradient-based data attribution method that traces which training videos help or hurt video generation. By isolating temporal dynamics from static appearance, Motive identifies which training videos shape motion in video generation. 🔗 research.nvidia.com/labs/sil… 1/10
11
120
584
110,663
Amlan Kar retweeted
22 Dec 2025
🚗📡Radar is the unsung hero of AV perception: widespread in cars, yet overlooked in simulation. Introducing RadarGen: Realistic radar synthesis from cameras using diffusion. Massive kudos to my fantastic team at @TechnionLive and @NVIDIAAI radargen.github.io/
📢 RadarGen: Automotive Radar Point Cloud Generation from Cameras Can we generate realistic radar point clouds solely from camera images? 🚗📡 We introduce RadarGen, a diffusion-based framework that synthesizes radar returns aligned with visual scenes. radargen.github.io
1
10
38
5,335
Amlan Kar retweeted
15 Dec 2025
Can we apply gradient descent to discrete changes? In our new #SIGGRAPHAsia paper, we show that gradient descent can work on shape grammars, as in CAD and procedural modeling, but only if the grammars are designed correctly!
6
42
262
64,736