Engineer @ NVIDIA

Joined September 2016
12 Photos and videos
We released Nemotron 3 Ultra! It's quite strong across the board, but is especially good at math reasoning tasks compared to other open models. With enough TTC Ultra does really well on hard math competitions like IMO, USAMO and Putnam. And as always, we aren't only releasing the model, but all datasets and recipes that make it this strong. Nemotron-SFT-Math-v4: 550K solutions to "final answer" math problems with and without Python tool use huggingface.co/datasets/nvid… Nemotron-Math-Proofs-v2: 80K solutions and verifications for hard mathematical proofs huggingface.co/datasets/nvid… Nemotron-RL-Math-v2: 4K selected math problems with verified final answers. This dataset is much higher quality than any previous math RL set we released huggingface.co/datasets/nvid… If you want to learn more details, check out our tech report research.nvidia.com/labs/nem…
NVIDIA Nemotron 3 Ultra is now live! Frontier accuracy, 5X greater speed, 30% lower cost. Deploy however you need - on-premise, on the cloud, or at the edge. Model is live on HuggingFace under the OpenMDW 1.1 license. youtube.com/watch?v=D8LIIvQV…
5
9
69
9,408
somehow I always feel bad when I interrupt codex..
1
3
233
Igor Gitman retweeted
Nemotron 3 Nano and Super from @nvidia are now available on Tinker! We're offering a limited-time GTC 50% discount for both. The Nemotron family features open hybrid MoE models optimized for compute efficiency for agentic applications.
4
16
115
17,790
Igor Gitman retweeted
⚡ NVIDIA Nemotron 3 Super is live on OpenRouter! 120B params, 12B active. Hybrid Mamba-Transformer MoE with highest throughput efficiency in its class. 1M context. Fully open weights, data, and recipes. Built for multi-agent systems that need to think fast.
12
22
381
27,457
Nemotron 3 Super is out! It's really good and it will only get better from here. And we release all the details - tech report, training code, training data, model weights. Everything you need to build a model like this yourself!
Announcing NVIDIA Nemotron 3 Super! 💚120B-12A Hybrid SSM Latent MoE, designed for Blackwell 💚36 on AAIndex v4 💚up to 2.2X faster than GPT-OSS-120B in FP4 💚Open data, open recipe, open weights Models, Tech report, etc. here: research.nvidia.com/labs/nem… And yes, Ultra is coming!
1
18
687
Igor Gitman retweeted
Mar 11

65
260
1,583
415,195
Igor Gitman retweeted
Nemotron 3 Super is here — 120B total / 12B active, Hybrid SSM Latent MoE, designed for Blackwell. Truly open: permissive license, open data, open training infra. See analysis on @ArtificialAnlys Details in thread 🧵below:
10
45
278
30,017
Nemotron Super (120B, 10B active, ≈"3B" class speed) destroys all Qwens, gpt-oss, matches Kimi K2.5, exceeds Step 3.5 Flash and V3.2, and to my knowledge is only beaten by two open models (309B MiMo and Speciale) on the most interesting benchmark today. @htihle pls test
Well, seems we're not getting DeepSeek V4 today but we're getting what amounts to its lite version runnable on normal hardware. New architecture, fast, 1M context… …and it's a bit weaker than the equivalent Qwen 3.5.
10
11
193
21,558
Igor Gitman retweeted
At Nvidia we're really into synthetic data and how to make better agents, faster. I was really excited about this technique from @DongfuJiang , @zhuofengli96475 and team and I wanted to reproduce their SDG in Data Designer with our new release. Check it out!
🚀 Introducing OpenResearcher: a fully offline pipeline for synthesizing 100 turn deep-research trajectories—no search/scrape APIs, no rate limits, no nondeterminism. 💡 We use GPT-OSS-120B a local retriever a 10T-token corpus to generate long-horizon tool-use traces (search → open → find) that look like real browsing, but are free reproducible. 📈 The payoff: SFT on these trajectories turns Nemotron-3-Nano-30B-A3B from 20.8% → 54.8% accuracy on BrowseComp-Plus ( 34.0). 🧩 What makes it work? 🔎 Offline corpus = 15M FineWeb docs 10K “gold” passages (bootstrapped once) 🧰 Explicit browsing primitives = better evidence-finding than “retrieve-and-read” 🎯 Reject sampling = keep only successful long-horizon traces 🧵 And we’re releasing everything: ✅ code search engine corpus recipe ✅ 96K-ish trajectories eval logs ✅ trained models live demo 👨‍💻 GitHub: github.com/TIGER-AI-Lab/Open… 🤗 Models & data: huggingface.co/collections/T… 🚀 Demo: huggingface.co/spaces/OpenRe… 🔎 Eval logs: huggingface.co/datasets/Open… #llms #agentic #deepresearch #tooluse #opensource #retrieval #SFT
3
4
65
18,120
Igor Gitman retweeted
Need to accelerate inference for math problem solving? Large language models can solve challenging math problems. However, making them work efficiently at scale requires the right serving stack, quantization strategy, and decoding methods—often spread across different tools. This @nvidia blog post shows how to build a fast, reproducible inference pipeline with the NVIDIA NeMo-Skills library to manage NVIDIA TensorRT-LLM. 🔗developer.nvidia.com/blog/ho… #PyTorch #OpenSourceAI #AI #Inference #Innovation
2
6
39
6,945
Igor Gitman retweeted
Nvidia paper behind Nemotron-Math, a massive math tutoring dataset so smaller Large Language Models can learn long, tool checked reasoning. It contains 7.5M step by step solutions, some as long as 128K tokens, meaning text pieces, written in 3 reasoning styles. This dataset also shows self checking, where the model runs Python code to avoid simple arithmetic mistakes. The authors mix competition problems from Art of Problem Solving with real questions from Mathematics Stack Exchange and MathOverflow. They use open model gpt-oss-120b as a teacher, generating multiple solutions per problem at high, medium, and low depth. For long context training, they sort examples by length and fine tune in stages, so most steps use shorter text before 128K. That schedule gives about 2-3x faster training with roughly 1-3% less accuracy, and the extra Stack Exchange problems make the trained models handle messier questions better. ---- Paper Link – arxiv. org/abs/2512.15489 Paper Title: "Nemotron-Math: Efficient Long-Context Distillation of Mathematical Reasoning from Multi-Mode Supervision"
12
35
214
15,003
Igor Gitman retweeted
15 Dec 2025
🚀🚀🚀 We’re excited to support @NVIDIA and their new open family of models: NVIDIA Nemotron 3! Open in weights, data, tools, and training, Nemotron 3 is built for multi-agent apps and features: ⚡️An efficient hybrid Mamba‑Transformer MoE architecture 🧾1M token context for long-term memory and improved reasoning 🧠 Multi‑environment reinforcement learning via NeMo Gym for advanced skill adaptation Plus NVFP4 pre-training, latent MoE, 1T tokens of data, and more! Read more about the model: blog.vllm.ai/2025/12/15/run-…
2
29
237
13,254
Igor Gitman retweeted
15 Dec 2025
NVIDIA Nemotron 3 Nano is live on OpenRouter! It is a small MoE reasoning model built for specialized agentic AI systems. Just like others in the Nemotron family, Nano 3 is fully open with: - Open weights, open data, & open recipes - Designed for customization & optimization
12
21
175
11,957
Igor Gitman retweeted
15 Dec 2025
This is not just another strong open model. Nemotron actually releases training data (!), RL environments, and training code. This is a big difference: almost all model developers just want people to use their models; NVIDIA is enabling people to make their own models. We are excited to incorporate these assets into the next Marin models! Congrats to the @nvidia team!
15 Dec 2025
Today, @NVIDIA is launching the open Nemotron 3 model family, starting with Nano (30B-3A), which pushes the frontier of accuracy and inference efficiency with a novel hybrid SSM Mixture of Experts architecture. Super and Ultra are coming in the next few months.
31
181
1,600
157,538
Igor Gitman retweeted
15 Dec 2025
Nemotron 3 Nano by @nvidia, available now in LM Studio! 👾 > General purpose reasoning and chat model > MoE: 30B total params, 3.5B active > Supports up to 1M tokens context window > Hybrid arch: 23 Mamba-2 and MoE layers, 6 Attention layers Requires ~24GB to run locally.
13
38
452
47,193
Igor Gitman retweeted
NVIDIA has just released Nemotron 3 Nano, a ~30B MoE model that scores 52 on the Artificial Analysis Intelligence Index with just ~3B active parameters Hybrid Mamba-Transformer architecture: Nemotron 3 Nano combines the hybrid Mamba-Transformer approach @NVIDIAAI has used on previous Nemotron models with a moderate-sparsity MoE architecture, enabling highly efficient inference, particularly at longer sequence lengths Small-model improvements: with 31.6B total and 3.6B active parameters, Nemotron 3 Nano scores 52 on our Intelligence Index, in line with OpenAI’s gpt-oss-20b (high). This represents a 6 point lead on the similarly-sized Qwen3 30B A3B 2507 and 15 improvement on NVIDIA’s previous Nemotron Nano 9B V2 (a dense model) High openness: Nemotron 3 Nano follows other recent NVIDIA models in open licensing and releases of data and methodology for the community to use and replicate - it scores an 67 on the Artificial Analysis Openness Index, in line with previous Nemotron Nano models Key model details: ➤ 1 million token context window, with text only support ➤ Supports reasoning and non-reasoning modes ➤ Released under the NVIDIA Open Model License; the model is freely available for commercial use or training of derivative models ➤ On launch, the model is being made available with a range of serverless inference providers including @baseten, @DeepInfra, @FireworksAI_HQ, @togethercompute and @friendliai, and it is available now on Hugging Face for local inference or self-deployment See below for our full analysis and key announcement links from NVIDIA 👇
9
49
285
110,522
Igor Gitman retweeted
15 Dec 2025
🚀 Day-0 support for @NVIDIA Nemotron 3 Nano in SGLang SGLang now supports Nemotron 3 Nano on Day 0 🎉 A highly efficient, fully open Hybrid MoE model with 1M context, thinking budget, and industry-leading accuracy per compute. ✅ Open weights, data, and recipes ⚡ Fast, low-latency inference with SGLang 🧠 Built for agentic workflows, coding, and reasoning 👇 Get started in minutes with the SGLang Cookbook! Run BF16 / FP8, serve locally, and start building today. #SGLang #Nemotron #Day0Support #OpenSourceAI #LLM #Inference
4
18
115
20,826
15 Dec 2025
Despite the small size, this is by far the best model we've released! And as always, we don't just release the model, but we release pretty much everything you need to reproduce it. If you're interested in math reasoning, we have two new datasets for you to try. - Nemotron-Math, which is a collection of 350K "final-answer" math problems and 7.5M natural language solutions generated by gpt-oss-120b (with and without Python tool use and with all 3 reasoning regimes). Training on this data you can easily get a model to 100% on AIME 24/25 if you use majority voting and Python TIR. - Nemotron-Math-Proofs, which contains 580k natural language proof problems, 550k formalizations into theorem statements in Lean 4, and 900k reasoning trajectories from Goedel-Prover-v2 culminating in valid Lean 4 proofs (some theorems have multiple proofs). We weren't able to formalize all statements and we weren't able to prove all that we formalized, but we release everything, so that others can improve it. Doing simple SFT on this data can fully reproduce (and slightly improve) the accuracy of Goedel-Prover-v2 8B model on Lean benchmarks. You can find the datasets as well as more details here: - Nemotron-Math: huggingface.co/datasets/nvid… - Nemotron-Math-Proofs: huggingface.co/datasets/nvid…
15 Dec 2025
Today, @NVIDIA is launching the open Nemotron 3 model family, starting with Nano (30B-3A), which pushes the frontier of accuracy and inference efficiency with a novel hybrid SSM Mixture of Experts architecture. Super and Ultra are coming in the next few months.
1
4
18
2,733
Igor Gitman retweeted
Ivan Sorokin and I are the official winners on the Arc Prize competition, with a significant lead over other teams. Thanks to @kaggle and @arcprize for hosting the competition. NVIDIA tech blog summarizing what we did: developer.nvidia.com/blog/nv… Our writeup: kaggle.com/competitions/arc-… Our code: github.com/1ytic/NVARC
41
56
572
85,613