Tweets

Ragavan retweeted

Vaibhav (VB) Srivastav

@reach_vb

5 Apr 2025

Meta COOKED! Llama 4 is out!

Llama 4 Maverick (402B) and Scout (109B) - natively multimodal, multilingual and scaled to 10 MILLION context! BEATS DeepSeek v3🔥

Llama 4 Maverick:
> 17B active parameters, 128 experts, 400B total parameters
> Beats GPT-4o & Gemini 2.0 Flash, competitive with DeepSeek v3 at half the active parameters
> 1417 ELO on LMArena (chat performance)
> Optimized for image understanding, reasoning, and multilingual tasks

Llama 4 Scout:
> 17B active parameters, 16 experts, 109B total parameters
> Best-in-class multimodal model for its size, fits on a single H100 GPU (with Int4 quantization)
> 10M token context window
> Outperforms Gemma 3, Gemini 2.0 Flash-Lite, Mistral 3.1 on benchmarks

Architecture & Innovations

Mixture-of-Experts (MoE):
> First natively multimodal Llama models with MoE
> Llama 4 Maverick: 128 experts, a shared expert plus routed experts for better efficiency

Native Multimodality & Early Fusion:
> Jointly pre-trained on text, images, video (30T tokens, 2x Llama 3)
> MetaCLIP-based vision encoder, optimized for LLM integration
> Supports multi-image inputs (up to 8 tested, 48 pre-trained)

Long Context & iRoPE Architecture:
> 10M token support (Llama 4 Scout)
> Interleaved attention layers (no positional embeddings)
> Temperature-scaled attention for better length generalization

Training Efficiency:
> FP8 precision (390 TFLOPs/GPU on 32K GPUs for Behemoth)
> MetaP technique: Auto-tuning hyperparameters (learning rates, initialization)

Revamped Pipeline:
> Lightweight Supervised Fine-Tuning (SFT) → Online RL → Lightweight DPO
> Hard-prompt filtering (50% easy data removed) for better reasoning/coding
> Continuous Online RL: Adaptive filtering for medium/hard prompts

All models on Hugging Face - time to COOK!
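
The claim that Llama 4 Scout fits on a single H100 with Int4 quantization can be sanity-checked with a rough weight-memory estimate. The sketch below is illustrative only: the helper `weight_memory_gb` is a hypothetical name, the 80 GB H100 capacity is an assumption not stated in the tweet, and the estimate counts weights only (no KV cache, activations, or quantization overhead).

```python
def weight_memory_gb(total_params_billion: float, bits_per_param: int) -> float:
    """Approximate memory needed for model weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bytes = total_params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


# Assumption: an 80 GB H100 variant (capacity is not given in the tweet).
H100_MEMORY_GB = 80

# Scout's 109B total parameters, taken from the tweet.
scout_int4 = weight_memory_gb(109, 4)    # 4-bit (Int4) weights
scout_bf16 = weight_memory_gb(109, 16)   # 16-bit weights, for comparison

print(f"Scout Int4 weights: {scout_int4:.1f} GB")
print(f"Scout 16-bit weights: {scout_bf16:.1f} GB")
print("Int4 weights fit on one GPU:", scout_int4 < H100_MEMORY_GB)
```

Under these assumptions the Int4 weights come to about 54.5 GB, which leaves headroom on an 80 GB card, while 16-bit weights (about 218 GB) would not fit. All 109B parameters must be resident even though only 17B are active per token, which is why the total count, not the active count, is used here.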

Ragavan retweeted

Arena.ai

@arena

5 Apr 2025

BREAKING: Meta's Llama 4 Maverick just hit #2 overall - becoming the 4th org to break 1400 on Arena!🔥

Highlights:
- #1 open model, surpassing DeepSeek
- Tied #1 in Hard Prompts, Coding, Math, Creative Writing
- Huge leap over Llama 3 405B: 1268 → 1417
- #5 under style control

Huge congrats to @AIatMeta, and another big win for open-source! 👏

More analysis below⬇️
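
To give the 1268 → 1417 jump some intuition, the standard Elo expected-score formula converts a rating gap into an approximate head-to-head win rate. This is a sketch under an assumption: that the arena scores behave like classic Elo ratings on a 400-point logistic scale, which the tweet does not state. The function name `expected_score` is illustrative.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Classic Elo expected score of A against B (400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


# Ratings taken from the tweet: Llama 4 Maverick vs. Llama 3 405B.
maverick, llama3_405b = 1417, 1268

p = expected_score(maverick, llama3_405b)
print(f"Rating gap: {maverick - llama3_405b} points")
print(f"Expected win rate for the higher-rated model: {p:.2f}")
```

Under that assumption, a 149-point gap corresponds to the higher-rated model being preferred in roughly 70% of pairwise comparisons.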

AI at Meta

@AIatMeta

5 Apr 2025

Today is the start of a new era of natively multimodal AI innovation.

Today, we’re introducing the first Llama 4 models: Llama 4 Scout and Llama 4 Maverick, our most advanced models yet and the best in their class for multimodality.

Llama 4 Scout
• 17B-active-parameter model with 16 experts.
• Industry-leading context window of 10M tokens.
• Outperforms Gemma 3, Gemini 2.0 Flash-Lite and Mistral 3.1 across a broad range of widely accepted benchmarks.

Llama 4 Maverick
• 17B-active-parameter model with 128 experts.
• Best-in-class image grounding with the ability to align user prompts with relevant visual concepts and anchor model responses to regions in the image.
• Outperforms GPT-4o and Gemini 2.0 Flash across a broad range of widely accepted benchmarks.
• Achieves comparable results to DeepSeek v3 on reasoning and coding at half the active parameters.
• Unparalleled performance-to-cost ratio with a chat version scoring ELO of 1417 on LMArena.

These models are our best yet thanks to distillation from Llama 4 Behemoth, our most powerful model yet. Llama 4 Behemoth is still in training and is currently seeing results that outperform GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on STEM-focused benchmarks. We’re excited to share more details about it even while it’s still in flight.

Read more about the first Llama 4 models, including training and benchmarks ➡️ go.fb.me/gmjohs

Download Llama 4 ➡️ go.fb.me/bwwhe9

Ragavan @ragavan

5 Apr 2025

Excited to ship the first set of Llama 4 models today. llama.com/4

Ragavan @ragavan

4 Oct 2024

Excited to share a research breakthrough from our team. It's fun, it's personal, it's customizable. Huge congrats to the team that worked hard to get to this milestone. ai.meta.com/blog/movie-gen-m…

How Meta Movie Gen could usher in a new AI-enabled era for content creators

Today, we’re excited to premiere Meta Movie Gen, our breakthrough generative AI research for media, which includes modalities like image, video, and audio.

ai.meta.com

AI at Meta

@AIatMeta

4 Oct 2024

🎥 Today we’re premiering Meta Movie Gen: the most advanced media foundation models to date.

Developed by AI research teams at Meta, Movie Gen delivers state-of-the-art results across a range of capabilities. We’re excited for the potential of this line of research to usher in entirely new possibilities for casual creators and creative professionals alike.

More details and examples of what Movie Gen can do ➡️ go.fb.me/kx1nqm

🛠️ Movie Gen models and capabilities

Movie Gen Video: A 30B parameter transformer model that can generate high-quality and high-definition images and videos from a single text prompt.

Movie Gen Audio: A 13B parameter transformer model that can take a video input along with optional text prompts for controllability to generate high-fidelity audio synced to the video. It can generate ambient sound, instrumental background music and foley sound, delivering state-of-the-art results in audio quality, video-to-audio alignment and text-to-audio alignment.

Precise video editing: Using a generated or existing video and accompanying text instructions as an input, it can perform localized edits such as adding, removing or replacing elements, or global changes like background or style changes.

Personalized videos: Using an image of a person and a text prompt, the model can generate a video with state-of-the-art results on character preservation and natural movement in video.

We’re continuing to work closely with creative professionals from across the field to integrate their feedback as we work towards a potential release. We look forward to sharing more on this work and the creative possibilities it will enable in the future.

Ragavan retweeted

Hemant Taneja

@htaneja

2 Mar 2023

(1/6): We are entering the Age of Global Resilience. And today with my partner and @generalcatalyst MD Paul Kwan, we’re publishing our thesis on what this is and why we’re all-in: generalcatalyst.com/perspect…

Building Globally Resilient Systems for a New World Order

One renewed area of focus for General Catalyst is modern defense and intelligence where there is an urgent (long overdue) need, proven innovation playbooks and the opportunity to create enduring...

generalcatalyst.com

Ragavan @ragavan

11 Jan 2023

If 2022 was the year when many talented founders went down the web3 rabbit hole, 2023 will be the year when talented product people build AI-native products & businesses. Here’s why:

Ragavan @ragavan

11 Jan 2023

"Desktop" "Window" "File" "Program" "Download" "Copy/Paste" "Save"

"Website" "Tab" "Homepage" "Link" "Online" "Browse" "Search"

"App" "Homescreen" "Feed" "Notifications" "Swipe" "Share" "Message"

Ragavan @ragavan

11 Jan 2023

What are the AI-native frames that will define how consumers interact with intelligent software systems? What are the AI-native nouns & verbs that will form the vocabulary of this next generation of products? Are you building these today? We’d love to chat. Cc @generalcatalyst
