DeepInfra

Tweets

DeepInfra

@DeepInfra

Jun 13

Step 3.7 Flash is Live on DeepInfra: An Agentic, Multimodal Model Built for Production

9,003

DeepInfra

@DeepInfra

Jun 13

try it out here: deepinfra.com/stepfun-ai/Ste…

stepfun-ai/Step-3.7-Flash - Demo - DeepInfra

Step 3.7 Flash is an open-source multimodal reasoning model by StepFun with 198B total parameters (11B active) using Mixture of Experts. It accepts text and image inputs and features a 256K context...

deepinfra.com

382

DeepInfra

@DeepInfra

Jun 12

We just added text-to-music on DeepInfra. ACE-Step v1.5 XL — open-source, full song generation from a text prompt. Vocals, lyrics, instrumentation. Quality that rivals commercial tools. We run the XL checkpoint with the planning step on by default, so it optimizes for musical structure and coherence. $0.001 / second of audio. @ACEStep_Music

2:57

940

DeepInfra

@DeepInfra

Jun 12

Play around with it here 👉 deepinfra.com/ACE-Step/acest…

ACE-Step/acestep-v15-xl-sft - Demo - DeepInfra

ACE-Step v1.5 is a powerful open-source music foundation model that turns a text prompt into a complete song — vocals, lyrics, and instrumentation — at quality that rivals commercial tools. We run...

deepinfra.com

328

DeepInfra

@DeepInfra

Jun 10

Big upgrade to @bria_ai_'s video background removal on DeepInfra, shipping today.
2x better quality · 9x faster · 33x cheaper
26 fps / 38ms per frame on L40S.
Smarter foreground detection: now recognizes mics, desks, and products.

0:06

657

DeepInfra

@DeepInfra

Jun 10

Play around with it here: deepinfra.com/Bria/video_rem…

Bria/video_remove_background - Demo - DeepInfra

Light and fast. Remove the background of your videos to bring the foreground elements into focus. No more unwanted distractions. Try out API on the Web

deepinfra.com

362

DeepInfra

@DeepInfra

Jun 6

Lol

0:28

1,745

DeepInfra

@DeepInfra

Jun 6

9gag.com/gag/amoeE82 is the source.

Too realistic - Video

775 points • 82 comments

9gag.com

331

DeepInfra

@DeepInfra

Jun 4

We just added @NVIDIA Nemotron 3.x to DeepInfra, Day 0. Two open and highly efficient models, live now:
→ Nemotron 3 Ultra: Frontier reasoning for long-running agents, with up to 5x faster inference and up to 30% lower cost
→ Nemotron 3.5 Content Safety: 4B multimodal, multilingual safety model with custom policy support, reasoning traces, and coverage across 23 safety categories for enterprise AI guardrails
→ Nemotron 3.5 ASR (coming soon): 0.6B streaming model with ~40 language-locales.
Built for agentic AI. Same API as everything else on DeepInfra.

2,085

DeepInfra

@DeepInfra

Jun 4

Read more here: deepinfra.com/blog/nvidia-ne…

Nemotron 3 Ultra, 3.5 Content Safety and ASR models are now live on DeepInfra platform.

Low pay-as-you-go pricing. No long-term contracts. Simple APIs. Scale to trillions of tokens. 100 AI models.

deepinfra.com

515

DeepInfra

@DeepInfra

Jun 4

Play around with it here: deepinfra.com/nvidia/NVIDIA-… deepinfra.com/nvidia/Nemotro… deepinfra.com/nvidia/Nemotro…

nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B - Demo - DeepInfra

Nemotron 3 Ultra is built for frontier reasoning, orchestration, coding agents, deep research, and complex enterprise workflows. It delivers up to 5x faster inference and up to 30% lower cost for...

deepinfra.com

389

DeepInfra

@DeepInfra

Jun 3

NVIDIA Cosmos 3 is live on DeepInfra. The first open world foundation model for physical AI that reasons before it generates. Built for robots, AVs, simulation, synthetic data generation.

808

DeepInfra

@DeepInfra

Jun 3

Cosmos 3 Nano: deepinfra.com/nvidia/Cosmos3…

nvidia/Cosmos3-Nano - Demo - DeepInfra

Cosmos3 is a world foundation model that unifies understanding and generation within a single Mixture-of-Transformer (MoT) architecture. Two tightly coupled towers—a Reasoner (vision-language model)...

deepinfra.com

560

DeepInfra retweeted

X Freeze

@XFreeze

May 30

Entire world: We need more GPUs
Meanwhile, Jensen Huang:

1:00

505

663

12,873

1,421,134

DeepInfra retweeted

MiniMax (official)

@MiniMax_AI

Jun 1

Introducing MiniMax M3: The First Open-Weights Model to Combine Three Frontier Capabilities
- Coding & Agentic Frontier: 59.0% SWE-Bench Pro, 66.0% Terminal Bench 2.1, 34.8% SWE-fficiency, 28.8% KernelBench Hard, 74.2% MCP Atlas
- MiniMax Sparse Attention scales context to 1M
- Natively Multimodal from Step Zero
API: platform.minimax.io
Token Plan: platform.minimax.io/subscrib…
🚀New! MiniMax Code: code.minimax.io
Weights & Tech Report in ~10 Days

559

1,154

11,074

4,941,393

DeepInfra

@DeepInfra

Jun 1

We are really excited about Nemotron 3 Ultra.

Artificial Analysis

@ArtificialAnlys

Jun 1

NVIDIA just announced the release of Nemotron 3 Ultra in Jensen Huang's Computex keynote: at 550B parameters (55B active), this is the largest Nemotron 3 model to date, and it is the most intelligent US open weights model.
We partnered with @nvidia to evaluate this model for intelligence and speed. These figures use the model’s BF16 weights, but as with Nemotron 3 Super the model will be made available in NVFP4 quantization as well for higher inference performance.
➤ New leader for US open weights intelligence: Nemotron 3 Ultra scores 48 on the Artificial Analysis Intelligence Index. This is well ahead of the next strongest US open weights models, Gemma 4 31B (39), Nemotron 3 Super (36) and gpt-oss-120b (33), but behind the Chinese-led open weights frontier (Kimi K2.6 at 54).
➤ Leading speed for its intelligence: on a pre-release @DeepInfra endpoint, Nemotron 3 Ultra served over 300 tokens per second. Peer models in its size class from China-based labs such as DeepSeek and Moonshot (Kimi) are generally served at speeds of 50-100 tokens per second in the market today. gpt-oss-120b is served at speeds similar to this level, but with significantly lower intelligence.
➤ Largest Nemotron 3 model so far: at approximately 550 billion total parameters and 90% sparsity, Nemotron 3 Ultra is significantly larger than its siblings and is the largest recent US open weights model release.
We’ll be sharing additional analysis and full benchmarks at release.

528

DeepInfra retweeted

Supermicro

@Supermicro

May 16

CEO Charles Liang Keynote @ Supermicro Innovate!/COMPUTEX

0:10

supermicro.com

1,193

14,408,633

DeepInfra retweeted

NVIDIA AI

@NVIDIAAI

Jun 1

Nemotron 3 Ultra is coming this week. ⌛️

2:11

105

355

3,304

389,098

DeepInfra

@DeepInfra

May 14

The right question, and one too few enterprises are asking. Thanks @realmtbman and @palebluenexus for having our co-founder @nikolaborisof on. Full episode: youtu.be/DS2-iheW6pI

Serving 5 Trillion AI Tokens a Week: Inside DeepInfra with Nikola...

DeepInfra (https://deepinfra.com/) is serving over 5 trillion token...

youtube.com

Yohann Calpu

@realmtbman

May 13

Enterprises ask "is your AI compliant?" The better question: who actually runs the inference?
Nikola Borisov, co-founder of @DeepInfra ($107M Series B raise - including NVIDIA) on @palebluenexus:
"You want to make sure you're not giving it to someone that will give it to someone that will give it to someone. And maybe the final inference happens in China."

1:00

1,433

DeepInfra

@DeepInfra

May 13

"I wasn't sure what we'd build. I just wanted to work with my co-founders. We ended up deciding to do AI infrastructure. It was a great choice." Our CEO @nikolaborisof on Scaling Without Breaking podcast: why the team came before the idea. youtube.com/watch?v=9siruL1p… Check it out on more platforms👇

Where is the Demand for AI?

AI-first companies don't want infrastructure. They want a partner....

youtube.com

684

DeepInfra

@DeepInfra

May 13

Apple: podcasts.apple.com/us/podcas…

Breakthrough AI Operators

Management Podcast · Updated Weekly · Breakthrough AI Operators is a podcast about how the best startup founders are reinventing how their companies work. Not AI hype. Not vendor pitches. Real...

podcasts.apple.com

247