The World’s GPU Compute at Your Fingertips. Acquired by @NVIDIA.

Joined January 2024
Lepton AI retweeted
30 Jun 2025
Xfinity: Make something nobody wants
13 Jun 2025
DGX Cloud Lepton is a new layer that standardizes AI inference across multiple cloud providers, offering a unified interface and automatic workload routing.
📣 Announcing a unified AI platform connecting developers to thousands of GPUs worldwide: NVIDIA DGX Cloud Lepton (Early Access). Build, train, and deploy AI apps at scale—faster and easier than ever. Learn more & join for early access: nvda.ws/4kOaxLV
Lepton AI retweeted
13 Jun 2025
7/ NVIDIA and Hugging Face offer DGX Cloud Lepton for instant global GPU access. Train, fine-tune, and deploy models at scale with ease. Fast, flexible, and collaborative.
Lepton AI retweeted
🚨 NVIDIA launches DGX Cloud Lepton to commoditize inference compute across clouds, threatening neocloud margins.
DGX Cloud Lepton is a new layer abstracting inference compute across multiple neoclouds. It gives users a consistent interface while automatically routing workloads across providers.
→ The goal is to make inference compute a commodity, similar to what Uber did for taxi services. This strips differentiation from neoclouds and creates pricing pressure, reducing their margins.
→ Lepton’s real innovation is turning multi-cloud inference into a seamless, interoperable platform. It raises performance per dollar for users, while keeping NVIDIA’s margins untouched.
@NVIDIAAIDev
Lepton AI retweeted
7 Nov 2024
We've achieved >99.5% uptime for large-scale GPU clusters, thanks to a great collaboration between @LeptonAI and @digitalocean. This is much better than industry-standard SLAs, which hover around 98%. It's done via proactive monitoring solutions like our open-source GPUD, the cloud-native platform, and close collaboration between the engineering teams. Learn more at blog.lepton.ai/achieving-99-…, and send a message to info@lepton.ai if you need high-performance, cloud-native, production-grade AI infra!

Lepton AI retweeted
Talk to Llama 3.2-3B 🦙🗣️⚡️ Powered by @LeptonAI (blazing fast LLM inference, ASR, and TTS all in one!) and @Gradio's ergonomic WebRTC Streaming ⚡️ Building this took me about 30 minutes despite never using Lepton before.
Lepton AI retweeted
Achieving more than 99.9% uptime and quick turnaround times for collaboration between teams after partnering with #DigitalOcean, @LeptonAI’s CEO, Yangqing Jia, is realizing his goal of growing 10x over the next year. 🚀 Watch to learn how ⤵ youtube.com/watch?v=NLtQHgxb…
Lepton AI retweeted
18 Jun 2024
We are so proud to announce our extended partnership with FastGPU @fast_gpu via AI OG innovators, the mighty LeptonAI @LeptonAI. Now you can deploy on-demand RTX 4090s with Enterprise AI Infrastructure IN SECONDS with Exabits on FastGPU. Just pay for what you use, as you go. ~a thread~
Lepton AI retweeted
11 Apr 2024
Introducing Samba-CoE v0.3, our latest Composition of Experts (CoE) model that surpasses DBRX by @DbrxMosaicAI and Grok-1 314B by @xAIGrokInu on the OpenLLM Leaderboard @huggingface! 🏆 Samba-CoE-v0.3 is now available on @LeptonAI @jiayq, try now: lepton.ai/playground/samba-c…. #AI
Lepton AI retweeted
25 Jan 2024
.@LeptonAI surpasses all other providers in throughput (P50 & P90) for both Llama-2-70B and Mixtral on a small service load for short-input, long-output prompts. A P50 of 130 tokens/s is the fastest throughput we've observed among all model offerings by all providers. View this scenario live: leaderboard.withmartian.com/…