ceo @bageldotcom. previously code monkey at amazon, cashapp, instacart. hiring - dms are open.

Joined January 2021
429 Photos and videos
Pinned Tweet
May 28
We're releasing Paris 2.0, which, to our knowledge, is the world's first decentralized-trained video generation model. We benchmarked it against a monolithic model trained on the same data and compute budget, and Paris 2.0 outperformed the monolithic model by ~2x on the FVD benchmark.
88
114
667
449,730
world cup where sovereign ai labs from countries compete live on their model benchmarks
6
146
This is a scary place to find ourselves in, and for many of us in decentralized systems and technologies, it is the exact thing that brought us into the industry. @AnthropicAI's export controls are the most anti-American thing I can possibly think of and set a dangerous precedent. Jake wrote a great take on this, and @GPC_xyz did our small part backing a great team in @bidhan and @bageldotcom. Decentralized compute and AI infra, along w/ sovereign OS AI, is essential, and I hope a lot of people are waking up to that fact this AM. What a time to be alive...
Unlike many investors in crypto, I did not pivot to AI in the last few years. However, since 2020, I built some of the deepest understanding in this industry on the intersection of AI and decentralized networks (crypto, web3). From the start, it was very clear that AI models are a centralizing force and the biggest target for government control. That point became market fact last night, with @AnthropicAI’s export control compliance. As an investor in decentralized AI, I know that d-networks are a counterbalance to this state of affairs. In particular, the starting point of sovereign, open, public, decentralized AI is the seemingly insurmountable compute problem. How are people supposed to source more industrial compute for frontier training than these huge trillion dollar companies? The answer is simple: there is enough commodity GPU compute in the world to compete on the frontier, but to make use of it we need new algorithms for training. That’s what a few companies like @gensynai @PrimeIntellect @bageldotcom @Pluralis @NousResearch @MacrocosmosAI @covenant_ai set out to research, while everyone on the planet told them it was impossible. The result is that it is not only possible, but it can be cheaper and nearly as efficient as the alternative process. The second major problem is economic sustainability. Open source models are great, however, they are not economically viable as they don’t have a business model. So far in decentralized AI, only @Pluralis has an answer: by breaking up the weights of the model among participants, we create a business model for tokenized AI models. This is the moment of truth: will AI become fully centralized and fall under censorship and unilateral government control? Or will the AI world realize the importance of public AI on open decentralized networks?
1
4
413
Jun 12
Thanks for the shoutout alongside Perplexity, Pika, and ElevenLabs. @TheRundownAI 🙌
1
4
306
Jun 11
if you haven't seen this message yet, unfortunately you're not working on something interesting.
3
117
Jun 10
the whole thesis of decentralized training is more raw FLOPs from consumer devices and less reliance on high memory. and diffusion is uniquely capable of utilizing more flops and less memory.
1
1
4
409
Jun 10
Diffusion is taking over the local/owned compute category by storm. DiffusionGemma architecture is significantly better for running local models.
Meet DiffusionGemma! An experimental open model that explores a fast approach to text generation, released under an Apache 2.0 license. Moving beyond sequential, token-by-token processes to generate entire blocks of text simultaneously. Here’s what’s new with DiffusionGemma: 👇
1
11
2,431
Jun 6
restocked shirts, come get before we run out! @cvpr
1
15
1,026
Jun 5
.@cvpr come by the @bageldotcom booth before we run out of merch!
2
11
715
bidhan retweeted
I'm presenting Heterogeneous Decentralized Diffusion Models tomorrow at #CVPR2026! We train diffusion experts on separate single GPUs with no gradient sync, mixing DDPM and Flow Matching objectives, with each expert fitting whatever data shard it owns. The trick: a closed-form, training-free, schedule-aware ε→v conversion fuses the mismatched objectives into one velocity space at test time, with a lightweight router picking which experts denoise what. It beats homogeneous ensembles on both quality and diversity. We see this as a step toward making large-scale generative training genuinely decentralized - no data center, no interconnect, just contributors with single GPUs. Come talk to us about where this goes next, including scaling it to video and world models 👋 📄 Paper: arxiv.org/abs/2603.06741 📍 Poster session 1, tomorrow, 10:45–12:45, Exhibit Hall 👋
4
11
641
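To make the ε→v idea in the tweet above concrete, here is a minimal sketch of the textbook identity that turns a noise prediction into a velocity. It assumes the forward process x_t = alpha(t)·x0 + sigma(t)·eps and the linear flow-matching path alpha = 1 − t, sigma = t. The function names `eps_to_velocity` and `linear_path` are hypothetical, and this is not claimed to be the paper's schedule-aware conversion. A DDPM expert trained on a different noise schedule would also need its timesteps mapped onto the flow-matching time axis, which is not shown here.

```python
def linear_path(t):
    """Assumed flow-matching path: alpha = 1 - t, sigma = t, plus their time derivatives."""
    return 1.0 - t, t, -1.0, 1.0


def eps_to_velocity(x_t, eps_hat, alpha, sigma, d_alpha, d_sigma):
    """Convert a noise prediction into a velocity for x_t = alpha*x0 + sigma*eps.

    Step 1: recover the implied clean sample x0_hat = (x_t - sigma*eps_hat) / alpha.
    Step 2: differentiate the path in time: v = d_alpha*x0_hat + d_sigma*eps_hat.
    """
    x0_hat = [(x - sigma * e) / alpha for x, e in zip(x_t, eps_hat)]
    return [d_alpha * x0 + d_sigma * e for x0, e in zip(x0_hat, eps_hat)]


# Toy check with a perfect noise prediction: on the linear path, v should equal eps - x0.
x0 = [0.5, -1.0]
eps = [0.2, 0.3]
alpha, sigma, d_alpha, d_sigma = linear_path(0.25)
x_t = [alpha * a + sigma * b for a, b in zip(x0, eps)]
v = eps_to_velocity(x_t, eps, alpha, sigma, d_alpha, d_sigma)
expected = [e - a for a, e in zip(x0, eps)]
```

Because the conversion is closed-form, it needs no extra training. Once every expert's output is in velocity space, a router can blend or select among them, as the tweet describes.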
bidhan retweeted
India has sent 96 people to America who started billion dollar companies. No one else is even close. There's only about 5 million Indians in America. Almost one in 50,000 of them is a unicorn founder! What a holy, special, beautiful people. I will always fight for them.
1,114
2,609
14,220
1,005,840
Jun 4
bagel labs news cvpr edition
1
9
342
Jun 3
Jun 3
heading to @CVPR with the Bagel Labs team. if you want some elite merch (we spent a month designing it), discuss world models, physical ai, distributed training -- I'm your guy 🫡
1
2
11
979
bidhan retweeted
If you’re over the age of 30 and going to tech week parties Why
71
22
422
101,322
Jun 3
heading to @CVPR with the Bagel Labs team. if you want some elite merch (we spent a month designing it), discuss world models, physical ai, distributed training -- I'm your guy 🫡
1
1
16
1,742
May 31
what if I told you there’s a subsection of the AI industry which has 1000x the TAM of LLMs?
2
9
347
May 30
a friend let me know that bagel labs was trending yesterday
2
10
615
May 28
come to the bagel labs paper session and booth at CVPR!
we found that the decentralized diffusion framework works for video generative models too! Still an early attempt, and a lot of open research questions left to explore. Would love to dig into it more next week at CVPR :)
5
770
May 28
The quality of the videos themselves is far from SOTA models like VEO, Seedance, etc. But the point we wanted to prove is that the video generation objective, with its nuances like temporal coherence, the time dimension, and character consistency, can be trained in a distributed way without shared clusters. And the mathematical evidence shows that this recipe will scale.
1
3
23
2,182
bidhan retweeted
Today we're releasing Paris 2.0, to our knowledge the first decentralized-trained video generation model. At Bagel Labs, we believe frontier models should not require homogeneous clusters of premium, supply constrained GPUs. Paris 1.0 proved this for image generation. Paris 2.0 extends the recipe to video generation and lays the substrate for global-scale world models. To test the approach, we trained two models head-to-head in an iso-FLOP, iso-data comparison. One was a monolithic model trained conventionally, on a single premium GPU cluster. The other was Paris 2.0, trained across an extreme mix of GPU types, generations, and vendors distributed around the globe. Against the monolithic model under matched data and compute, the results were: FVD: 561.04 → 279.01 (a ~2x improvement) CLIP text-video alignment and aesthetic score both improved. To our knowledge, this is the first distributed training architecture to surpass its monolithic counterpart under matched data and compute. Technical Report: arxiv.org/abs/2605.26064 Model Weights: huggingface.co/bageldotcom/p…
4
6
30
5,675
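As a quick check on the "~2x" figure quoted above, the sketch below computes the ratio and the relative reduction from the two FVD numbers in the announcement. It relies on FVD being a distance metric where lower is better. The helper name `fvd_improvement` is hypothetical and not part of any Bagel Labs tooling.

```python
def fvd_improvement(baseline, candidate):
    """Return (ratio, percent reduction) for a lower-is-better metric such as FVD."""
    ratio = baseline / candidate
    reduction_pct = 100.0 * (baseline - candidate) / baseline
    return ratio, reduction_pct


# Figures quoted in the announcement: monolithic 561.04, Paris 2.0 279.01.
ratio, reduction_pct = fvd_improvement(561.04, 279.01)
print(f"ratio: {ratio:.2f}x, reduction: {reduction_pct:.1f}%")
# prints "ratio: 2.01x, reduction: 50.3%"
```

So "~2x" here means the reported FVD roughly halved under the matched data and compute budget.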
bidhan retweeted
we're releasing Paris 2.0, the first video generation model trained across decentralized GPUs. instead of relying on one massive expensive cluster, Paris 2.0 was trained on a mix of GPUs distributed around the world - and it outperformed the traditional setup by ~2x. so proud of our team 🥯
1
2
11
875