@printfresh co-founder. Living in and making Philly better.

Joined April 2008
35 Photos and videos
10 Sep 2025
An inspirational few days in Los Angeles. Thanks to these guys for putting this on. @friedberg
2
11
823
leovoloshin retweeted
11
12
68
136,441
13 May 2025
Does anyone know when @GeminiApp for workspaces will have memory? My personal @gmail account's Gemini already does but the workspace at our company does not...
4
466
2 Apr 2025
1
3
635
28 Jan 2025
What’s the best AI note taker for company-wide use? Ideally something that has access control. We’ve gone between read.ai, tactiq.io, and otter.ai (we run on Google Workspace). Ideally it could be fed into some sort of master AI to continuously build context for what’s going on at the company.
6
7
1,196
leovoloshin retweeted
1) DeepSeek r1 is real with important nuances. Most important is the fact that r1 is so much cheaper and more efficient to inference than o1, not from the $6m training figure. r1 costs 93% less to *use* than o1 per each API, can be run locally on a high end work station and does not seem to have hit any rate limits which is wild. Simple math is that every 1b active parameters requires 1 gb of RAM in FP8, so r1 requires 37 gb of RAM. Batching massively lowers costs and more compute increases tokens/second so still advantages to inference in the cloud. Would also note that there are true geopolitical dynamics at play here and I don’t think it is a coincidence that this came out right after “Stargate.” RIP, $500 billion - we hardly even knew you.

Real:
1) It is/was the #1 download in the relevant App Store category. Obviously ahead of ChatGPT; something neither Gemini nor Claude was able to accomplish.
2) It is comparable to o1 from a quality perspective although lags o3.
3) There were real algorithmic breakthroughs that led to it being dramatically more efficient both to train and inference. Training in FP8, MLA and multi-token prediction are significant.
4) It is easy to verify that the r1 training run only cost $6m. While this is literally true, it is also *deeply* misleading.
5) Even their hardware architecture is novel and I will note that they use PCI-Express for scale up.

Nuance:
1) The $6m does not include “costs associated with prior research and ablation experiments on architectures, algorithms and data” per the technical paper. “Other than that Mrs. Lincoln, how was the play?” This means that it is possible to train an r1 quality model with a $6m run *if* a lab has already spent hundreds of millions of dollars on prior research and has access to much larger clusters. Deepseek obviously has way more than 2048 H800s; one of their earlier papers referenced a cluster of 10k A100s. An equivalently smart team can’t just spin up a 2000 GPU cluster and train r1 from scratch with $6m. Roughly 20% of Nvidia’s revenue goes through Singapore. 20% of Nvidia’s GPUs are probably not in Singapore despite their best efforts.
2) There was a lot of distillation - i.e. it is unlikely they could have trained this without unhindered access to GPT-4o and o1. As @altcap pointed out to me yesterday, kinda funny to restrict access to leading edge GPUs and not do anything about China’s ability to distill leading edge American models - obviously defeats the purpose of the export restrictions. Why buy the cow when you can get the milk for free?
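
The "simple math" in the post above can be sketched in a few lines. This is only an illustration of the post's own rule of thumb (1 byte per parameter in FP8, so 1 GB per billion active parameters). The function name and the FP16 comparison are assumptions added here, not part of the post, and the estimate covers active parameters only, not KV cache, activations, or any inactive weights.

```python
def active_weight_memory_gb(active_params_billions: float,
                            bytes_per_param: float = 1.0) -> float:
    """Rough memory for the active weights, in GB.

    Follows the post's rule of thumb: at FP8 each parameter takes
    1 byte, so 1 billion parameters need about 1 GB.
    Hypothetical helper for illustration only.
    """
    return active_params_billions * bytes_per_param


# 37 billion active parameters is the figure implied by the post.
fp8_gb = active_weight_memory_gb(37)         # FP8: 1 byte per parameter
fp16_gb = active_weight_memory_gb(37, 2.0)   # FP16: 2 bytes (assumed comparison)

print(f"FP8:  {fp8_gb:.0f} GB")
print(f"FP16: {fp16_gb:.0f} GB")
```

Halving the bytes per parameter halves the estimate, which is why the post singles out FP8.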
224
1,425
8,966
3,350,311
13 Dec 2024
People’s inability to write a cover letter is astounding. I hope they realize that using ChatGPT to write one is easily spotted and an instant disqualification. That people would use this while applying for a $150k position is honestly shocking. Every single one has a third paragraph that starts with: What excites me most about Printfresh is…
2
13
701