Colab here with an example, enough to enjoy something nice with datago and WDS (webdataset): the streaming starts almost instantly, courtesy of a nice property of tar: you don't need to pull the whole tarball to decode some of its content. colab.research.google.com/dr…
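A minimal sketch of that tar property, using only the Python standard library and not datago's own code. The in-memory archive stands in for a remote shard, and `make_tar`, `CountingReader` and `first_sample` are hypothetical helper names made up for this illustration. Opening the archive in stream mode (`"r|"`) reads it strictly front to back, so the first sample is decoded after consuming only a small prefix of the bytes.

```python
import io
import tarfile


def make_tar(files: dict[str, bytes]) -> bytes:
    """Build an uncompressed tar archive in memory (stand-in for a remote shard)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


class CountingReader:
    """Forward-only reader that records how many bytes were pulled from the source."""

    def __init__(self, data: bytes) -> None:
        self._src = io.BytesIO(data)
        self.bytes_read = 0

    def read(self, size: int = -1) -> bytes:
        chunk = self._src.read(size)
        self.bytes_read += len(chunk)
        return chunk


def first_sample(stream) -> tuple[str, bytes]:
    """Return the first member of a tar stream without reading the rest of it."""
    # "r|" is tarfile's stream mode: no seeking, members are read in order.
    with tarfile.open(fileobj=stream, mode="r|") as tar:
        for member in tar:
            return member.name, tar.extractfile(member).read()
    raise ValueError("empty archive")


if __name__ == "__main__":
    # Eight fake 64 KiB "images"; only the first one is decoded.
    shard = make_tar({f"{i:04d}.jpg": bytes([i]) * 65536 for i in range(8)})
    reader = CountingReader(shard)
    name, payload = first_sample(reader)
    print(name, len(payload), "bytes decoded;", reader.bytes_read, "of", len(shard), "bytes pulled")
```

With a real remote shard the same idea applies to any file-like object that only supports forward reads, such as an HTTP response body.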
Quick update to datago to start 2026 on the right foot: loading data faster than ever in the webdataset format, direct from remote storage :) Attached is a comparison with Python webdataset on an EPYC server with v2026.1.2, on PD12M. Both provide PIL images and metadata in Python
Very narrow project in this sea of horrible news, still grinding the speed on datago (@photoroom_app infra project), open source dataloader for python written in rust, focused on images. Now 3000 img/s (IN1k) on an old laptop (code in repo), fast enough @giffmana? :)
1 Aug 2025
Datago - Branding & Logo ⚙️
1 Aug 2025
Datago - Branding & Logo 📦
31 Jul 2025
Datago - Branding & Logo ⚙️
31 Jul 2025
Datago - Analytics Landing Page 🖥️
28 Jul 2025
Datago - Analytics Landing Page 💻
28 Jul 2025
Datago - Analytics Landing Page 💲
This has shipped since: datago wheels are webdataset-compatible out of the box. There is an example in the repo, and early benchmarks show it to be much faster than the Python lib (without the need for extra Python processes), but your mileage may vary, so best is to give it a shot :) cc @lhoestq
Batteries included webdataset dataloading, direct from python and without any subprocesses :) (this is for you @vikhyatk) Wheels already available if you're interested, you can go fish them from the GHA link. Not tested at scale yet, but well, it's Rust github.com/Photoroom/datago/…
🚀 #DolphinDB is proud to co-sponsor #QuantTalks Hong Kong 2025 with #Datago! 📅 May 14 | 🕔 4:30–9PM | 📍 Central Hong Kong Hear from top experts on #AI, #NLP, market sentiment & real-time infrastructure. 🔗 Register: dolphindb.com/events/quant-t…
Replying to @BenTheEgg
haha "full rewrite in rust" is such a nerd thing! But I get the motivation. The speeds mentioned, both from datago but also torch, are pretty abysmal though! We usually got 1-2k img/sec. However, as long as it creates a batch faster than one train step, who cares.
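The "faster than one train step" criterion can be written as a one-line check. This is only a sketch: `loader_keeps_up` is a hypothetical helper, and the batch size and step time below are made-up numbers, not figures from the thread (only the 1k img/sec order of magnitude is).

```python
def loader_keeps_up(images_per_second: float, batch_size: int, step_seconds: float) -> bool:
    """True if the loader produces one batch in no more time than one training step takes."""
    seconds_per_batch = batch_size / images_per_second
    return seconds_per_batch <= step_seconds


if __name__ == "__main__":
    # Hypothetical setup: 1000 img/s loader, batches of 256, 0.5 s per training step.
    print(loader_keeps_up(images_per_second=1000, batch_size=256, step_seconds=0.5))
```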
This has landed: `pip install datago` (starting from 2025.3.1) is now in Rust. Small interface changes, sorry about that, trying to stay close to the root language (Golang and Rust best practices differ). Filesystem support is still there, and if anything it's faster 😅
datago rewrite in Rust, will probably post a learnings thread on bsky, but let's say Rust is much nicer to write than I remembered (maybe just means I'm old and wise, no idea). WIP, but feature-wise everything works, needs plumbing now github.com/Photoroom/datago/…
large burial mound. Its age is estimated at about 2,600 years. The peoples of Celtic culture settled in southwestern Germany are known for raising "princely burial mounds", among which the one at Riedlingen stands out, dated to between 620 BC and
Replying to @Poyonoz
Loading and decoding the JPEG, but datago supports resizing and aspect ratio bucketing. It wouldn't slow things down here I think, because it looks like I'm IO limited (M2 SSD), the CPU is not even maxed out
Computers are fast these days.. I love python but I think we're sitting on some perf around dataloading. Getting 1000img/s on IN1k on a laptop using datago, and that's without pre-processing a la FFCV, just lowering the whole process and exposing an iterator.
Datago updated with some CI, still missing CD to autodeploy to PyPI (an older version is up). github.com/Photoroom/datago For people in my TL who are interested, what would be more useful?
20% Dummy DB service in tests
80% Open sourcing real DB
5 votes • Final results
25 Sep 2024

There's a pre-built binary on PyPI (`pip install datago`), requires libvips and libjpeg-turbo, only compatible with Python 3.11 at the moment (needs CD..).