Yo, with the help of the X algorithm, I'm starting a 30-day challenge to connect with people who are interested in:
- Frontend
- Backend
- Full stack
- DevOps
- AI/ML
- Data Science
- UI/UX
- App Development
- Blockchain
- React/Next.js
- JavaScript/TypeScript
#letsconnect
8/9 Operational polish:
✅ resumable downloads (mldata pull …)
✅ incremental builds (--incremental)
✅ auth helpers (mldata auth …)
✅ config layers (./mldata.yaml, ~/.config/mldata.yaml, …)
✅ diagnostics (mldata doctor)
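For readers wondering what "config layers" usually means in practice, here is a minimal sketch of layered configuration merging. It is not mldata-cli's implementation: the precedence order (later layers win), the `merge_layers` helper, and every key name below are assumptions made for illustration, and the layers are plain dicts rather than parsed YAML files.

```python
from typing import Any, Dict


def merge_layers(*layers: Dict[str, Any]) -> Dict[str, Any]:
    """Merge config layers left to right; later layers override earlier ones.

    Nested dicts are merged key by key instead of being replaced wholesale.
    """
    merged: Dict[str, Any] = {}
    for layer in layers:
        for key, value in layer.items():
            if isinstance(value, dict):
                # Merge into the existing nested dict if there is one.
                base = merged.get(key)
                merged[key] = merge_layers(base if isinstance(base, dict) else {}, value)
            else:
                merged[key] = value
    return merged


# Hypothetical layers: built-in defaults, a user-level file, a project-level file.
defaults = {"output": {"format": "parquet", "compression": "zstd"}, "seed": 0}
user_level = {"seed": 42}
project_level = {"output": {"format": "csv"}}

config = merge_layers(defaults, user_level, project_level)
```

Here the project-level layer changes only the output format, while the compression default and the user-level seed both survive the merge.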
7/9 Reproducibility is first-class: every build produces a manifest.
Rebuild from it:
mldata rebuild ./imdb/manifest.yaml --output ./imdb-rebuilt
Compare builds: mldata diff … and detect drift via PSI.
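PSI here presumably refers to the Population Stability Index, a common drift score that compares how two samples of one feature distribute over shared bins. The sketch below shows the standard formula, sum over bins of (actual − expected) × ln(actual / expected). It is not taken from mldata-cli: the `psi` function, the equal-width binning, the bin count, and the `eps` floor for empty bins are all assumptions.

```python
import math
from typing import List, Sequence


def psi(expected: Sequence[float], actual: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index between two non-empty numeric samples.

    Bin edges are equal-width over the range of `expected`; values of
    `actual` outside that range are clamped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant samples

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each proportion at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


baseline = [i / 100 for i in range(100)]
shifted = [v + 0.5 for v in baseline]

same_score = psi(baseline, baseline)   # identical samples give 0.0
drift_score = psi(baseline, shifted)   # a shifted sample gives a positive score
```

A common rule of thumb treats scores below about 0.1 as stable and above about 0.25 as significant drift, though the thresholds are conventions rather than part of the formula.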
6/9 Export like you mean it:
mldata export ./imdb --formats parquet,csv,jsonl --compression zstd:3
Need loader code? Generate PyTorch / TensorFlow / JAX templates.
5/9 Quality checks and reports:
mldata validate ./imdb --checks all --report ./report.md
Bonus: file integrity checks for images/audio and optional sampling for big datasets.
4/9 One-command dataset build (end-to-end):
mldata build hf://stanfordnlp/imdb --output ./imdb --format parquet --split 0.8,0.1,0.1 --seed 42
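To illustrate what a seeded 0.8/0.1/0.1 split generally does, here is a minimal sketch that shuffles row indices with a fixed seed and cuts them into train, validation, and test sets. It is not how mldata-cli implements `--split` or `--seed`: the `split_indices` helper, the use of Python's `random` module, and the choice to give rounding leftovers to the last split are assumptions.

```python
import random
from typing import List, Sequence, Tuple


def split_indices(n: int, ratios: Sequence[float] = (0.8, 0.1, 0.1),
                  seed: int = 42) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle indices 0..n-1 with a fixed seed and cut them into three parts."""
    if len(ratios) != 3 or abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must be three values that sum to 1")
    indices = list(range(n))
    # A dedicated Random instance keeps the result independent of global state.
    random.Random(seed).shuffle(indices)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # any rounding leftovers land here
    return train, val, test


train, val, test = split_indices(100)
again = split_indices(100)  # same seed, same split
```

Because the shuffle depends only on `n` and `seed`, rerunning with the same arguments reproduces the same three index lists.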
3/9 Install sanity check:
pip install mldata-cli
mldata version
Then discover datasets:
mldata search "sentiment analysis" --source hf --limit 10
2/9 If dataset work feels like:
- source sprawl
- broken downloads
- format chaos
- “how did we build this?”…
mldata-cli turns it into a pipeline: fetch → normalize → validate → split → export.
1/9 Introducing mldata-cli v0.4.0, a unified, reproducible CLI to acquire and prep ML datasets from HuggingFace, Kaggle, OpenML, and local files.
Repo: github.com/NoeFlandre/mldata…
🚨 Hiring: Data Engineer (India 🇮🇳 | AI Lab) Mercor is hiring a Data Engineer for a leading AI research lab 🧠 You’ll build resilient ETL/ELT pipelines, enforce data contracts, and power ML-ready datasets at scale. 🔧 Stack: Python, SQL, pandas, Postgres, Spark/DuckDB, Airflow 💡 If you love clean schemas, versioned data & distributed pipelines — this is for you. Apply now 👉 work.mercor.com/jobs/list_AA… #Hiring #DataEngineering #AIJobs #MLData #Python #Spark #RemoteJobs

As the year begins, our focus remains unchanged: thoughtful execution, consistent quality, and data our clients can rely on. Thank you to the teams who trust us to support their work with care and precision. #DataQuality #AIInfrastructure #LongTermPartnership #MLData
17 Nov 2025
Robotics datasets look flawless in sim, but once they hit real-world dust it's a 25% accuracy nosedive. @huggingface reports annotation errors tank 3x more deploys than code bugs. What's the edge case that's derailed your latest build? Reply with your story – let's swap fixes. #MLData #Robotics
11 Nov 2025
60% of your model's flops? Not code – it's junk labels from rushed crews. Hugging Face stat: Annotation errors tank deploys 3x more than bad algos. Who's debugging 'why this edge case ghosted' this week? #MLData #AIHype
Is your current annotation partner keeping up? Fraction AI provides modern, flexible solutions designed for the dynamic demands of state-of-the-art AI models, especially in text generation. @FractionAI_xyz #StateoftheArtAI #MLData #tech
Boost your ML projects with LabelGPT! 🚀 Generate labeled data at lightning speed, accelerating your machine learning tasks 99 times faster! #LabelGPT #MLData #AI More awesome AI tools at: appsandwebsites.com/director…
17 Sep 2025
Thanks, Modi, for making mobile data super cheap by collaborating with Reliance and co-pioneering cheap mldata and internet. This has boosted startups and the startup ecosystem in India, paved the way for UPI, and narrowed the digital device gap, which lifted millions out of poverty.
Precise BPO Solution offers fully managed, on-demand data labeling and human evaluation services tailored for high-performance AI. Your trusted data labeling partner. #DataLabeling #DataAnnotation #MachineLearning #AITrainingData #ArtificialIntelligence #MLData #ComputerVision