The ultimate AI engineer's toolkit for 2026. Every tool you need - organized by what it actually does. Bookmark this. you'll come back to it 🧵👇

𝜶. VECTOR DATABASES
the backbone of any RAG or semantic search system. you need one of these the moment you start working with embeddings.
@pinecone - fully managed, production-ready. least setup, most reliability.
@weaviate_io - open-source with a clean GraphQL interface
@qdrant_engine - built in Rust. fast, with powerful filtering support
@trychroma - lightweight, ideal for local LLM development
@milvusio - cloud-native, built for large-scale search
@activeloop - AI data lake with versioning and multimodal support
@vectara - managed RAG platform. retrieval and generation in one place

𝜷. ORCHESTRATION & WORKFLOWS
connecting LLMs, tools, memory, and data into pipelines that actually work.
@LangChain - the most widely used LLM application framework
@llama_index - purpose-built for connecting LLMs to your own data
@deepset_ai - production-grade NLP pipeline framework
@DSPyOSS - optimizes your prompts programmatically. no more guessing
@langflow_ai - visual no-code builder for LLM workflows
@FlowiseAI - drag-and-drop LLM chain builder

𝜸. PDF & DOCUMENT EXTRACTION
turning unstructured documents into clean, LLM-ready data.
Docling - converts PDF, DOCX, PPTX, HTML into structured Markdown/JSON
pdfplumber - character-level PDF parsing and table extraction
PyMuPDF - high-performance text and image extraction
Unstructured - parses mixed document types into structured JSON
Camelot - specialized in pulling tables out of PDFs
Llama Parse - document parsing optimized specifically for LLM ingestion
ExtractThinker - schema-mapped intelligent document extraction

𝜹. RAG FRAMEWORKS
tools built specifically around Retrieval-Augmented Generation.
RAGFlow - deep document understanding for open-source RAG
PrivateGPT - fully local document Q&A using open LLMs
AnythingLLM - all-in-one RAG app that works with any LLM backend
Quivr - personal knowledge base powered by Generative AI
txtai - embeddings database for semantic search and pipelines
Llmware - lightweight RAG framework built for enterprise use cases

𝛆. EVALUATION & TESTING
you cannot improve what you do not measure.
Ragas - evaluates RAG pipeline quality end-to-end
DeepEval - unit testing framework for LLM outputs
Phoenix @arizeai - observability and tracing for LLM applications
Opik - full DevOps-style evaluation and monitoring platform
TruLens - tracks and evaluates LLM experiment runs
Giskard - tests for bias, robustness, and safety in ML/LLMs

𝛇. MODEL MANAGEMENT & MLOps
track experiments, version models, manage the full ML lifecycle.
MLflow - industry standard for ML experiment tracking
Weights & Biases @weights_biases - rich dashboards for model training and debugging
DVC @dataversioncontrol - Git-style version control for data and models
ClearML @ClearML - end-to-end MLOps with LLM pipeline support
Hugging Face Hub @HuggingFace - central repo for models, datasets, and demos

𝛈. AGENT FRAMEWORKS
tools to build agents that plan, use tools, and handle multi-step tasks.
Google ADK - modular framework for building AI agents
CrewAI @crewAIInc - orchestrates multiple role-playing AI agents
LangGraph @LangChainAI - builds agents as controllable stateful graphs
AutoGen @Microsoft - Microsoft's multi-agent conversation framework
Pydantic AI - structured agent reasoning built on Pydantic
Smolagents @huggingface - Hugging Face's lightweight agent framework
Letta (MemGPT) @letta_ai - gives your agents persistent long-term memory
Agno - agents with built-in RAG, workflows, and memory

𝛉. LLM FINE-TUNING
adapt pre-trained models to your specific tasks and domains.
Unsloth @unslothai - fine-tune LLMs faster using significantly less memory
Axolotl - flexible post-training pipeline for open models
LLaMA-Factory - streamlined fine-tuning for LLaMA-based models
PEFT @huggingface - parameter-efficient fine-tuning to cut resource needs
TRL @huggingface - reinforcement learning from human feedback (RLHF)
Transformers @huggingface - Hugging Face's core library for pre-trained models
DeepSpeed @Microsoft - helps run training jobs across many GPUs

𝛊. LOCAL DEVELOPMENT & SERVING
run and serve models locally or self-host your own API.
Ollama @ollama - run open-source LLMs locally in a single command
LM Studio - desktop GUI for running and testing local models
llama.cpp - lightweight inference engine across CPU and GPU
LocalAI - self-hosted, OpenAI-compatible API server
@LiteLLM - unified gateway for 100+ LLM providers
vLLM - fast inference and serving engine

𝛋. SAFETY & GUARDRAILS
control, constrain, and stress-test your LLM apps before they go live.
@guardrailsai - adds structured output validation and safety rails
NeMo Guardrails @NVIDIA - NVIDIA's toolkit for programmable LLM conversation controls
Garak - automated vulnerability scanner for LLMs
DeepTeam - red teaming framework to pressure-test LLM applications

that's the full stack. save this thread, and share it with someone building with AI.
12
4
28
4,564
Still sharing datasets via USB sticks or spreadsheets? @YAmirghofran's step-by-step tutorial covers setting up DVC with @Cloudflare R2 — from install to your first dvc push. 👀 #dataversioncontrol hubs.la/Q043ChK00
2
6
196
29 Jul 2025
Superb piece by @alex_woodie of @BigDATAwireNews (aka @datanami) on the impact #dataversioncontrol is having on production #AI and #ML. Includes the details on our most recent round. Worth your time: bigdatawire.com/2025/07/29/l…
1
4
279
30 Jun 2022
Don't miss the talk by @ozkatz100, our Co-founder & CTO, today at the @Databricks Data + AI Summit. Oz will be speaking about how to enable data versioning for billions of objects with lakeFS. #dataaisummit #dataversioning #dataengineering #DataVersionControl
6
5 May 2022
heise | Data management for AI: machine learning versioning with Data Version Control heise.de/ratgeber/Datenmanag… #heiseplus #DataVersionControl

2
2
3 Jan 2022
The lakeFS community wishes everyone a great kick off of the year 2022! #datacommunity #community #2022ishere #lakefs #gitfordata #dataengineering #dataversioncontrol #dataversioning
2
27 Dec 2021
The lakeFS version 0.57.2 is out! This release improves the "merge" operation performance. Find further details at buff.ly/3HfXIpv #lakefs #dataversioning #dataversioncontrol #dataengineering #datalake #bigdata

1
7
11 Nov 2021
Meet one of the #lakeFS contributors — @sumukk! In the lakeFS forum Sumesh answers a few of our questions and tells the story about his contributions to the open source project lakeFS. #opensource #dataversioning #gitfordata #datalake #dataversioncontrol forum.lakefs.io/t/meet-a-lak…
1
2
5 Nov 2021
Did you know that "lakeFS" is a registered skill at LinkedIn?! Feel free to add it as a skill to your LinkedIn profile and let your peers and potential new employers know about it! #lakefs #gitfordata #dataversioning #dataversioncontrol #dataengineering #datalake #bigdata
1
3
DVC (Data Version Control) is an open-source tool that makes data science and machine learning projects easy to reproduce and share. @appsilon bit.ly/3lCZsRW #datascience #machinelearning #dataversioncontrol

3
3
26 Feb 2021
A must-read for everyone who really wants to bring #ml models into production: Continuous Delivery for Machine Learning #cd4ml #mlflow #kubeflow #cicd #gocd #dvc #dataversioncontrol #mleap #onnx Credits to @dtsato @arifwider @intellification martinfowler.com/articles/cd…

1
2
buff.ly/3t81N98 🆕HACKER NOON TOP STORY >>How to Get Started with Data Version Control (DVC), by @CoachRyanV. #StoryOfTheDay #datascience #versioncontrol #dataversioncontrol #dataengineering #dagshub #gi
1
4
Great post from @murillodigital on Data Version Control (DVC) & an intro to DataOps which is 1st in a series of posts [to come] where he'll dig deeper into DVC, DataOps & ML pipelines. murillodigital.com/tech_talk… #DataOps #DevOps #MLOps #DVC #DataVersionControl

5
Huge kudos to the developers of Data Version Control, who realised that there was a need for git-style version control of data files. Easy to install and add to my collaborative workflow. Cheers! #dvc #dataversioncontrol #python #scientificcomputing
1
18 Feb 2020
What would you expect to see from a dataset diffing tool? Last Thursday we demo'd a rudimentary view of diffing in Qri Desktop (to come in future releases). Video available here: youtube.com/watch?v=h20sL8ye… #datadiffer #datadiffing #dataversioncontrol #dataversioning #gitfordata
1
2
8
18 Jun 2019
#PyCon 2019 | #MachineLearning Model And Dataset Versioning Practices #DataVersionControl liwaiwai.com/2019/06/19/pyco…

14
13
@FullStackML Hi! I want to start using your dataversioncontrol tool. Do you have a forum and/or complete API somewhere?
1
1
A new open source tool to explore! Data Version Control in Analytics DevOps Paradigm – dataversioncontrol @gvyshnya bit.ly/2tJAQcI
1
1
6