We had a great evening at the London @PyTorch Meetup, hosted by @Revolut and sponsored by HumanSignal.
Great to connect with the builder community and hear talks across foundation models, pretraining, model evaluation, and agentic AI workflows.
Thanks to everyone who joined and helped bring the event together!
SAM2 vs YOLO for Bounding Box Labeling: Which Should You Use?
- YOLO is best for fast bounding box generation when your categories are already defined.
- SAM2 is better when you need flexibility, precision, and detailed object boundaries.
The right choice depends on your dataset, labeling goals, category stability, and quality requirements.
Read more here: lnkd.in/gVeYzzyB
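One practical difference for box labeling: a segmentation model such as SAM2 returns a mask, so the bounding box has to be derived from it. Here is a minimal sketch of that step, assuming the mask arrives as a 2D grid of 0/1 values. The function name `mask_to_bbox` and the toy mask are illustrative, not part of any library.

```python
def mask_to_bbox(mask):
    """Return (x_min, y_min, x_max, y_max) of the truthy pixels, or None if the mask is empty."""
    # Column indices of every foreground pixel
    xs = [x for row in mask for x, value in enumerate(row) if value]
    # Row indices of every row that contains at least one foreground pixel
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)


# Toy 3x4 mask (hypothetical data): foreground pixels in rows 1-2, columns 1-2
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]
bbox = mask_to_bbox(mask)
```

The box is the tightest axis-aligned rectangle around the mask, in inclusive pixel coordinates.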
Less admin. More signal.
Recent product updates to Label Studio Enterprise tighten the loop between large-scale annotation, review, and quality measurement:
→ Assign members to projects in bulk
→ Label long PDFs page-by-page with thumbnail navigation
→ See label distribution in Analytics — catch class imbalance early
→ Measure agreement with Consensus or Pairwise methods, built for GenAI evaluation
Read the full changelog → humansignal.com/changelog/
Generative models don’t have stable “right answers,” so evaluation can’t rely on a single label.
Disagreement is part of the signal.
This post breaks down how consensus makes that disagreement measurable:
humansignal.com/blog/consens…
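To make "disagreement as signal" concrete, here is a rough sketch of two simple agreement measures for a single item: the share of annotator pairs that match, and the share of annotators behind the majority label. These are generic textbook formulas under our own assumptions, not necessarily how Label Studio computes its Consensus or Pairwise scores. The function names and labels are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def pairwise_agreement(labels):
    """Fraction of annotator pairs that chose the same label for one item."""
    pairs = list(combinations(labels, 2))
    if not pairs:
        return 1.0  # a single annotation trivially agrees with itself
    return sum(a == b for a, b in pairs) / len(pairs)


def consensus(labels):
    """Majority label and the share of annotators who chose it."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


# Hypothetical votes from three annotators rating one model response
votes = ["helpful", "helpful", "unhelpful"]
agreement = pairwise_agreement(votes)
majority_label, majority_share = consensus(votes)
```

With these votes, one of the three annotator pairs agrees, while two of the three annotators back the majority label. The two numbers answer different questions, which is why it helps to look at both.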
AI programs often have plenty of dashboards, metrics, and experiments.
Even then, it can be difficult to answer what’s working, what needs to change, and what is ready to ship.
This blog post outlines a simpler way to measure what actually matters, so teams can move faster and run systems in production with more clarity.
humansignal.com/blog/the-5-m…
LangSmith traces + human judgment = better agentic AI
Now you can pull LangSmith traces directly into Label Studio and close the loop on AI quality.
Get the template ⤵️
docs.humansignal.com/tutoria…
Our new template for @langfuse gives you a powerful interface for evaluating agentic AI. 🪢
- Filter by user, assistant or tool
- Label for issues, verdicts, severity and expected behavior
New step-by-step tutorial in the Label Studio docs:
docs.humansignal.com/tutoria…
We spent last week at @PyTorch Conference Europe 2026 talking to teams building AI systems already in production.
Big shift in the conversations:
→ Less focus on models
→ More focus on evaluation, reliability, and real-world performance
We also co-hosted an Open Source AI Soirée with Docling: a room full of people sharing practical lessons from the field (see pictures below!)
The open source AI community is just getting started 🚀
Kicking off #PyTorchCon Europe 2026 tomorrow in Paris!
We’re hosting the Open Source AI Soirée with Docling tomorrow evening. If you’re working on training, evaluation, or production AI systems, it would be great to connect! Register here: luma.com/ya2wihmc
🇫🇷 Heading to the #PyTorch Conference in Paris? We are.
Label Studio is sponsoring PyTorch Conference Europe on April 7–8.
Come find us if you want to talk:
• LLM evaluation
• human feedback workflows
• dataset quality
🚀 Getting back into sharing more about what we’re building at Label Studio.
From LLM evaluation → human-in-the-loop workflows → dataset curation.
For more about AI systems, follow along 👇
labelstud.io
We’re excited to sponsor @PyTorch Conf EU!
Join the Label Studio and #Docling teams for drinks & bites to talk all things Open Source AI:
Tuesday, April 7th at 18:30 CEST
Register to attend. Venue is walking distance from Station F on the Seine. luma.com/ya2wihmc
Our latest open source release: Label Studio 1.23 🔥
• Vector annotation
• Interactive task source
• Improved data manager
Built for real-world data labeling and AI evaluation workflows.
Release notes & quick start: labelstud.io/blog/label-stud…
1/ 🚨 New Tutorial Alert! Dive into the world of Named Entity Recognition with our latest guide on fine-tuning generalist models like GLiNER for better performance.
3/ 🔧 Ready to fine-tune your NER model? Follow Mikaela Kaplan’s steps in our latest article to continuously improve your results. Check it out here:
labelstud.io/blog/fine-tunin…
1/ Integrating Large Language Models (LLMs) into data labeling is a game-changer for efficiency. Combining prompt refinement with data labeling moves dataset annotation from a static process to a dynamic, interactive one.
2/ In this tutorial, we introduce a prompt-centric workflow and show how to apply it to classifying chat dialogue intent. The workflow streamlines annotation and continually improves the quality of both the dataset and the prompt.
3/ We’ve also included a notebook for you to run this example yourself. From preliminary setup to connecting to the ML Backend, we guide you through the main steps. Check it out! labelstud.io/blog/automate-d…