MLflow

@MLflow

Jun 9

Genie → MLflow traces 👇
🔹 Ingest space conversations
🔹 One trace per conversation
🔹 Base layer for judges/QA
📕 Read the cookbook: mlflow.org/cookbook/genie-tr… #MLflow #Genie

Jun 8

Production trust gap: LLMs can leak PII, produce harmful content, or violate content policies. 👇
🔹 App filters don't scale across agent LLM calls
🔹 One bug in app code can skip them
🔹 AI Gateway = centralized, consistent guards
Full video: youtube.com/live/A1aNFvApZv8… #AIGateway #LLMOps

Jun 4

MLflow 3.12 deep dive clip: why coding agents need tracing 👇
Yuki Watanabe walks through what shows up in the trace when you turn it on:
🔹 Every turn, tool call (Read, Bash, Edit), and sub-agent step
🔹 Token usage and latency per span, including cache breakdown

Jun 4

🔹 Full sessions grouped together, so long conversations stay debuggable
🔹 MLflow 3.12 tracing for Claude Code, Codex, Gemini CLI, OpenCode, Qwen Code, and OpenHands
🎥 Full webinar: youtube.com/live/A1aNFvApZv8… #MLflow #CodingAgents

Deep Dive into MLflow 3.12 Features for AI Observability and Quality

We continue making immense improvements for overall AI observabilit...

youtube.com

Jun 4

Genie fixes from failed evals 👇
🔹 Traces + space config in
🔹 LLM suggests concrete edits
🔹 Shorten signal-to-patch time
📕 Read the cookbook: mlflow.org/cookbook/genie-sp… #MLflow #GenAI #Genie

Jun 3

MLflow 3.13.0: RBAC Admin UI for self-hosted servers 👇
🔐 Roles as reusable permission bundles
🖥️ Admin UI (no REST endpoints)
📦 Experiments, models, prompts, scorers, Gateway
Release highlights: mlflow.org/releases/3.13.0/ #MLflow #MLOps

Jun 3

MLflow 3.13.0 is a major update that runs AI observability at scale, focusing on access control, the lifecycle of your trace data, and richer support for agents. 🙌 🔗Check out the highlights of the release: mlflow.org/releases/3.13.0/ #mlflow #opensource #linuxfoundation

Jun 2

LLM judges for Genie traces 👇
🔹 Built-in baseline judges
🔹 Custom SQL/semantics checks
🔹 Start on highest-risk traces
📕 Read the cookbook: mlflow.org/cookbook/genie-ev… #MLflow #Genie

Jun 1

Thousands of traces, no systematic way to spot bad agent runs. MLflow Automatic Issue Detection 👉 choose CLEARS categories, run analysis in three clicks, triage issues in the UI. 🔗 Learn more: mlflow.org/blog/issue-detect… #MLflow #LLMOps #GenAI

May 28

Trace + eval Genie in MLflow 👇
🔹 Full Genie pipeline
🔹 MLflow traces + judges
🔹 Tighten one pilot space first
📕 Read the cookbook: mlflow.org/cookbook/databric… #MLflow #Genie

366

MLflow

MLflow

@MLflow

May 27

Vibe-checking works until it doesn't. Change one prompt, break three behaviors, and you can't tell if you moved forward or backward. Eval-driven development in MLflow 👇
1️⃣ Trace: mlflow.openai.autolog() + @mlflow.trace spans (latency, tokens, cost)
2️⃣ Evaluate prompts: mlflow.genai.evaluate(), make_judge(), Prompt Registry, optimize_prompts (GEPA)
3️⃣ Prod: same judges on live traces; agent dashboards for cost/latency/quality
🔗 Learn more: mlflow.org/blog/structured-a… #MLflow #LLMOps #GenAI
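The "forward or backward" question above comes down to comparing two evaluation runs case by case. Here is a minimal plain-Python sketch of that comparison. It does not call MLflow: in practice the pass/fail values would come from judges run through mlflow.genai.evaluate(), and the names `CaseResult` and `diff_runs` are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseResult:
    """One judged eval case (hypothetical structure, not an MLflow type)."""
    case_id: str
    passed: bool


def diff_runs(baseline, candidate):
    """Compare two eval runs case by case and report what changed."""
    base = {r.case_id: r.passed for r in baseline}
    cand = {r.case_id: r.passed for r in candidate}
    # Only compare cases present in both runs.
    shared = sorted(base.keys() & cand.keys())
    regressions = [c for c in shared if base[c] and not cand[c]]
    fixes = [c for c in shared if not base[c] and cand[c]]
    return {"regressions": regressions, "fixes": fixes}


# Illustrative data: the new prompt fixes one behavior and breaks another.
baseline = [
    CaseResult("greeting", True),
    CaseResult("refund", True),
    CaseResult("escalation", False),
]
candidate = [
    CaseResult("greeting", True),
    CaseResult("refund", False),
    CaseResult("escalation", True),
]
report = diff_runs(baseline, candidate)
print(report)
```

A prompt change that shows any entry under "regressions" can be blocked from promotion even if the overall pass rate is unchanged, which is the case a vibe check tends to miss.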

May 26

Right answer, wrong trace? MLflow TruLens Agent GPA scorers read the full span tree 👇
🔹 10 TruLens scorers: 6 Agent GPA + 4 RAG
🔹 95% agent errors on TRAIL vs 55%
🔹 mlflow.genai.evaluate() w/ RAG Phoenix
🔗 Read more: mlflow.org/blog/mlflow-trule… #MLflow #TruLens #GenAI

May 26

Red-team LLM apps in MLflow 👇
🔹 Adversarial eval inputs
🔹 Safety scorers + guidelines
🔹 Rerun after model/prompt changes
📕 Read the cookbook: mlflow.org/cookbook/red-team… #MLflow #GenAI

May 22

Claude Code can burn through dozens or hundreds of LLM calls in one session. MLflow 3.12.0: route it through AI Gateway with two env vars for traces, budget alerts/limits, and guardrails. No SDK changes.
🛣️ Setup: mlflow server → Gateway endpoint → ANTHROPIC_BASE_URL to the claude-code proxy. Run claude as usual.
Learn more 👉 mlflow.org/blog/gateway-clau… #MLflow #AIGateway #ClaudeCode
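A rough sketch of the last setup step, assuming a gateway is already running. The post names only ANTHROPIC_BASE_URL, so the second environment variable is left out rather than guessed. The URL below is a placeholder, not the real proxy path, and `claude_env` is a hypothetical helper.

```python
import os


def claude_env(gateway_base_url, base_env=None):
    """Return a copy of the environment with Claude Code pointed at a gateway.

    Copies the environment instead of mutating os.environ, so the change
    applies only to the child process that receives this mapping.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["ANTHROPIC_BASE_URL"] = gateway_base_url
    return env


# Placeholder URL (assumption): substitute the claude-code proxy URL
# exposed by your own MLflow server.
GATEWAY_URL = "http://localhost:5000/placeholder-claude-code-proxy"

env = claude_env(GATEWAY_URL, base_env={"PATH": "/usr/bin"})
print(env["ANTHROPIC_BASE_URL"])

# Where the claude CLI is installed, it could then be launched with:
#   import subprocess
#   subprocess.run(["claude"], env=claude_env(GATEWAY_URL))
```

Passing the mapping to a single subprocess keeps other tools on the machine talking to their usual endpoints.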

May 22

RAG eval end-to-end in MLflow 👇
🔹 Trace retrieve + generate
🔹 Built-in retrieval/gen judges
🔹 Localize failure to a stage
📕 Read the cookbook: mlflow.org/cookbook/rag-eval… #MLflow #RAG

May 21

Catch this session at Data + AI Summit (June 15-18, SF)! 🌟
Agent quality via vibe-checking breaks at scale.
🔁 MLflow self-evolving test harness
🧪 Bad-answer feedback → automated tests
✅ Coding-agent fixes vs. accumulated suite
🎤 Adam Gurary & Yuki Watanabe
Session details: databricks.com/dataaisummit/… #MLflow #DataAISummit

May 21

Prompt lifecycle in MLflow 👇
🔹 Registry-backed versions
🔹 Eval-gated promotion
🔹 Rollbacks without guesswork
📕 Read the cookbook: mlflow.org/cookbook/prompt-e… #MLflow #GenAI

May 20

.@OpenHandsDev agents edit files, run commands, and browse the web on their own—but there’s no structured record of what happened or whether the result was good. MLflow connects via @opentelemetry to trace every step, evaluate runs with built-in judges, and route model traffic through AI Gateway for budget and usage control. Learn more 👉 mlflow.org/blog/mlflow-openh… #MLflow #OpenHands

Harness Your OpenHands Agent with AI Observability and Governance | MLflow

AI coding agents edit files, run commands, and browse the web autonomously, but what are they actually doing? Learn how to trace every step, evaluate output quality, and control LLM spending for...

mlflow.org

MLflow retweeted

May 19

New on the MLflow channel: evaluate a RAG agent end-to-end with Joana Mesquita, MLflow Ambassador 👇
📌 Prompt Registry + production aliases
🔍 Traces with SME ground truth
⚖️ Ragas, Phoenix + custom LLM judge
Watch now: youtu.be/4wqkHroNGFQ Blog: medium.com/@joana.c.mesquita… #MLflow #RAG
