Looking for the cutting edge of AI research? Follow Salesforce AI Research to see how we're transforming enterprise technology through advanced innovations. From world models to agentic systems, discover the future of AI before it hits the market.
Model cards are nutrition labels for AI. Now they include environmental impact. 🌱
@Salesforce is adding standardized energy and carbon metrics to its AI model cards: sforce.co/4umu8qm
Salesforce AI Research worked with the Impact team to embed these estimates into the standard model evaluation workflow, so a model's footprint is measured alongside its performance. They cover energy use and emissions across pre-training, post-training, and inference, using the AI Energy Score methodology.
The Environmental Impact section is live now in the model cards for First Name Match, Account Match, and TextEval. Browse them on the Salesforce Trust site: sforce.co/4eaIwMu #ResponsibleAI #Sustainability #FutureOfAI
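As a rough illustration of how per-phase energy figures can be turned into emissions estimates, here is a minimal sketch. It is not the AI Energy Score methodology and not Salesforce's workflow. The `PhaseEnergy` type, the function names, the energy values, and the grid carbon intensity are all hypothetical placeholders. The only fact it relies on is the standard conversion: emissions equal energy multiplied by the carbon intensity of the electricity used.

```python
from dataclasses import dataclass


@dataclass
class PhaseEnergy:
    """Energy attributed to one lifecycle phase (hypothetical structure)."""
    name: str
    energy_kwh: float


def emissions_kg_co2e(energy_kwh: float, grid_intensity_g_per_kwh: float) -> float:
    """Convert energy to emissions: kWh * gCO2e/kWh, reported in kg CO2e."""
    return energy_kwh * grid_intensity_g_per_kwh / 1000.0


def footprint(phases, grid_intensity_g_per_kwh):
    """Return per-phase emissions and their total, both in kg CO2e."""
    per_phase = {
        p.name: emissions_kg_co2e(p.energy_kwh, grid_intensity_g_per_kwh)
        for p in phases
    }
    return per_phase, sum(per_phase.values())


# Placeholder numbers for illustration only, not measurements of any model.
example_phases = [
    PhaseEnergy("pre-training", 100.0),
    PhaseEnergy("post-training", 20.0),
    PhaseEnergy("inference", 5.0),
]
per_phase, total = footprint(example_phases, grid_intensity_g_per_kwh=400.0)
print(per_phase, total)
```

Reporting the phases separately, as the model cards do, keeps a one-time training cost from being confused with the recurring cost of inference.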
Excited to be at #CVPR2026 this week and to present my internship work with @SFResearch: Active Video Perception: Iterative Evidence Seeking for Agentic Long Video Understanding
In this work, we study how multimodal agents can actively reason over long videos by iteratively seeking the most relevant evidence, rather than passively processing all video content at once.
If you're attending CVPR, feel free to stop by our poster!
Poster: #245, Findings Posters, ExHall A
Date: Sunday, June 7
Time: 7:30 – 9:00 AM
Project page: activevideoperception.github…
Looking forward to connecting at CVPR!
#CVPR2026 #ComputerVision #MultimodalAI #VideoUnderstanding #AIAgents
(1/8) Can Language Models Remember What They Learn? LLMs learn from feedback. But most post-training is amnesiac: rollout → reward → update → forget.
What if you keep the signal?
Procedural Memory Distillation (PMD): learning from experience, not just feedback. 🧵
(7/8) PMD doesn't give models a permanent notebook. It lets them use one while learning, absorb the useful lessons into their weights, and move on. Every training step contains signals about what works and what fails. Most methods throw them away.
Paper: sforce.co/4dXlVTE
The 6th Multimodal Algorithmic Reasoning Workshop at #CVPR2026 is Thursday (6/4) morning 🗓️ sforce.co/4ueOT7j
Bringing together researchers across academia and industry to explore advances in multimodal reasoning, foundation models, agentic reasoning, and the future of intelligent reasoning systems.
Keynote speakers:
🔹 Juan Carlos Niebles @jcniebles, Salesforce AI Research
🔹 Jiayuan Mao, U. of Pennsylvania
🔹 Melanie Mitchell, Santa Fe Institute
🔹 Jialong Wu, Tsinghua University
Room 601, Colorado Convention Center | 8:55 AM – 12:30 PM MDT
Thanks to Honglu Zhou @zhou_honglu (Research Scientist at @SFResearch) and sponsors @merl_news and @ElorianAI
Can Language Models Remember What They Learn? Introducing Procedural Memory Distillation (PMD): sforce.co/4dAjQOu
PMD turns model attempts into reusable training memory, conditions a self-teacher on it, and distills the guidance into the student's weights.
Accepted to #ICML2026: MFCL-Audio, a benchmark for voice agents that have to call tools.
Real speech is messy. Accents, background noise, mumbling, and "wait, what did I say?" moments all break tool calls. MFCL-Audio measures how badly, across 6.2K tasks.
Authors: Huanzhi Mao, Aditya Ghai, Imra Dawoodani, Tony Ginart, Shishir G. Patil, John Emmons, Joseph E. Gonzalez
#FutureOfAI #EnterpriseAI #VoiceAgents
📣 Counterparty Modeling is Not Strategy: The Limits of LLM Negotiators sforce.co/3RTTU7Q
New research finds LLM agents can model a negotiating partner's preferences accurately, but don't reliably turn that knowledge into strategic bargaining.
➡️ Asymmetric information backfires: giving sellers the buyer's preferences raised buyer utility while seller utility fell
➡️ Agents accurately read the room early on, but fail to convert this social understanding into reciprocal, multi-turn exchange
➡️ Final deals are driven by opening price anchors rather than latent utility structure
➡️ Forcing explicit give/ask trade plans doesn't close the gap, showing that a model's ability to reason about a variable doesn't mean it can execute it in interaction.
The problem isn't that models fail to read the room; they form accurate early beliefs about the opponent. The breakdown is downstream: they fail to convert social understanding into multi-turn strategic execution. Showing a capability in a reasoning trace does not mean the model can deploy it in sequential interaction.
Authors: Romain Cosentino @Rom_Cosentino, Sarath Shekkizhar @shekkizh, Adam Earle, Silvio Savarese @silviocinguetta #FutureOfAI #EnterpriseAI #AgenticAI
1/5 RLVR trains LLMs with pass/fail rewards, but every near-miss rollout is wasted.
What if models could actually *learn* from their mistakes?
New paper: "Learning from Language Feedback via Variational Policy Distillation"
Read: sforce.co/4uv2f0k 🧵
4/5 Tested on 3 model families (Qwen3-4B/8B, Llama-3.1-8B) across code generation (LiveCodeBench) and scientific reasoning (SciKnowEval):
✅ Consistent gains over GRPO and self-distillation baselines
✅ Stable training where prior methods collapse
✅ Best gains on domains with rich error signals (code, science)
5/5 Binary rewards leave information on the table. Teaching models to interpret why they failed, not just that they failed, unlocks a complementary learning signal.
Paper: sforce.co/4uv2f0k
Authors: Yang Li @YangL95, Erik Nijkamp @erik_nijkamp, Semih Yavuz @semih__yavuz, Shafiq Joty @JotyShafiq
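To make the "binary rewards leave information on the table" point concrete, here is a small sketch of group-relative advantage normalization in the style commonly used with GRPO. It is an illustration under stated assumptions, not the paper's method. The function name and the `eps` value are hypothetical, and real implementations differ in detail. The sketch shows that when every rollout in a group fails, all advantages are zero, so the group contributes no gradient signal regardless of how close any attempt came.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of rollouts for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Every rollout fails: identical rewards, so every advantage is zero.
all_fail = group_relative_advantages([0, 0, 0, 0])

# One rollout passes: it gets a positive advantage, the failures a negative one.
mixed = group_relative_advantages([1, 0, 0, 0])

print(all_fail)
print(mixed)
```

A near-miss and a completely wrong attempt both score 0 here, which is the gap that language feedback on why a rollout failed is meant to fill.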