When people share a space, their movements become intertwined. Embodied agents need to understand these social dynamics to interact effectively.
Introducing MAGNet 🧲, a unified autoregressive diffusion forcing model for multi-agent motion generation that captures these interactions.
MAGNet is flexible: it can predict future motion, fill in missing motion, or have people react to each other, all while scaling naturally to more than two people (N>2) and generating ultra-long motion sequences.
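
One way to picture how a single diffusion forcing model could cover all three tasks: give every (person, frame) token its own noise level, keep observed tokens clean, and mark the tokens to be generated as noisy. The sketch below is only an illustration of that idea, not MAGNet's actual code or API. The function `noise_mask`, the task names, the `split` parameter, and the choice of person 0 as the observed partner in the "react" case are all assumptions made for this example.

```python
from typing import List

# Hypothetical noise levels: 0.0 = observed (kept clean), 1.0 = to be generated.
CLEAN, NOISY = 0.0, 1.0


def noise_mask(task: str, num_agents: int, num_frames: int, split: int) -> List[List[float]]:
    """Build a [num_agents][num_frames] grid of per-token noise levels.

    task="predict": the first `split` frames of everyone are observed.
    task="infill":  the first and last `split` frames are observed, the middle is missing.
    task="react":   person 0 is fully observed, everyone else is generated
                    (one assumed reading of "reacting"; `split` is ignored).
    """
    if task not in ("predict", "infill", "react"):
        raise ValueError(f"unknown task: {task}")

    mask = [[NOISY] * num_frames for _ in range(num_agents)]
    for agent in range(num_agents):
        for frame in range(num_frames):
            if task == "predict":
                observed = frame < split
            elif task == "infill":
                observed = frame < split or frame >= num_frames - split
            else:  # "react"
                observed = agent == 0
            if observed:
                mask[agent][frame] = CLEAN
    return mask


if __name__ == "__main__":
    # Three people, six frames, two observed frames of history.
    for row in noise_mask("predict", num_agents=3, num_frames=6, split=2):
        print(row)
```

In this picture, changing the task only changes which tokens are marked clean, and adding more people only adds rows to the grid.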