Models · Jun 17, 2026

AllenAI releases MolmoMotion, a language-guided 3D motion forecasting model with new dataset and benchmark

MolmoMotion predicts future 3D trajectories of object points from video frames and language instructions, outperforming prior methods. AllenAI also releases MolmoMotion-1M dataset and PointMotionBench benchmark.



TL;DR

MolmoMotion predicts future 3D trajectories of object points from video frames and language instructions, outperforming prior motion forecasting methods.

Allen Institute for AI (AllenAI) released MolmoMotion, a model that forecasts future 3D trajectories of object points given a video frame, a sparse set of 3D points on an object, and a language instruction describing the intended action. The model outperforms existing motion forecasting methods on downstream tasks such as robotics planning and controllable video generation.
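To make the task's inputs and outputs concrete, here is a minimal Python sketch of the interface described above. The names `ForecastQuery`, `frame_id`, and `constant_position_baseline` are hypothetical and are not part of any AllenAI API; the baseline simply keeps every point fixed, which is useful only as a reference for the shape of a forecast (time steps by points by 3 coordinates).

```python
from dataclasses import dataclass
from typing import List, Tuple

Point3D = Tuple[float, float, float]

@dataclass
class ForecastQuery:
    frame_id: str          # hypothetical stand-in for the input video frame
    points: List[Point3D]  # sparse 3D points attached to the object
    instruction: str       # language description of the intended action

def constant_position_baseline(query: ForecastQuery, horizon: int) -> List[List[Point3D]]:
    """Trivial baseline (not MolmoMotion): every point stays where it is."""
    return [list(query.points) for _ in range(horizon)]

query = ForecastQuery(
    frame_id="frame_000",
    points=[(0.1, 0.2, 1.0), (0.15, 0.2, 1.0)],
    instruction="push the mug to the left",
)
forecast = constant_position_baseline(query, horizon=4)
print(len(forecast), len(forecast[0]))  # time steps, points per step
```

A real forecaster would replace the baseline with a model call that conditions on the frame and the instruction; the output layout would stay the same under these assumptions.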

To support training and evaluation, AllenAI introduced MolmoMotion-1M, a dataset containing 1.16 million videos paired with 3D point trajectories and action descriptions, and PointMotionBench, a human-validated benchmark with 2.7K video clips designed to measure object-centric 3D motion forecasting accuracy.
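The article does not say which metric PointMotionBench uses. As an illustration only, the sketch below computes average displacement error, a common measure in trajectory forecasting: the mean Euclidean distance between predicted and ground-truth points over all time steps. The function name and the nested-list layout are assumptions made for this example.

```python
import math

def average_displacement_error(pred, target):
    """Mean Euclidean distance over every time step and every point.

    Both arguments are lists of time steps; each time step is a list of
    (x, y, z) tuples. This layout is an assumption for illustration.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must cover the same number of steps")
    total, count = 0.0, 0
    for pred_step, target_step in zip(pred, target):
        if len(pred_step) != len(target_step):
            raise ValueError("each step must contain the same number of points")
        for p, q in zip(pred_step, target_step):
            total += math.dist(p, q)
            count += 1
    return total / count

pred = [[(0.0, 0.0, 0.0)], [(0.0, 0.0, 0.0)]]
target = [[(0.0, 0.0, 0.0)], [(3.0, 4.0, 0.0)]]
print(average_displacement_error(pred, target))  # 2.5
```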

MolmoMotion uses Molmo 2 as its backbone and represents motion as object-attached 3D points in world space, enabling class-agnostic, view-stable, and downstream-usable trajectory predictions. The model is available in two variants: an autoregressive model (MolmoMotion-AR) for well-defined future paths and a flow-matching model (MolmoMotion-FM) for representing uncertainty in ambiguous scenarios.
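The difference between the two variants can be illustrated with a toy one-dimensional example. This is a sketch under strong assumptions, not MolmoMotion's implementation: `toy_velocity` is a hand-written stand-in for a learned velocity field, and the two outcomes (-1.0 and +1.0) stand for two equally plausible futures. The flow-matching sampler integrates an ODE from a noise sample to an outcome with Euler steps, so different noise draws can land on different futures, while the autoregressive rollout produces a single path step by step.

```python
import random

def toy_velocity(x: float, t: float) -> float:
    """Hand-written stand-in for a learned velocity field (hypothetical).
    Pulls x toward the nearer of two plausible outcomes, -1.0 or +1.0."""
    mode = 1.0 if x >= 0.0 else -1.0
    return (mode - x) / (1.0 - t)

def sample_flow(x0: float, steps: int = 10) -> float:
    """Euler-integrate dx/dt = v(x, t) from t=0 (noise) to t=1 (sample)."""
    x, dt = x0, 1.0 / steps
    for k in range(steps):
        x += dt * toy_velocity(x, k * dt)
    return x

def rollout_autoregressive(start: float, step_fn, horizon: int):
    """Deterministic rollout: each step is predicted from the previous one."""
    path = [start]
    for _ in range(horizon):
        path.append(step_fn(path[-1]))
    return path[1:]

rng = random.Random(0)
samples = [sample_flow(rng.gauss(0.0, 1.0)) for _ in range(8)]
print([round(s, 3) for s in samples])  # each value is approximately -1.0 or +1.0
print(rollout_autoregressive(0.0, lambda x: x + 0.25, horizon=4))
```

In this toy setup the sign of the initial noise decides which future is reached, which is one simple way to picture how a sampling-based model can represent ambiguity.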

MolmoMotion-1M was built with an automated pipeline that extracts object-grounded 3D trajectories from unconstrained videos, filters out noisy tracks, and segments clips so that they focus on intervals of meaningful motion. PointMotionBench is human-validated and is designed to evaluate forecasting accuracy.
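The article does not give the pipeline's actual criteria. The sketch below shows one plausible way such filtering and segmentation could work for a single tracked point: a track is flagged as noisy if it jumps implausibly far between frames, and motion intervals are the frame ranges where the point keeps moving. The function names and the thresholds `max_jump` and `min_step` are hypothetical.

```python
import math

def step_lengths(track):
    """Distance moved between each pair of consecutive frames."""
    return [math.dist(a, b) for a, b in zip(track, track[1:])]

def is_noisy(track, max_jump=0.5):
    """Flag a track whose point jumps more than max_jump between frames."""
    return any(d > max_jump for d in step_lengths(track))

def motion_intervals(track, min_step=0.01):
    """Return (start, end) frame index pairs where the point keeps moving."""
    intervals, start = [], None
    for i, d in enumerate(step_lengths(track)):
        if d >= min_step and start is None:
            start = i                      # motion begins at frame i
        elif d < min_step and start is not None:
            intervals.append((start, i))   # motion ended at frame i
            start = None
    if start is not None:
        intervals.append((start, len(track) - 1))
    return intervals

track = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.1, 0.0, 0.0),
         (0.2, 0.0, 0.0), (0.2, 0.0, 0.0)]
print(is_noisy(track))          # False
print(motion_intervals(track))  # [(1, 3)]
```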

Sources

01 Hugging Face: "MolmoMotion: Language-guided 3D motion forecasting"
