AllenAI releases MolmoMotion, a language-guided 3D motion forecasting model with new dataset and benchmark
MolmoMotion predicts future 3D trajectories of object points from video frames and language instructions, outperforming prior motion forecasting methods. AllenAI also releases the MolmoMotion-1M dataset and the PointMotionBench benchmark.
The Allen Institute for AI (AllenAI) has released MolmoMotion, a model that forecasts future 3D trajectories of object points given a video frame, a sparse set of 3D points on an object, and a language instruction describing the intended action. The model outperforms existing motion forecasting methods on downstream tasks such as robotics planning and controllable video generation.
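To make the input and output of such a forecaster concrete, here is a minimal Python sketch of the data involved. The names `ForecastQuery`, `Forecast`, and `static_forecast` are hypothetical and are not part of any released MolmoMotion API; the predictor is a trivial placeholder that only illustrates the shape of the result.

```python
from dataclasses import dataclass

# A 3D point as (x, y, z) coordinates.
Point3D = tuple[float, float, float]


@dataclass
class ForecastQuery:
    """Hypothetical container for the three inputs described in the article."""
    frame: bytes            # an encoded video frame (placeholder type)
    points: list[Point3D]   # sparse 3D points on the object
    instruction: str        # language description of the intended action


@dataclass
class Forecast:
    # trajectories[i][t] is the predicted position of point i at future step t
    trajectories: list[list[Point3D]]


def static_forecast(query: ForecastQuery, horizon: int) -> Forecast:
    """Placeholder predictor (an assumption, not the model): every point stays put."""
    return Forecast([[p] * horizon for p in query.points])


query = ForecastQuery(
    frame=b"",
    points=[(0.0, 0.0, 1.0), (0.1, 0.0, 1.0)],
    instruction="push the mug to the left",
)
forecast = static_forecast(query, horizon=4)
```

A real model would replace `static_forecast` with a learned predictor; the output would keep the same layout of one trajectory per input point.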
To support training and evaluation, AllenAI introduced MolmoMotion-1M, a dataset containing 1.16 million videos paired with 3D point trajectories and action descriptions, and PointMotionBench, a human-validated benchmark with 2.7K video clips designed to measure object-centric 3D motion forecasting accuracy.
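The article does not say which metric PointMotionBench uses. As an illustration only, the sketch below computes average displacement error, a metric commonly used in trajectory forecasting: the mean Euclidean distance between predicted and ground-truth positions over all points and timesteps. The function name and the nested-list layout are assumptions.

```python
import math


def average_displacement_error(pred, truth):
    """Mean Euclidean distance over all points and timesteps.

    pred[i][t] and truth[i][t] are (x, y, z) positions of point i at step t.
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must cover the same points")
    total, count = 0.0, 0
    for pred_traj, truth_traj in zip(pred, truth):
        if len(pred_traj) != len(truth_traj):
            raise ValueError("trajectories must have the same length")
        for p, t in zip(pred_traj, truth_traj):
            total += math.dist(p, t)
            count += 1
    if count == 0:
        raise ValueError("no positions to compare")
    return total / count
```

Lower values mean the predicted trajectories stay closer to the ground truth.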
MolmoMotion uses Molmo 2 as its backbone and represents motion as object-attached 3D points in world space, enabling class-agnostic, view-stable, and downstream-usable trajectory predictions. The model is available in two variants: an autoregressive model (MolmoMotion-AR) for well-defined future paths and a flow-matching model (MolmoMotion-FM) for representing uncertainty in ambiguous scenarios.
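For readers unfamiliar with flow matching: models of this kind generally produce a sample by starting from random noise and integrating a learned velocity field from t = 0 to t = 1. The sketch below shows that sampling loop with Euler steps. It is a toy under stated assumptions, not the MolmoMotion-FM implementation: `toy_velocity` is a hard-coded stand-in for a trained network, and `target` stands in for the data the network would have learned to reach.

```python
import random


def toy_velocity(x, t, target):
    """Stand-in for a learned network: the straight-line velocity toward target."""
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]


def sample_by_euler(velocity, target, steps=10, seed=0):
    """Integrate the velocity field from noise (t = 0) toward data (t = 1)."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # initial noise sample
    for k in range(steps):
        t = k / steps  # always below 1, so toy_velocity never divides by zero
        v = velocity(x, t, target)
        x = [xi + vi / steps for xi, vi in zip(x, v)]
    return x


sample = sample_by_euler(toy_velocity, target=[0.2, 0.0, 1.0])
```

In a trained model the velocity field would be conditioned on the inputs rather than on a known target, so different noise draws can end at different plausible futures, which is how such a model can represent uncertainty.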
MolmoMotion-1M was constructed with an automated pipeline that extracts object-grounded 3D trajectories from unconstrained videos, filters out noisy tracks, and segments clips to focus on intervals of meaningful motion. PointMotionBench was designed to evaluate forecasting accuracy, with its clips validated by humans.
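The article gives no detail on how the pipeline filters tracks or picks motion intervals. One simple way to approach both steps, sketched here under assumed thresholds, is to look at per-frame displacement: a track with an implausibly large jump is treated as noisy, and runs of frames with enough movement are kept as motion intervals. The function names and the `max_step` and `min_step` values are hypothetical.

```python
import math


def step_lengths(track):
    """Distance moved between each pair of consecutive frames."""
    return [math.dist(a, b) for a, b in zip(track, track[1:])]


def is_noisy(track, max_step=0.5):
    """Flag a track whose point jumps farther than max_step in one frame."""
    return any(d > max_step for d in step_lengths(track))


def motion_intervals(track, min_step=0.01):
    """Return inclusive (start, end) frame ranges where the point keeps moving."""
    intervals, start = [], None
    steps = step_lengths(track)
    for i, d in enumerate(steps):
        if d >= min_step:
            if start is None:
                start = i  # motion begins at frame i
        elif start is not None:
            intervals.append((start, i))  # motion ended at frame i
            start = None
    if start is not None:
        intervals.append((start, len(steps)))  # motion runs to the last frame
    return intervals
```

For a track that is still, then moves for two frames, then stops, `motion_intervals` returns a single range covering only the moving portion.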