Researchers propose Constructive Alignment to reframe AI alignment as governance of evolving human preferences
New paradigm argues alignment should focus on regulating how AI systems shape long-term value formation rather than optimizing static preferences.
- A new paper introduces Constructive Alignment, a paradigm that reframes AI alignment as a control problem over evolving human preference trajectories.
- The framework models preferences as layered state variables that change through interaction with AI systems.
- Authors argue alignment should ensure value trajectories remain coherent, reflectively endorsed, and resistant to manipulation.
- Paper published as part of the AAAI-26 Workshop on Machine Ethics.
Researchers Max Kanwal and Caryn Tran propose Constructive Alignment, a paradigm that reframes AI alignment as a control problem over evolving human preference trajectories rather than the optimization of static preferences.
The paper challenges the conventional approach to alignment, which treats human preferences as fixed targets to be inferred and optimized. The authors cite empirical evidence that preferences are layered, dynamic, and constructed through interaction, particularly with adaptive technologies such as AI systems.
The framework models preferences as layered state variables that evolve under interaction with AI systems. It formalizes this view using a control-theoretic approach in which system actions and interaction design jointly influence both world states and human evaluative states.
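To make the control-theoretic framing concrete, the following is a minimal toy sketch, not the paper's formalism. It assumes a single scalar world state and a single scalar evaluative state (the paper describes layered state variables, which are not modeled here), and the names `State`, `step`, `rollout`, `max_drift`, and the `influence` parameter are hypothetical, introduced only for illustration. Each step lets the system's action change the world while the interaction design (`influence`) pulls the human preference toward that action.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    world: float       # hypothetical scalar world state
    preference: float  # hypothetical scalar evaluative (preference) state


def step(state: State, action: float, influence: float) -> State:
    """One step of an assumed toy dynamics (not taken from the paper).

    The action shifts the world state directly, and the interaction design,
    summarized by `influence` in [0, 1], pulls the preference toward the action.
    """
    new_world = state.world + action
    new_preference = state.preference + influence * (action - state.preference)
    return State(new_world, new_preference)


def rollout(state: State, actions: list[float], influence: float) -> list[State]:
    """Apply a sequence of actions and return the full state trajectory."""
    trajectory = [state]
    for action in actions:
        state = step(state, action, influence)
        trajectory.append(state)
    return trajectory


def max_drift(trajectory: list[State]) -> float:
    """Largest single-step change in preference along a trajectory.

    An illustrative stand-in for a bound on how fast a system may shift
    preferences; the paper's actual criteria are not reproduced here.
    """
    return max(
        abs(after.preference - before.preference)
        for before, after in zip(trajectory, trajectory[1:])
    )


if __name__ == "__main__":
    traj = rollout(State(world=0.0, preference=0.0), [1.0, 1.0, 1.0], influence=0.5)
    for s in traj:
        print(s)
    print("max per-step preference drift:", max_drift(traj))
```

In this sketch the same action sequence moves both the world and the preference, so a quantity like `max_drift` can be monitored or constrained alongside task outcomes, which is the kind of joint regulation the framing points to.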
The authors argue that alignment should focus on regulating how AI systems influence the evolution of human preferences, ensuring that value trajectories remain coherent, reflectively endorsed, epistemically grounded, bounded against manipulation, and empowering under uncertainty.
The paper positions alignment as a problem of governing long-term value formation rather than simply satisfying static preferences, emphasizing the role of AI systems in shaping what people attend to, value, and endorse over time.
The work draws on behavioral economics, psychology, and constructivist social theory to ground its theoretical contributions. It was published as part of the AAAI-26 Workshop on Machine Ethics and consists of 23 pages with one figure.