Google DeepMind announces AI co-clinician research initiative to augment physician care
The company published findings from a research project designed to support clinicians through evidence synthesis and patient interaction, claiming accuracy improvements over existing tools in blind evaluations.
- Google DeepMind announced a research initiative exploring AI agents that assist patients under clinical supervision, framed as "triadic care" involving AI, clinicians, and patients.
- The team adapted the NOHARM framework to evaluate the AI co-clinician for errors of commission and omission, comparing it against evidence synthesis tools on 98 primary care queries.
- In blind evaluations by physicians, the AI co-clinician recorded zero critical errors in 97 of 98 test cases and was preferred over leading evidence synthesis tools in head-to-head comparisons.
Google DeepMind announced a research initiative exploring AI agents designed to function as collaborative members of clinical care teams. The company frames the concept as "triadic care," in which AI systems interact with patients while remaining under the judgment and authority of their supervising physician. The approach positions AI as expanding clinician capacity rather than replacing human expertise.
The research builds on prior work in medical AI, including Med-PaLM (an examination-style medical knowledge system) and AMIE (a text-based consultation simulation that matched physician performance in real-world feasibility trials). The new co-clinician project extends this line of work by testing both clinician-facing and patient-facing interactions.
To evaluate the system's trustworthiness for clinical use, the team adapted the NOHARM framework—a method for assessing both errors of commission (incorrect information) and errors of omission (failure to surface critical information). The evaluation included 98 realistic primary care queries that were curated from diverse sources and refined by a panel of attending physicians, with scenario-specific answer metrics developed through expert consensus.
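As a rough illustration of this scoring approach, the sketch below shows how a NOHARM-style tally of commission and omission errors could be rolled up into a zero-critical-error rate across the test set. The data structures and names here (GradedResponse, zero_critical_error_rate) are hypothetical, not taken from the published evaluation.

```python
from dataclasses import dataclass, field

# Hypothetical NOHARM-style record for one graded response: errors of
# commission (incorrect information) and omission (missed critical
# information), each tagged with a criticality flag.
@dataclass
class GradedResponse:
    query_id: str
    commission_errors: list[tuple[str, bool]] = field(default_factory=list)
    omission_errors: list[tuple[str, bool]] = field(default_factory=list)

    def has_critical_error(self) -> bool:
        # A case fails if any commission or omission error is flagged critical.
        return any(critical for _, critical in
                   self.commission_errors + self.omission_errors)

def zero_critical_error_rate(responses: list[GradedResponse]) -> float:
    """Fraction of test cases with no critical errors of either type."""
    clean = sum(1 for r in responses if not r.has_critical_error())
    return clean / len(responses)

# Example: 98 cases, one containing a critical omission -> 97/98.
cases = [GradedResponse(f"q{i:02d}") for i in range(97)]
cases.append(GradedResponse("q97",
                            omission_errors=[("missed red-flag symptom", True)]))
print(f"{zero_critical_error_rate(cases):.3f}")  # 0.990
```

Under a tally of this kind, a single critical error in any one case lowers the rate from 98/98 to 97/98, matching the figure reported below.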
In head-to-head blind evaluations, physicians preferred the AI co-clinician's responses to those from leading evidence synthesis tools already in use. In the objective analysis, the system recorded zero critical errors in 97 of the 98 test cases, outperforming two comparison AI systems commonly used by physicians. The source did not identify the specific comparison tools or provide full methodological documentation.