Microsoft-led Talos system recovers 90% of rare disease diagnoses with 1.3 variants per patient for review
Open-source genomic reanalysis tool reduces manual review burden while maintaining high diagnostic yield across multiple cohorts.
- Talos, an open-source system for automated genomic reanalysis, recovered 90% of in-scope rare disease diagnoses while surfacing only 1.3 candidate variants per patient for expert review.
- In a prospective cohort of nearly 5,000 undiagnosed patients, Talos delivered 241 new diagnoses (5.1% additional yield) with an average 32-day lag between evidence publication and diagnosis.
- The tool was validated on two independent cohorts (1,089 probands total) and matched Exomiser’s sensitivity while returning far fewer candidates for review.
Microsoft Research and collaborators developed Talos, an open-source system for automated, iterative reanalysis of genomic data, to address a bottleneck in rare disease diagnosis: the time required for manual review. The tool re-examines stored sequencing data as scientific knowledge evolves, flagging variants with newly actionable evidence while maintaining a low false-positive rate.
Across a validation set of nearly 1,100 patients, Talos recovered 90% of in-scope diagnoses while flagging only 1.3 candidate variants per patient for expert review. This performance was consistent across two independent cohorts: the Australian Acute Care Genomics cohort of critically ill infants and children, and the U.S.-based Rare Genomes Project cohort of families with prior uninformative testing.
In the Australian cohort, Talos recovered 90% of in-scope diagnoses at a median of 1.3 candidate variants per family, while in the U.S. cohort it recovered 87% of in-scope diagnoses (47 of 54) at the same median rate of 1.3 candidates per trio. Most of the missed diagnoses stemmed from Talos’s conservative strategy. A typical example is a recessive variant lacking ClinVar support that human analysts had classified using additional evidence, such as trans configuration or functional studies.
Talos matched the sensitivity of Exomiser, a widely used prioritization tool, but operated at a distinct point in the trade-off space: Exomiser returns a broad ranked list, whereas Talos surfaces a short, highly curated set of candidates. This design choice reflects Talos’s focus on reducing human review time, which it achieves by returning only variants whose supporting evidence has changed since the previous cycle.
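The idea of returning only variants whose evidence has changed between cycles can be illustrated with a minimal sketch. This is not Talos's code: the `Evidence` record, its two fields, and the `newly_actionable` function are hypothetical names chosen for illustration, and "changed" is simplified here to mean any difference between two evidence snapshots.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Evidence:
    """Hypothetical evidence snapshot for one variant in one analysis cycle."""
    clinvar_significance: str  # e.g. "Pathogenic", "Uncertain significance"
    gene_on_panel: bool        # whether the gene has an accepted gene-disease link


def newly_actionable(previous: Dict[str, Evidence],
                     current: Dict[str, Evidence]) -> List[str]:
    """Return IDs of variants whose evidence differs from the previous cycle.

    Variants absent from `previous` count as changed, so a first cycle
    returns every variant present in `current`.
    """
    return sorted(
        variant_id
        for variant_id, evidence in current.items()
        if previous.get(variant_id) != evidence
    )


# Illustrative (made-up) variant IDs and classifications.
previous_cycle = {
    "chr1:12345:A>G": Evidence("Uncertain significance", True),
    "chr2:67890:C>T": Evidence("Pathogenic", True),
}
current_cycle = {
    "chr1:12345:A>G": Evidence("Pathogenic", True),  # reclassified since last cycle
    "chr2:67890:C>T": Evidence("Pathogenic", True),  # unchanged, so not re-surfaced
}

flagged = newly_actionable(previous_cycle, current_cycle)
print(flagged)
```

A real system would also need to decide which changes matter (an upgrade to pathogenic versus a downgrade, for instance); the sketch treats every change alike.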
Deployed prospectively across a cohort of nearly 5,000 undiagnosed patients, Talos delivered 241 new diagnoses, representing a 5.1% additional yield. Turnaround was rapid: on average, 32 days elapsed between supporting evidence becoming public and the resulting diagnosis. When Talos was run in monthly iterative cycles, analysts needed to review only one new variant per 200 patients, highlighting the scalability of frequent, systematic reanalysis.
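As a back-of-envelope check on these figures, the reported yield and review rate can be turned into an implied cohort size and a per-cycle workload. The calculation below uses only the numbers quoted above; the derived values are estimates, since the 5.1% yield is rounded and the exact cohort size is not stated here.

```python
new_diagnoses = 241
additional_yield = 0.051     # 5.1%, as reported (rounded)
patients_per_variant = 200   # one new variant to review per 200 patients per cycle

# Cohort size implied by the reported yield (an estimate, not a reported figure).
implied_cohort = new_diagnoses / additional_yield

# Approximate number of new variants analysts would review in each monthly cycle.
variants_per_cycle = implied_cohort / patients_per_variant

print(f"Implied cohort size: about {implied_cohort:.0f} patients")
print(f"New variants per monthly cycle: about {variants_per_cycle:.0f}")
```

Under these assumptions the cohort comes out at roughly 4,700 patients, consistent with "nearly 5,000", and the monthly review load at roughly two dozen variants for the whole cohort.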
Talos integrates two continuously updated public resources—PanelApp Australia for gene–disease relationships and modes of inheritance, and ClinVar for variant-level pathogenicity—to reinterpret a patient’s existing variant calls against the latest community knowledge. It applies a variant-prioritization algorithm designed to surface variants most likely to meet ACMG/AMP criteria for clinical reporting, and can interpret single-nucleotide variants, small insertions/deletions, copy number variants, and large structural variants from exome or genome data.
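How gene-level and variant-level resources might be combined can be sketched as a simple filter. This is a simplified illustration under stated assumptions, not Talos's prioritization algorithm: the `PANEL_MOI` and `CLINVAR` dictionaries are hypothetical stand-ins for PanelApp Australia and ClinVar lookups, the gene and variant names are made up, and the recessive rule ignores compound heterozygosity.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VariantCall:
    variant_id: str
    gene: str
    alt_allele_count: int  # 1 = heterozygous, 2 = homozygous


# Hypothetical stand-in for a gene panel: gene -> mode of inheritance.
PANEL_MOI = {"GENE_A": "dominant", "GENE_B": "recessive"}

# Hypothetical stand-in for variant-level classifications.
CLINVAR = {
    "var1": "Pathogenic",
    "var2": "Likely pathogenic",
    "var3": "Benign",
    "var4": "Pathogenic",
    "var5": "Likely pathogenic",
}

REPORTABLE = {"Pathogenic", "Likely pathogenic"}


def candidate_variants(calls: List[VariantCall]) -> List[str]:
    """Keep calls in panel genes with a reportable classification and a
    genotype compatible with the gene's mode of inheritance."""
    kept = []
    for call in calls:
        moi = PANEL_MOI.get(call.gene)
        if moi is None:
            continue  # gene has no accepted disease association on the panel
        if CLINVAR.get(call.variant_id) not in REPORTABLE:
            continue  # no supporting variant-level evidence
        if moi == "recessive" and call.alt_allele_count < 2:
            continue  # simplification: a single heterozygous call is dropped
        kept.append(call.variant_id)
    return kept


calls = [
    VariantCall("var1", "GENE_A", 1),  # dominant gene, pathogenic: kept
    VariantCall("var2", "GENE_B", 1),  # recessive gene, single het: dropped
    VariantCall("var3", "GENE_A", 1),  # benign: dropped
    VariantCall("var4", "GENE_C", 2),  # gene not on panel: dropped
    VariantCall("var5", "GENE_B", 2),  # recessive gene, homozygous: kept
]

shortlist = candidate_variants(calls)
print(shortlist)
```

Because both lookups are plain data, re-running the same filter after either resource is updated re-evaluates existing variant calls without re-sequencing, which is the property reanalysis depends on.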
- Jun 27, 2026 · Microsoft Research