Microsoft Research proposes generative causal testing to explain language-related brain activity
New framework distills black-box brain-prediction models into testable hypotheses and validates them with fMRI experiments.
- Researchers introduce generative causal testing (GCT), a method to translate black-box AI models of brain activity into readable explanations.
Microsoft Research, in collaboration with the University of California, Berkeley, the University of California, San Francisco, and Columbia University, introduced generative causal testing (GCT), a framework designed to address the explainability crisis in language neuroscience.
GCT translates black-box predictive models of brain activity, which are trained to match human fMRI responses to language, into concise, readable explanations of what specific cortical regions respond to, such as “food preparation” or “location names.”
The method operates in two steps: explanation and verification. First, an LLM identifies phrases that strongly drive a brain region’s predicted response and summarizes them into a candidate explanation. Then, the LLM writes new synthetic stories engineered to activate the targeted brain area, and these stories are presented to subjects in an fMRI scanner.
If the targeted region’s activity significantly exceeds baseline in response to the synthetic stories, the explanation is supported causally, not just correlationally. Across three subjects, GCT reliably drove target regions above baseline, which validates the approach in regions where the underlying brain-prediction models were strongest.
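The two steps described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the researchers' implementation. The function names, the toy scores, and the toy response values are hypothetical. The source does not say which statistical test was used, so a one-sided permutation test stands in for the "significantly exceeds baseline" check.

```python
import random
from statistics import mean

def top_driving_phrases(phrases, predict, k=3):
    """Step 1 (explanation): rank candidate phrases by the response that a
    brain-prediction model assigns to one region, and keep the top k.
    `predict` is a hypothetical stand-in for that model."""
    return sorted(phrases, key=predict, reverse=True)[:k]

def permutation_p_value(story_resp, baseline_resp, n_perm=10_000, seed=0):
    """Step 2 (verification): one-sided permutation test of whether the mean
    response to synthetic stories exceeds the mean baseline response.
    The choice of test is an assumption made for illustration."""
    rng = random.Random(seed)
    observed = mean(story_resp) - mean(baseline_resp)
    pooled = list(story_resp) + list(baseline_resp)
    n = len(story_resp)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign the condition labels
        if mean(pooled[:n]) - mean(pooled[n:]) >= observed:
            hits += 1
    # The +1 correction keeps the estimated p-value strictly above zero.
    return (hits + 1) / (n_perm + 1)

# Made-up predicted responses for one region (not real data).
toy_scores = {
    "chopping onions": 0.92,
    "simmering the sauce": 0.81,
    "walking to the station": 0.12,
    "half past nine": 0.05,
}
print(top_driving_phrases(list(toy_scores), toy_scores.get, k=2))

# Made-up measured responses for the same region (not real data).
story = [1.2, 0.9, 1.1, 1.4, 1.0, 1.3]
baseline = [0.1, -0.2, 0.0, 0.2, -0.1, 0.1]
print(permutation_p_value(story, baseline))
```

In the actual framework an LLM handles both the summarizing and the story writing. The sketch covers only the ranking and the statistical comparison that bracket those steps.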
The researchers applied GCT to regions with known selectivity, recovering established findings such as strong responses to location-related content in the place-processing areas: retrosplenial cortex (RSC), the occipital place area (OPA), and the parahippocampal place area (PPA). They also identified responses in less-studied regions, such as activation in ventral occipital cortex near the fusiform face area (FFA) in response to “food preparation.”
GCT further distinguished neighboring place-processing regions (RSC, PPA, OPA) that were previously treated as functionally similar. By generating differential stimuli, GCT teased apart their specific sensitivities, such as RSC’s stronger response to proper noun location names.
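One way to picture the selection of differential stimuli is to rank candidate phrases by the gap between two regions' predicted responses. The sketch below assumes that reading. The function name, the region labels, and every score are hypothetical, and the source does not describe how the researchers actually built their differential stimuli.

```python
def most_differential(phrases, predict_a, predict_b, k=3):
    """Rank phrases by predicted response in region A minus region B,
    so the top entries favor A over B. Both predictors are hypothetical
    stand-ins for per-region brain-prediction models."""
    return sorted(
        phrases,
        key=lambda p: predict_a(p) - predict_b(p),
        reverse=True,
    )[:k]

# Made-up predicted responses for two neighboring regions (not real data).
region_a = {"Lake Tahoe": 0.9, "a narrow hallway": 0.5,
            "downtown Chicago": 0.8, "the kitchen": 0.4}
region_b = {"Lake Tahoe": 0.3, "a narrow hallway": 0.6,
            "downtown Chicago": 0.4, "the kitchen": 0.5}

print(most_differential(list(region_a), region_a.get, region_b.get, k=2))
```

Phrases selected this way could then seed synthetic stories intended to drive one region more than its neighbor.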
The framework also revealed tiny prefrontal “micro-regions” tuned to specific concepts like dialogue, clock times, and measurements, suggesting new directions for neuroscience research.
- Jun 27, 2026 · Microsoft Research