Researchers propose TreeTracer, a visual analytics tool to detect hidden biases in large language models
TreeTracer aggregates hundreds of stochastic LLM outputs into syntax-aligned trees and uses contrastive inference to reveal representational harms that single-output audits miss.
- TreeTracer is a visual analytics tool designed to evaluate bias in large language models (LLMs) by aggregating hundreds of stochastic generations into syntax-aligned hierarchical structures.
- The tool uses a perturbation analysis pipeline to replace ontology-defined terms in input prompts and visualizes results via custom Sankey diagrams for comparative analysis.
- Case studies comparing GPT-2 XL and Apertus models exposed hidden representational harms such as counterfactual pronoun suppression and conversational marginalization.
- A preliminary user study found the aggregated comparative interface reduces cognitive load and improves analysts' ability to detect systemic biases.
Researchers from an unnamed institution introduce TreeTracer, a visual analytics tool that evaluates bias in large language models (LLMs) by aggregating hundreds of stochastic text generations into syntax-aligned hierarchical structures. The approach addresses a key limitation of standard auditing methods, which often inspect single outputs or rely on static automated metrics. Either way, the auditor never sees the full probability distribution over outputs, which is where hidden biases may reside.
TreeTracer employs a systematic perturbation analysis pipeline that replaces ontology-defined terms in input prompts, aggregates the resulting stochastic generations, and performs classification-aware node merging using an auxiliary language model. The aggregated structure is visualized through a custom Sankey diagram, enabling direct comparison between semantic contexts by juxtaposing two ontology-driven trees.
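The perturb-and-aggregate step can be illustrated with a minimal Python sketch. It is not TreeTracer's implementation: the function names, the `{PERSON}` slot, and the whitespace tokenization are assumptions made for illustration, and a plain prefix tree stands in for the syntax-aligned, classification-aware merging the article describes. The resulting weighted edges are the kind of source, target, value triples a Sankey diagram consumes.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Prefix = Tuple[str, ...]


def perturb(template: str, slot: str, terms: Iterable[str]) -> Dict[str, str]:
    """Build one prompt per ontology term by filling a placeholder slot (hypothetical format)."""
    return {term: template.replace(slot, term) for term in terms}


def aggregate(generations: Iterable[str], depth: int = 3) -> Counter:
    """Merge sampled generations into a prefix tree, stored as weighted edges.

    Each edge links a token prefix to the same prefix extended by one token;
    its weight is the number of generations that pass through it.
    """
    edges: Counter = Counter()
    for text in generations:
        tokens = text.split()[:depth]  # assumption: whitespace tokens, not syntax alignment
        for i in range(len(tokens)):
            parent: Prefix = tuple(tokens[:i])
            child: Prefix = tuple(tokens[: i + 1])
            edges[(parent, child)] += 1
    return edges


def sankey_links(edges: Counter) -> List[Tuple[str, str, int]]:
    """Flatten the tree edges into (source, target, value) rows for a Sankey plot."""
    return sorted(
        (" ".join(parent) or "<root>", " ".join(child), count)
        for (parent, child), count in edges.items()
    )
```

In use, each perturbed prompt would be sent to the model many times, and the generations for each ontology term aggregated separately, so that two sets of links can be drawn side by side for comparison.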
To mitigate the risk of misinterpreting visualized subsets of model behavior as definitive evidence of bias, the system applies contrastive inference to compute and display counterfactual token probabilities across contexts. This provides analysts with additional evidence to evaluate potential representational harms.
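The article does not say how TreeTracer computes these probabilities. One plausible reading, sketched below under that assumption, is to take the next-token distribution for the same continuation point under two contexts and score each token by its log-probability ratio. The function name `contrast`, the smoothing constant, and the toy distributions are all hypothetical; in practice the distributions would come from the audited model.

```python
import math
from typing import Dict


def contrast(p_a: Dict[str, float], p_b: Dict[str, float], eps: float = 1e-9) -> Dict[str, float]:
    """Score each token by log(p_a / p_b) across two contexts.

    Positive scores mark tokens favored in context A, negative scores mark
    tokens favored in context B. `eps` avoids division by zero for tokens
    missing from one distribution.
    """
    tokens = set(p_a) | set(p_b)
    return {
        token: math.log((p_a.get(token, 0.0) + eps) / (p_b.get(token, 0.0) + eps))
        for token in tokens
    }


# Toy next-token distributions for one prompt under two ontology terms (made-up numbers).
context_a = {"she": 0.6, "he": 0.3, "they": 0.1}
context_b = {"she": 0.1, "he": 0.8, "they": 0.1}
scores = contrast(context_a, context_b)
```

A strongly negative score for a pronoun in one context would be the kind of signal an analyst might read as suppression, though a log ratio alone does not establish harm.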
The researchers validate TreeTracer through case studies comparing an unaligned baseline model, GPT-2 XL, against constitutionally aligned Apertus models. The visual aggregation successfully exposed hidden representational harms, including counterfactual pronoun suppression and conversational marginalization of individuals.
A preliminary user study with analysts found that, compared with traditional auditing methods, the aggregated comparative interface reduces cognitive load and supports the detection of systemic biases.