No evidence of Semitic-specific cross-lingual transfer in large language models
Fine-tuning on Arabic and inference-time reasoning both improve zero-shot reading comprehension across languages, but the gains do not track linguistic relatedness.
- Seven LLMs (4B–671B parameters) fine-tuned on Arabic showed no Semitic-specific transfer in zero-shot reading comprehension across languages.
- Models with weak baselines improved dramatically across all languages, while strong baselines showed only marginal gains regardless of language family.
- Chain-of-thought reasoning provided similar benefits to fine-tuning, suggesting gains stem from task-format alignment rather than cross-lingual knowledge transfer.
Researchers fine-tuned seven large language models ranging from 4B to 671B parameters on Arabic and evaluated zero-shot reading comprehension across Semitic and non-Semitic languages. The study included both dense and Mixture-of-Experts architectures.
Across architectures, the authors report no evidence of Semitic-specific transfer. Models with weak baselines improved dramatically across all languages after fine-tuning, while models with strong baselines showed only marginal gains regardless of language family.
A chain-of-thought ablation reinforced these findings: the models that benefited most from fine-tuning benefited comparably from inference-time reasoning. The authors interpret this as evidence that both mechanisms address task-format alignment rather than cross-lingual knowledge transfer.
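One way to make the "no Semitic-specific transfer" comparison concrete is to compare average accuracy gains for Semitic and non-Semitic evaluation languages. The sketch below is illustrative only and is not the authors' analysis code: the function name `family_gain_gap`, the language codes, and every accuracy value are hypothetical placeholders, not figures from the study.

```python
from statistics import mean

# Hypothetical set of Semitic evaluation languages (Hebrew, Amharic, Maltese).
# The study's actual evaluation languages are not listed in this summary.
SEMITIC = {"he", "am", "mt"}


def family_gain_gap(baseline, finetuned, semitic=SEMITIC):
    """Return mean Semitic gain minus mean non-Semitic gain.

    baseline and finetuned map language codes to accuracy before and
    after fine-tuning. A value near zero means the improvement does not
    favor the Semitic family.
    """
    gains = {lang: finetuned[lang] - baseline[lang] for lang in baseline}
    semitic_gains = [g for lang, g in gains.items() if lang in semitic]
    other_gains = [g for lang, g in gains.items() if lang not in semitic]
    return mean(semitic_gains) - mean(other_gains)


# Placeholder accuracies for a weak-baseline model (NOT data from the study).
baseline = {"he": 0.40, "am": 0.30, "fr": 0.45, "sw": 0.35}
finetuned = {"he": 0.70, "am": 0.60, "fr": 0.75, "sw": 0.65}

print(family_gain_gap(baseline, finetuned))
```

Under this framing, Semitic-specific transfer would show up as a clearly positive gap. A large gain shared by every language with a gap near zero matches the pattern the authors attribute to task-format alignment.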