Preprint / Version 1

Contrastive Cross-Lingual Calibration for Large Language Models

Authors

Marcello Conti, Sapienza University of Rome

DOI:

https://doi.org/10.64748/8gkysx36

Keywords:

calibration, multilingual NLP, cross-lingual transfer, uncertainty, hallucination, evaluation

Abstract

Large language models (LLMs) are increasingly deployed in multilingual settings, yet their probability estimates are often miscalibrated, particularly for low-resource languages and code-switched inputs. We present C³, a post-hoc calibration framework that reduces cross-lingual miscalibration by optimizing language-aware temperature and bias parameters using contrastive counterfactuals generated via translation/back-translation and meaning-preserving perturbations. C³ aligns confidence across languages without retraining the base model. On classification (XNLI) and extractive/generative QA (XQuAD, MLQA, TyDi QA GoldP), C³ lowers Expected Calibration Error by 35–57% and Brier score by 9–18%, with modest accuracy gains (0.7–2.1 pp). For generative QA, the hallucination rate decreases by 21% while answer quality is maintained. Benefits are largest for Swahili and Arabic and persist under code-switching and spelling noise. Ablations show that contrastive counterfactuals and language-specific scaling both contribute, and that isotonic fusion improves the tails of the confidence distribution. We release calibration recipes and evaluation scripts to support responsible multilingual deployment.
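
For readers who want to experiment before the released recipes, the sketch below illustrates the kind of post-hoc, language-aware temperature-and-bias scaling the abstract describes, in the style of standard temperature/vector scaling. It is a minimal sketch under stated assumptions, not the C³ implementation: the function names, the Nelder-Mead fit, and the per-language held-out logits/labels are illustrative, and the contrastive counterfactual generation and isotonic fusion stages are omitted.

    import numpy as np
    from scipy.optimize import minimize

    def softmax(z):
        # Numerically stable softmax over the last axis.
        z = z - z.max(axis=-1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    def fit_language_calibration(logits, labels):
        # Fit one temperature T plus a per-class bias vector b for a single
        # language by minimizing negative log-likelihood on held-out data.
        # Plain vector-style scaling; illustrative, not the full C3 objective.
        n_classes = logits.shape[1]

        def nll(params):
            T, b = params[0], params[1:]
            # Clamp T away from zero/negative values during the search.
            probs = softmax(logits / max(T, 1e-3) + b)
            return -np.log(probs[np.arange(len(labels)), labels] + 1e-12).mean()

        x0 = np.concatenate([[1.0], np.zeros(n_classes)])
        res = minimize(nll, x0, method="Nelder-Mead")
        return res.x[0], res.x[1:]  # (temperature, bias vector)

    # Hypothetical usage: fit one (T, b) pair per language on a dev split,
    # then apply it to that language's logits at inference time.
    # per_lang = {lang: fit_language_calibration(L, y) for lang, (L, y) in dev.items()}

Because each language gets its own (T, b) pair applied only at inference, confidences can be aligned across languages without touching the base model's weights, matching the post-hoc framing in the abstract.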

Author Biography

  • Marcello Conti, Sapienza University of Rome

Prof. Marcello Conti is a Full Professor of Computational Linguistics and Natural Language Processing. His research spans applications of machine learning to semantic modeling, discourse analysis, and human–AI interaction. With a background in both linguistics and computer science, he has been at the forefront of developing multilingual corpora for low-resource languages. Prof. Conti is the coordinator of several Horizon Europe initiatives on AI-driven language technologies. He is also a frequent reviewer for ACL, COLING, and Computational Linguistics.

Posted

2025-09-05

How to Cite

Conti, M. (2025). Contrastive Cross-Lingual Calibration for Large Language Models. Substack Scholarly Posts. https://doi.org/10.64748/8gkysx36