Contrastive Cross-Lingual Calibration for Large Language Models
DOI:
https://doi.org/10.64748/8gkysx36

Keywords:
calibration, multilingual NLP, cross-lingual transfer, uncertainty, hallucination, evaluation

Abstract
Large language models (LLMs) are increasingly deployed in multilingual settings, yet their probability estimates are often miscalibrated, particularly for low-resource languages and code-switched inputs. We present C³, a post-hoc calibration framework that reduces cross-lingual miscalibration by optimizing language-aware temperature and bias parameters using contrastive counterfactuals generated via translation/back-translation and meaning-preserving perturbations. C³ aligns confidence across languages without retraining the base model. On classification (XNLI) and extractive/generative QA (XQuAD, MLQA, TyDi QA GoldP), C³ lowers Expected Calibration Error by 35–57% and Brier score by 9–18%, with modest accuracy gains (0.7–2.1 pp). For generative QA, the hallucination rate decreases by 21% while answer quality is maintained. Benefits are largest for Swahili and Arabic and persist under code-switching and spelling noise. Ablations show that contrastive counterfactuals and language-specific scaling both contribute, and that isotonic fusion improves the tails of the confidence distribution. We release calibration recipes and evaluation scripts to support responsible multilingual deployment.
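To illustrate the post-hoc, language-aware scaling described in the abstract, the following is a minimal sketch in Python. It fits only a per-language temperature and per-class bias by negative log-likelihood on held-out examples; it does not implement the contrastive counterfactual objective or the isotonic fusion step, and the function names (`fit_language_scaling`, `calibrated_log_probs`) and the per-class reading of the bias parameters are illustrative assumptions, not the paper's released recipe.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import log_softmax

def fit_language_scaling(logits, labels, langs):
    """Fit a per-language temperature and per-class bias by minimizing NLL
    on held-out data. Simplified stand-in for the full C3 objective, which
    additionally uses contrastive counterfactual pairs.

    logits: (n_examples, n_classes) raw model logits
    labels: (n_examples,) gold class indices
    langs:  length-n_examples list of language codes
    """
    num_classes = logits.shape[1]
    params = {}
    for lang in sorted(set(langs)):
        idx = np.array([i for i, l in enumerate(langs) if l == lang])
        z, y = logits[idx], labels[idx]

        def nll(theta):
            temp, bias = theta[0], theta[1:]
            scaled = z / max(temp, 1e-3) + bias   # language-aware scaling
            logp = log_softmax(scaled, axis=-1)
            return -logp[np.arange(len(y)), y].mean()

        init = np.concatenate([[1.0], np.zeros(num_classes)])
        res = minimize(nll, init, method="Nelder-Mead")
        params[lang] = (res.x[0], res.x[1:])
    return params

def calibrated_log_probs(logits, lang, params):
    """Apply the fitted parameters for the example's language at inference;
    unseen languages fall back to the identity transform."""
    temp, bias = params.get(lang, (1.0, np.zeros(logits.shape[-1])))
    return log_softmax(logits / max(temp, 1e-3) + bias, axis=-1)
```

Because the scaling is fit separately for each language on held-out data and applied only at inference, the base model's weights and predictions are left untouched, which is what allows confidence to be aligned across languages without retraining.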
License
Copyright (c) 2025 Marcello Conti (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.