OLESIA KHRAPUNOVA. Unified Benchmark for Evaluating Performance, Bias, and Consistency in LLM Binary Question Answering. International Journal of Computer (IJC), Jordan, v. 56, n. 1, p. 319–338, 2025. Available at: https://ijcjournal.org/InternationalJournalOfComputer/article/view/2470. Accessed: 9 Jan. 2026.