Kaur, N and Choudhury, M and Pruthi, D (2024) Evaluating Large Language Models for Health-related Queries with Presuppositions. In: Findings of the 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024, 11 August 2024-16 August 2024, Bangkok, pp. 14308-14331.
Abstract
As corporations rush to integrate large language models (LLMs) into their search offerings, it is critical that they provide factually accurate information that is robust to any presuppositions that a user may express. In this work, we introduce UPHILL, a dataset consisting of health-related queries with varying degrees of presuppositions. Using UPHILL, we evaluate the factual accuracy and consistency of InstructGPT, ChatGPT, GPT-4 and Bing Copilot models. We find that while model responses rarely contradict true health claims (posed as questions), all investigated models fail to challenge false claims. Alarmingly, responses from these models agree with 23-32% of the existing false claims, and with 49-55% of novel fabricated claims. As we increase the extent of presupposition in input queries, responses from all models except Bing Copilot agree with the claim considerably more often, regardless of its veracity. Given the moderate factual accuracy, and the inability of models to challenge false assumptions, our work calls for a careful assessment of current LLMs for use in high-stakes scenarios. © 2024 Association for Computational Linguistics.
| Item Type: | Conference Paper |
|---|---|
| Publication: | Proceedings of the Annual Meeting of the Association for Computational Linguistics |
| Publisher: | Association for Computational Linguistics (ACL) |
| Additional Information: | The copyright for this article belongs to the publisher. |
| Keywords: | Computational linguistics; Query languages, 'current; Health claims; Language model; Model response; Query response, Structured Query Language |
| Department/Centre: | Division of Interdisciplinary Sciences > Computational and Data Sciences |
| Date Deposited: | 26 Oct 2024 08:28 |
| Last Modified: | 26 Oct 2024 08:28 |
| URI: | http://eprints.iisc.ac.in/id/eprint/86555 |