TY - GEN
T1 - Practical Fact Checking System for LLMs
AU - Fuchs, Gilad
AU - Zinman, Oded
AU - Ben-Shaul, Ido
N1 - Publisher Copyright:
© 2025 Copyright held by the owner/author(s). Publication rights licensed to ACM.
PY - 2025/5/23
Y1 - 2025/5/23
N2 - The use of Large Language Models (LLMs) such as ChatGPT in real-world product solutions is significantly limited by the well-known issue of hallucinations. Various methods exist to overcome this issue automatically, such as using a different LLM to provide feedback on the accuracy of the generated text or examining the output consistency across multiple sampled responses. However, these approaches do not guarantee factual accuracy, which is crucial in many specialized domains. To enhance the factual correctness of text generated by LLMs, a combination of manual annotations and supportive tools is required. We propose a practical fact-checking system tailored specifically for LLMs that employs a hybrid (human and machine) approach to evaluate the correctness of the generated text. This is particularly vital in fields where hallucinations present significant challenges. We use proprietary LLMs, both directly and through Retrieval-Augmented Generation (RAG), to offer users informed feedback on potential hallucinations via a user-friendly interface. We apply our methodology to the task of generating aspect values for video game listings on an e-commerce marketplace, demonstrating the utility of our approach.
KW - Fact-checking
KW - Hallucinations
KW - Large Language Models
UR - https://www.scopus.com/pages/publications/105009243224
U2 - 10.1145/3701716.3717849
DO - 10.1145/3701716.3717849
M3 - Conference contribution
AN - SCOPUS:105009243224
T3 - WWW Companion 2025 - Companion Proceedings of the ACM Web Conference 2025
SP - 2713
EP - 2716
BT - WWW Companion 2025 - Companion Proceedings of the ACM Web Conference 2025
PB - Association for Computing Machinery, Inc.
T2 - 34th ACM Web Conference, WWW Companion 2025
Y2 - 28 April 2025 through 2 May 2025
ER -