[NLPCC 2024] Shared Task 10: Regulating Large Language Models
An up-to-date, curated list of awesome state-of-the-art research on LVLM hallucinations: papers & resources
Loki: an open-source solution designed to automate factuality verification
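Automated factuality checkers like Loki typically follow a decompose → retrieve → verify loop. Below is a minimal Python sketch of that general pattern; every function is a placeholder heuristic, and the names and logic are illustrative assumptions, not Loki's actual API:

```python
from dataclasses import dataclass

@dataclass
class Verdict:
    claim: str
    supported: bool
    evidence: str

def decompose(text: str) -> list[str]:
    # Placeholder: real tools use an LLM to split text into atomic claims.
    return [s.strip() for s in text.split(".") if s.strip()]

def retrieve(claim: str, corpus: list[str]) -> str:
    # Placeholder: real tools query a search engine or a vector index.
    words = set(claim.lower().split())
    return max(corpus, key=lambda p: len(words & set(p.split())))

def verify(claim: str, evidence: str) -> Verdict:
    # Placeholder: real tools ask an LLM whether the evidence entails the claim.
    content = [w for w in claim.lower().split() if len(w) > 2]
    return Verdict(claim, all(w in evidence.split() for w in content), evidence)

corpus = ["the eiffel tower is in paris and opened in 1889"]
answer = "The Eiffel Tower is in Paris. It opened in 1925."
for claim in decompose(answer):
    v = verify(claim, retrieve(claim, corpus))
    print(("PASS" if v.supported else "FAIL"), "-", v.claim)
```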
Controlled HALlucination-Evaluation (CHALE) Question-Answering Dataset
[ACL 2024] Benchmarking the Hallucination of Chinese Large Language Models via Unconstrained Generation
An OpenAI Assistant using the Code Interpreter tool
Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without a custom rubric, a reference answer, absolute or relative grading, and much more. It also contains a list of all the available tools, methods, repos, and code for hallucination detection, LLM evaluation, grading, and more.
[ACL 2024] An Easy-to-use Hallucination Detection Framework for LLMs.
RefChecker provides an automatic checking pipeline and a benchmark dataset for detecting fine-grained hallucinations generated by Large Language Models.
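RefChecker's distinguishing idea is checking claims at the granularity of knowledge triplets. A toy illustration of triplet-level labeling under the common Entailment / Contradiction / Neutral scheme follows; the heuristic is a stand-in for the LLM- or NLI-based checker a real pipeline would use, and the names are assumptions, not RefChecker's implementation:

```python
Triplet = tuple[str, str, str]  # (subject, predicate, object)

def check_triplet(triplet: Triplet, reference: str) -> str:
    """Label a claim triplet against a reference text.
    Placeholder heuristic; real checkers use an LLM or an NLI model."""
    subj, _pred, obj = (t.lower() for t in triplet)
    ref = reference.lower()
    if subj in ref and obj in ref:
        return "Entailment"
    if subj in ref:
        return "Contradiction"  # subject is covered, object unsupported
    return "Neutral"            # reference says nothing about this claim

reference = "Marie Curie won the Nobel Prize in Physics in 1903."
triplets = [
    ("Marie Curie", "won", "Nobel Prize in Physics"),
    ("Marie Curie", "won", "Fields Medal"),
    ("Albert Einstein", "won", "Nobel Prize"),
]
for t in triplets:
    print(t, "->", check_triplet(t, reference))
```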
[IJCAI 2024] FactCHD: Benchmarking Fact-Conflicting Hallucination Detection
QuantHaLL: Quantifying Hallucination in machine translation for Low-resource Languages
Verify outputs generated by LLMs against real-time data
Awesome-LLM-Robustness: a curated list of Uncertainty, Reliability and Robustness in Large Language Models
This is the official repo for Debiasing Large Visual Language Models, including a post-hoc debiasing method and a Visual Debias Decoding strategy.
[NAACL24] Official Implementation of Mitigating Hallucination in Abstractive Summarization with Domain-Conditional Mutual Information
TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space
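To make the representation-editing idea concrete, here is a deliberately simplified activation-steering sketch: compute a direction that separates truthful from untruthful hidden states and nudge activations along it at inference time. This toy with random vectors is only an analogy; TruthX itself learns the truthful space with an autoencoder and contrastive training:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 16

# Stand-ins for hidden states collected on truthful vs. untruthful answers.
truthful = rng.normal(0.5, 1.0, size=(100, dim))
untruthful = rng.normal(-0.5, 1.0, size=(100, dim))

# A simple "truthful direction": difference of the class means, normalized.
direction = truthful.mean(axis=0) - untruthful.mean(axis=0)
direction /= np.linalg.norm(direction)

def edit(hidden_state: np.ndarray, strength: float = 1.0) -> np.ndarray:
    """Nudge a hidden state along the truthful direction at inference time."""
    return hidden_state + strength * direction

h = rng.normal(size=dim)
print("projection before:", h @ direction)
print("projection after: ", edit(h, strength=2.0) @ direction)
```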
😎 An up-to-date, curated list of awesome LMM hallucination papers, methods & resources.
"Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases" by Jiarui Li and Ye Yuan and Zehua Zhang
[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models