Add two visual instruction generation methods (with visual instruction datasets) and a visual hallucination benchmark. #120

sunggukcha · 2024-02-21T08:14:02Z

Thank you for your efforts.

I have added two papers.

Visually Dehallucinative Instruction Generation: Know What You Don't Know, https://arxiv.org/abs/2402.09717
It proposes (1) the notion of "I Know" visual hallucination, (2) an "I Know" hallucination benchmark, and (3) a solution: an "I Don't Know" visual instruction dataset, together with its generation method.
Visually Dehallucinative Instruction Generation, https://arxiv.org/abs/2402.08348
It proposes (1) a dehallucinative visual instruction generation method that keeps generated content from going beyond the provided facts and (2) the resulting dehallucinative visual instruction dataset.

Regards,
Sungguk Cha

xjtupanda · 2024-02-26T09:26:51Z

These works have been added to our repo.
Please consider citing our works:

@article{yin2023survey,
  title={A Survey on Multimodal Large Language Models},
  author={Yin, Shukang and Fu, Chaoyou and Zhao, Sirui and Li, Ke and Sun, Xing and Xu, Tong and Chen, Enhong},
  journal={arXiv preprint arXiv:2306.13549},
  year={2023}
}

@article{fu2023mme,
  title={MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models},
  author={Fu, Chaoyou and Chen, Peixian and Shen, Yunhang and Qin, Yulei and Zhang, Mengdan and Lin, Xu and Yang, Jinrui and Zheng, Xiawu and Li, Ke and Sun, Xing and Wu, Yunsheng and Ji, Rongrong},
  journal={arXiv preprint arXiv:2306.13394},
  year={2023}
}

@article{yin2023woodpecker,
  title={Woodpecker: Hallucination Correction for Multimodal Large Language Models},
  author={Yin, Shukang and Fu, Chaoyou and Zhao, Sirui and Xu, Tong and Wang, Hao and Sui, Dianbo and Shen, Yunhang and Li, Ke and Sun, Xing and Chen, Enhong},
  journal={arXiv preprint arXiv:2310.16045},
  year={2023}
}

@article{fu2023gemini,
  title={A Challenger to GPT-4V? Early Explorations of Gemini in Visual Expertise},
  author={Fu, Chaoyou and Zhang, Renrui and Wang, Zihan and Huang, Yubo and Zhang, Zhengye and Qiu, Longtian and Ye, Gaoxiang and Shen, Yunhang and Zhang, Mengdan and Chen, Peixian and Zhao, Sirui and Lin, Shaohui and Jiang, Deqiang and Yin, Di and Gao, Peng and Li, Ke and Li, Hongsheng and Sun, Xing},
  journal={arXiv preprint arXiv:2312.12436},
  year={2023}
}

Update README.md

af1c61c
