Use of ocr in Evaluation #95

Open
bruceisme opened this issue Apr 27, 2024 · 1 comment

Comments

bruceisme commented Apr 27, 2024

In Appendix A, under Image-text Data Collection, the paper states: "It is important to note that the OCR detector is utilized solely for generating enriched data and is not employed during testing". However, the TextVQA script uses llava_textvqa_val_v051_ocr.jsonl, which contains OCR tokens. Have you ever tested TextVQA without OCR, and was the result worse than with llava_textvqa_val_v051_ocr.jsonl? Can we conclude that the model gets better results with OCR input?

@yanwei-li
Member

Hi, the statement in Appendix A means that we do not run an extra PaddleOCR detector during evaluation. For TextVQA, we keep the OCR tokens in the prompt, the same as in LLaVA. The result should be worse without the original OCR tokens.
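
For anyone who wants to run the no-OCR comparison themselves, here is a minimal sketch of how a question file without OCR tokens could be produced. It is not part of the repository's scripts. It assumes each line of the JSONL file is a JSON object whose prompt lives under a `text` key, and that the OCR tokens sit on their own line starting with `Reference OCR token:`. Both are assumptions about the file layout, and the function names are hypothetical, so check a few records of your file before relying on it.

```python
import json

# Assumed prefix of the OCR line inside each prompt; verify against your file.
OCR_PREFIX = "Reference OCR token:"


def strip_ocr_line(prompt: str, prefix: str = OCR_PREFIX) -> str:
    """Drop every line of the prompt that starts with the OCR prefix."""
    kept = [line for line in prompt.split("\n")
            if not line.strip().startswith(prefix)]
    return "\n".join(kept)


def strip_ocr_from_jsonl(src_path: str, dst_path: str,
                         text_key: str = "text") -> int:
    """Copy a JSONL question file without OCR lines.

    Returns the number of records whose prompt was changed.
    `text_key` is an assumed field name, not confirmed by the thread.
    """
    changed = 0
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for raw in src:
            if not raw.strip():
                continue  # skip blank lines
            record = json.loads(raw)
            original = record.get(text_key, "")
            cleaned = strip_ocr_line(original)
            if cleaned != original:
                record[text_key] = cleaned
                changed += 1
            dst.write(json.dumps(record, ensure_ascii=False) + "\n")
    return changed
```

Pointing the TextVQA evaluation at the file written by `strip_ocr_from_jsonl` would then give the no-OCR number to compare against the default run. If the function returns 0, the assumed prefix or key does not match your file.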
