[Question]: Is there a problem with the JSON format in the automotive manual cross-modal intelligent Q&A example? #8385

Open
whysirier opened this issue May 8, 2024 · 1 comment
Labels
question Further information is requested

Comments

@whysirier

Please ask your question

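1) Running paddleocr directly on the first image gives the coordinates [[2530.0, 105.0], [2654.0, 105.0], [2654.0, 171.0], [2530.0, 171.0]]. Drawing them shows that the text in the upper-right corner is boxed correctly:
[Image: new_image]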

2) However, the demo_ocr_res.json file produced by following the example contains coordinates like this: "document_bbox": [[2530, 2592, 105, 175], [2530, 2592, 105, 175], [2592, 2654, 105, 175], [284, 463, 294, 386], [463, 552, 294, 386], ...., (remaining content omitted)
[Image: 企业微信截图_20240508105416]

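I don't understand whether the document_bbox values are the coordinates of the text. When I draw them directly on the image, they do not form boxes around the text. What exactly do these values mean?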
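
One reading that fits the numbers quoted above, though nothing in this thread confirms it, is that each document_bbox entry is an axis-aligned box stored as [x_min, x_max, y_min, y_max] for one piece of the recognized text, not a list of corner points. The first entries span x from 2530 to 2592 and from 2592 to 2654, which together cover the 2530 to 2654 range of the paddleocr box, and their y range of 105 to 175 is close to 105 to 171. A minimal sketch under that assumption is below; the helper names `bbox_to_polygon` and `merge_bboxes` are hypothetical and are not part of PaddleOCR or PaddleNLP.

```python
from typing import List, Sequence


def bbox_to_polygon(bbox: Sequence[float]) -> List[List[float]]:
    """Convert an ASSUMED [x_min, x_max, y_min, y_max] box into four corners.

    Corners are returned clockwise from the top-left, which is the order
    of the paddleocr box quoted in the question.
    """
    x_min, x_max, y_min, y_max = bbox
    return [[x_min, y_min], [x_max, y_min], [x_max, y_max], [x_min, y_max]]


def merge_bboxes(bboxes: Sequence[Sequence[float]]) -> List[float]:
    """Return the smallest box, in the same assumed order, covering all inputs."""
    return [
        min(b[0] for b in bboxes),  # x_min
        max(b[1] for b in bboxes),  # x_max
        min(b[2] for b in bboxes),  # y_min
        max(b[3] for b in bboxes),  # y_max
    ]


# The first three entries quoted from demo_ocr_res.json.
pieces = [[2530, 2592, 105, 175], [2530, 2592, 105, 175], [2592, 2654, 105, 175]]
merged = merge_bboxes(pieces)
polygon = bbox_to_polygon(merged)
print(merged)
print(polygon)
```

If the assumption holds, drawing `polygon` on the image should give a box close to the one paddleocr reported for the upper-right text. If it does not, the values follow a different layout and this sketch does not apply.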

@whysirier whysirier added the question Further information is requested label May 8, 2024
@w5688414
Contributor

w5688414 commented May 8, 2024

3 participants