[Question]: Is the JSON format in the car-manual cross-modal question answering example wrong? #8385

whysirier · 2024-05-08T02:55:40Z

Please ask your question

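1) Running paddleocr directly on the first image gives the coordinates [[2530.0, 105.0], [2654.0, 105.0], [2654.0, 171.0], [2530.0, 171.0]], and drawing them shows that the text in the upper-right corner is boxed correctly.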

2) However, in the demo_ocr_res.json file produced by following the example, the coordinates look like this instead: "document_bbox": [[2530, 2592, 105, 175], [2530, 2592, 105, 175], [2592, 2654, 105, 175], [284, 463, 294, 386], [463, 552, 294, 386], ...., (remaining content omitted)

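I don't understand whether the values in document_bbox are text coordinates. When I draw them directly on the image as coordinates, they do not form boxes around the text. What exactly do these values mean?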
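One reading that fits the numbers quoted above, though nothing in this thread confirms it, is that each document_bbox entry is laid out as [x_min, x_max, y_min, y_max] for a single token, rather than as the four corner points paddleocr returns. The sketch below converts entries to corner points under that assumption so they can be drawn and compared; `bbox_to_corners` and `union_corners` are hypothetical helper names, not part of PaddleOCR or PaddleNLP.

```python
def bbox_to_corners(bbox):
    """Convert one entry to four corner points.

    ASSUMPTION: the entry is [x_min, x_max, y_min, y_max]. This layout is
    inferred from the values in the question, not from documentation.
    """
    x_min, x_max, y_min, y_max = bbox
    # Same corner order as the paddleocr result quoted in the question:
    # top-left, top-right, bottom-right, bottom-left.
    return [[x_min, y_min], [x_max, y_min], [x_max, y_max], [x_min, y_max]]


def union_corners(bboxes):
    """Merge several token boxes (same assumed layout) into one enclosing box."""
    x_min = min(b[0] for b in bboxes)
    x_max = max(b[1] for b in bboxes)
    y_min = min(b[2] for b in bboxes)
    y_max = max(b[3] for b in bboxes)
    return bbox_to_corners([x_min, x_max, y_min, y_max])


# First three entries of document_bbox as quoted in the question.
document_bbox = [
    [2530, 2592, 105, 175],
    [2530, 2592, 105, 175],
    [2592, 2654, 105, 175],
]

corners = [bbox_to_corners(b) for b in document_bbox]
merged = union_corners(document_bbox)
print(merged)
```

Under this reading, the first three entries merge into a box spanning x 2530 to 2654 and y 105 to 175, which lines up with the paddleocr box in 1) on the x axis and differs by only a few pixels on the y axis. If the values were instead interpreted as [x1, y1, x2, y2], the box would not land on the text, which would be consistent with the boxes not matching the text when drawn directly.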

w5688414 · 2024-05-08T06:18:57Z

Please refer to the AI Studio tutorial:

https://aistudio.baidu.com/projectdetail/4049663?channelType=0&channel=0

whysirier added the "question" label (Further information is requested) · May 8, 2024

paddle-bot (bot) assigned KB-Ding · May 8, 2024
