Using the CLEVR dataset #169

Open
LiJiaqi96 opened this issue Apr 29, 2024 · 15 comments

Comments

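@LiJiaqi96

Hello, which dataset exactly is image_reasoning - clevr? Following the citation in the paper, I found https://cs.stanford.edu/people/jcjohns/clevr/ and downloaded [CLEVR v1.0 (18 GB)], but after extracting it, the images do not correspond to the format in the JSON.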

@Andy1621
Collaborator

Hello, all of the image data comes from what M3IT provides.

@LiJiaqi96
Author

Thanks. I looked at M3IT, and in its JSON each image is one long string. How do I map these to the "train/39065.jpg" form used by VideoChat2?

@Andy1621
Collaborator

Following the annotations provided by M3IT, we generated idx.jpg from the sequence index idx.

@LiJiaqi96
Author

I don't quite follow... How do I match the "image_str" in M3IT to the specific image names in the CLEVR dataset?

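@Andy1621
Collaborator

Andy1621 commented Apr 30, 2024

image_str is a base64 string and can be read directly. We converted it to an RGB image. The image name is generated from the idx obtained by iterating over the M3IT data in a for loop; it is not derived from the original CLEVR data.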
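A minimal sketch of the naming scheme described in this reply, assuming each record is a dict that holds the base64 string under the key "image_str". The function name `save_images_by_loop_index` and the `key` parameter are hypothetical and are not taken from the VideoChat2 code. The sketch writes the decoded bytes unchanged; re-encoding to RGB, as mentioned in the reply, would additionally need an image library such as Pillow and is omitted here.

```python
import base64
import os


def save_images_by_loop_index(records, out_dir, key="image_str"):
    """Decode each record's base64 image and write it as <idx>.jpg.

    `key` is an assumption about the field name; adjust it to the data.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    # The file name comes only from the position in the iteration order.
    for idx, record in enumerate(records):
        raw = base64.b64decode(record[key])
        path = os.path.join(out_dir, f"{idx}.jpg")
        with open(path, "wb") as fh:
            fh.write(raw)
        paths.append(path)
    return paths
```

Because the name depends only on iteration order, the exported files line up with another annotation file only if both sides iterate over the records in the same order.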

@LiJiaqi96
Author

Got it! Your idx is the index obtained by iterating over the data after loading it with datasets, right?

@Andy1621
Collaborator

Yes, that's right.

@LiJiaqi96
Author

OK, thank you for the explanation.

@LiJiaqi96
Author

I still ran into some problems when exporting the images, so I need to ask you again. Here is my code:

import base64
import os

import datasets

save_dir = "clevr_M3IT"
ds = datasets.load_dataset("./datasets/M3IT/", "clevr", split="train", streaming=True)
cur_dir = os.path.join(save_dir, "train")
os.makedirs(cur_dir, exist_ok=True)  # make sure the output directory exists

# Name each image after its position in the iteration order.
for i, d in enumerate(ds):
    image = base64.decodebytes(d["image_base64_str"][0].encode())
    with open(os.path.join(cur_dir, f"{i}.jpg"), "wb") as fh:
        fh.write(image)

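After exporting some images, I manually checked a few of them and found that they do not match the QA in OpenGVLab/VideoChat2-IT, which you released on Hugging Face. For example, train/90.jpg: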
90
[ { "a": "The answer is cylinder.", "i": "Analyze the given image and respond to the associated question with a correct answer.", "q": "There is a green object that is behind the small rubber cylinder that is to the left of the matte cylinder to the right of the gray thing; what is its shape?" } ]

@Andy1621
Collaborator

Strange, that is not the image we have on our side. I'll ask the colleague who processed the data at the time to take a look.

@LiJiaqi96
Author

OK, thanks~

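@Andy1621
Collaborator

Hello, I asked my colleague to check. For some datasets (such as CLEVR), the meta information provided by M3IT contains image_index; for the other datasets, the name comes from the for-loop index.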

@LiJiaqi96
Author

I see. However, I don't seem to find image_index in the CLEVR metadata. The code is:

ds = datasets.load_dataset("./datasets/M3IT/", "clevr", split="train", streaming=True)
ds.info

@yinanhe
Member

yinanhe commented May 6, 2024

> I see. However, I don't seem to find image_index in the CLEVR metadata. The code is:
>
> ds = datasets.load_dataset("./datasets/M3IT/", "clevr", split="train", streaming=True)
> ds.info

Sorry, I only just saw this question. We read the data by directly downloading the jsonl file from the Hugging Face dataset repo.
[image]
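A minimal sketch of reading a downloaded jsonl file and naming each image by image_index where it exists, falling back to the loop index otherwise, as the maintainers describe. The record layout is an assumption: the sketch supposes each line is a JSON object with the base64 image under "image_str" and an optional "meta" object that may hold "image_index". The function name `export_jsonl_images` is hypothetical; check the actual field names in the file before relying on it.

```python
import base64
import json
import os


def export_jsonl_images(jsonl_path, out_dir, image_key="image_str"):
    """Write one <name>.jpg per jsonl record and return the file names.

    Assumed layout: {"image_str": "<base64>", "meta": {"image_index": ...}}.
    """
    os.makedirs(out_dir, exist_ok=True)
    names = []
    with open(jsonl_path, "r", encoding="utf-8") as fh:
        # Skip blank lines so the fallback index counts real records only.
        records = (json.loads(line) for line in fh if line.strip())
        for loop_idx, record in enumerate(records):
            meta = record.get("meta") or {}
            # Prefer the index stored in the record; else use loop order.
            idx = meta.get("image_index", loop_idx)
            name = f"{idx}.jpg"
            with open(os.path.join(out_dir, name), "wb") as out:
                out.write(base64.b64decode(record[image_key]))
            names.append(name)
    return names
```

Reading the raw file keeps any per-record fields that a dataset loading script might drop, which may be why image_index was not visible through `datasets.load_dataset`.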

@LiJiaqi96
Author

It works now! Just to confirm, you used train.jsonl from the Hugging Face dataset repo (and not train_2023-10-07.jsonl), right?
https://huggingface.co/datasets/MMInstruction/M3IT/tree/main/data/reasoning/clevr
