About od model to generate coco_caption features. I cannot reproduce your feature results. #181

Open
Jason-fan20 opened this issue Jan 7, 2022 · 3 comments

Comments

@Jason-fan20

Jason-fan20 commented Jan 7, 2022

I tried to reproduce the results for VinVL+SCST on NoCaps, but my result was off by a visible margin.

I generated COCO features with the OD model, using the pre-trained checkpoint vinvl_vg_x152c4.pth and the config file val_vinvl_x152c4.yaml. A quick check of the generated tags in train.label.tsv shows 1319 distinct tag types. I then restricted the tags to the 500 Open Images labels, which left about 400 distinct types in my train.label.tsv, and my result was still off by a visible margin: I only get a SPICE score of 12.1 after cross-entropy training followed by CIDEr optimization.

With the provided coco_caption training set, train.label.tsv has 498 tag types in total, and I can reproduce your result of 12.8 SPICE as expected.
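For anyone who wants to run the same check, here is a minimal sketch of how the tag types could be counted and restricted to a fixed vocabulary. It assumes each row of train.label.tsv is `image_id<TAB>JSON`, where the JSON is a list of objects with a `"class"` field (or a dict wrapping that list under `"objects"`); that layout is an assumption and should be verified against your own file. The function names `count_tag_types` and `restrict_to_vocab` are hypothetical helpers, not part of the Oscar or VinVL code.

```python
import json
from collections import Counter


def _parse_row(row):
    """Split one TSV row into (image_id, list of object dicts).

    Assumed layout: image_id <TAB> JSON list of {"class": ..., ...}.
    A dict of the form {"objects": [...]} is also accepted.
    """
    image_id, payload = row.rstrip("\n").split("\t", 1)
    objects = json.loads(payload)
    if isinstance(objects, dict):
        objects = objects.get("objects", [])
    return image_id, objects


def count_tag_types(rows):
    """Return a Counter mapping each tag class to its number of occurrences."""
    counts = Counter()
    for row in rows:
        if not row.strip():
            continue  # skip blank lines
        _, objects = _parse_row(row)
        counts.update(obj["class"] for obj in objects)
    return counts


def restrict_to_vocab(rows, vocab):
    """Yield TSV rows keeping only objects whose class is in vocab.

    Matching is case-insensitive, which is a choice made for this sketch.
    """
    allowed = {name.lower() for name in vocab}
    for row in rows:
        if not row.strip():
            continue
        image_id, objects = _parse_row(row)
        kept = [obj for obj in objects if obj["class"].lower() in allowed]
        yield f"{image_id}\t{json.dumps(kept)}"
```

Passing an open file handle works because the functions only iterate over lines, for example `len(count_tag_types(open("train.label.tsv")))` gives the number of distinct tag types. Comparing that number before and after `restrict_to_vocab` shows how many types survive the vocabulary filter.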

Which model should I use to reproduce your OD feature results? Could you please give me a link to this OD model, so that I can reproduce your features and generate similar ones?

If it is not available, should I train an OD model on Visual Genome and Open Images myself?

Many thanks!

(´ー∀ー)(´ー∀ー)(´ー∀ー`)

@xiaoweihu

Hi, the OD features are generated from the model you linked, vinvl_vg_x152c4.pth.
If you need the features on COCO or nocaps, you can download the pre-extracted features and labels at https://github.com/microsoft/Oscar/blob/master/VinVL_DOWNLOAD.md#datasets

@pzzhang
Contributor

pzzhang commented Jan 12, 2022

@309018451 The image tags used in both NoCaps training and testing are generated from an OD model pretrained on the OpenImages dataset, not from the model vinvl_vg_x152c4.pth.

We cannot release the OD model pretrained on the OpenImages dataset, but you should be able to train your own OpenImages OD model or use a public OpenImages OD model.

@Jason-fan20
Author

@pzzhang @xiaoweihu
Thank you so much!
I'll try and make it work in my case.
Again, thanks a lot 💯
