[Feature] Support DSDL Dataset #1503
base: dev
wufan-tb
commented
Apr 19, 2023
- Support DSDL classification datasets
- Add a CI test for the DSDL dataset
- Validated accuracy on CIFAR-10 and ImageNet-1k
Codecov Report
Patch coverage has no change; project coverage change: +0.46%.
Additional details and impacted files (coverage diff, dev vs. #1503):
| Metric | dev | #1503 | +/- |
| :------: | :----: | :----: | :----: |
| Coverage | 84.37% | 84.83% | +0.46% |
| Files | 142 | 231 | +89 |
| Lines | 9925 | 17589 | +7664 |
| Branches | 1621 | 2764 | +1143 |
| Hits | 8374 | 14922 | +6548 |
| Misses | 1277 | 2149 | +872 |
| Partials | 274 | 518 | +244 |
Where are 'dsdl/set-train/train.yaml' and 'dsdl/set-val/val.yaml'?
54a6ff7 to a1cb1d6
| Datasets | Model | Top-1 Acc (%) | Config |
| :---------: | :-------------------------------------------------------------------------------------------------------------: | :-----------: | :-----------------------: |
| cifar10 | [model](https://download.openmmlab.com/mmclassification/v0/resnet/resnet18_b16x8_cifar10_20210528-bd6371c8.pth) | 94.83 | [config](./cifar10.py) |
| ImageNet-1k | [model](https://download.openmmlab.com/mmclassification/v0/resnet/resnet18_8xb32_in1k_20210831-fbbb1da6.pth) | 69.84 | [config](./imagenet1k.py) |
I am afraid that the ImageNet category mapping used by DSDL is not compatible with mmpretrain, so the result cannot be reproduced as-is. It can be reproduced by remapping the labels according to ILSVRC2012_mapping.txt with the following code:
def load_data_list(self):
    # ...
    # For ImageNet: read the class-id -> folder-name mapping
    id2name = {}
    folders = []
    with open('ILSVRC2012_mapping.txt', 'r') as f:
        for line in f.readlines()[:1000]:
            # strip() drops the trailing newline without cutting a real character
            cid, name = line.strip().split()
            id2name[int(cid)] = name
            folders.append(name)
    # Sort the folder names to get the class order expected by the pretrained model
    folders.sort()
    # ...
    label_index = data['Label'][0].index_in_domain() - 1
    name = id2name[label_index + 1]
    label_index = folders.index(name)
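The same remapping can be sketched as a standalone function that builds the lookup table once, so each sample needs only a dictionary access instead of a `list.index` scan. This is a minimal sketch, not code from this PR: `build_label_remap` is a hypothetical helper, and the three mapping lines are toy data in the assumed `<id> <folder>` format of ILSVRC2012_mapping.txt.

```python
import io


def build_label_remap(mapping_lines, num_classes=1000):
    """Map 1-based class ids from the mapping file to 0-based indices
    in the list of sorted folder names (hypothetical helper)."""
    id2name = {}
    for line in list(mapping_lines)[:num_classes]:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        cid, name = parts
        id2name[int(cid)] = name
    # Index of each folder name after sorting
    name2index = {name: i for i, name in enumerate(sorted(id2name.values()))}
    return {cid: name2index[name] for cid, name in id2name.items()}


# Toy stand-in for the mapping file (assumed format: "<id> <folder>")
sample = io.StringIO("1 n02119789\n2 n02100735\n3 n02110185\n")
remap = build_label_remap(sample)
print(remap)
```

With such a table, the per-sample step would reduce to `label_index = remap[data['Label'][0].index_in_domain()]`, assuming `index_in_domain()` returns the 1-based id as in the snippet above.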
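Here is the situation: when DSDL converts the dataset, the class order does end up different from the original ImageNet, so testing with a previously pretrained model gives inconsistent results. The numbers in the table, however, were obtained after I aligned the class order in the loading code, so they do match. That alignment step is specific to ImageNet, so I removed it when merging.
In fact, if a model is retrained with DSDLDataset, no order alignment is needed and the accuracy still matches, so the code does not need this alignment step.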
To avoid ambiguity, how about removing these two table rows?