Skip to content

Latest commit

 

History

History
90 lines (78 loc) · 3.2 KB

CustomDatasetTutorial.md

File metadata and controls

90 lines (78 loc) · 3.2 KB

Tutorial for finetuning on a custom dataset

Step 1. Convert the custom dataset to coco format.

  • The coco format can be found in Link. Specifically, the coco annotation file (.json) includes three necessary feilds, i.e., image, annotation, categories. A toy sample (named toy_sample.json) is provided below:
{
  "categories": 
  [{
      "supercategory": "person", 
      "id": 1, 
      "name": "person"
  }], 
 "images": 
  [{
      "license": 1, 
      "file_name": "000000425226.jpg",        
      "coco_url": "http://images.cocodataset.org/val2017/000000425226.jpg", 
      "height": 640, 
      "width": 480, 
      "date_captured": 
      "2013-11-14 21:48:51", 
      "flickr_url": 
      "http://farm5.staticflickr.com/4055/4546463824_bc40e0752b_z.jpg", 
      "id": 1
  }], 
 "annotations": 
  [{
      "image_id": 1, 
      "category_id": 1, 
      "segmentation": [], 
      "area": 47803.279549999985, 
      "iscrowd": 0, 
      "bbox": [73.35, 206.02, 300.58, 372.5], 
      "id": 1
  }]
}
  • Then we can organize the custom dataset (including images and annotations) as follows:
├── Custom_coco
│   ├── annotations
│   │   └── toy_sample.json
│   ├── images
│   │   └── 000000425226.jpg

Step2. Link custom dataset into DAMO-YOLO

  • Link your dataset into datasets.
ln -s path/to/Custom_coco datasets/toy_sample
  • Add the custom dataset into damo/config/paths_catalog.py. Note, the added dataset should contain coco in their names to declare the dataset format, e.g., here we use sample_train_coco and sample_test_coco.
'sample_train_coco': {
    'img_dir': 'toy_sample/images',
    'ann_file': 'toy_sample/annotations/toy_sample.json'
},
'sample_test_coco': {
    'img_dir': 'toy_sample/images',
    'ann_file': 'toy_sample/annotations/toy_sample.json'
},

Step3. Modify the config file.

In this tutorial, we finetune on DAMO-YOLO-Tiny as example.

  • Download the DAMO-YOLO-Tiny torch model from Model Zoo
  • Add the following pretrained model path into damoyolo_tinynasL20_T.py.
self.train.finetune_path='path/to/damoyolo_tinynasL20_T.pth'
  • Modify the custom dataset in config file. Change coco_2017_train and coco_2017_test in damoyolo_tinynasL20_T.py to sample_train_coco and sample_test_coco respectively.

    self.dataset.train_ann = ('coco_2017_train', )
    self.dataset.val_ann = ('coco_2017_val', )

  • Modify the category number in config file. Change 'num_classes': 80 in damoyolo_tinynasL20_T.py to 'num_classes': 1. Because in our toy sample, there is only one category, so we set num_classes to 1.

    ZeroHead = {
    'name': 'ZeroHead',
    'num_classes': 80,

  • Modify the list of class names.

    self.dataset.class_names = ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush']

Step4. Finetune on custom dataset

You can run the finetuning with the following code:

python -m torch.distributed.launch --nproc_per_node=8 tools/train.py -f configs/damoyolo_tinynasL20_T.py