Fine grained image classification with a Vision Transformer

sn2727/flower-classification

Vision Transformer for Flower Classification

The dataset used in this project is a custom subset of the Oxford 102 Flowers dataset. The notebook load_data.ipynb loads the data with TensorFlow; the size and split of the subset can be set in that notebook. The images are then saved locally so that flower_classification.ipynb can read them with torch.
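As a rough illustration of what "size and split" means here, the sketch below draws a random subset of image indices and cuts it into train, validation, and test parts. It uses only the standard library and is not the code from load_data.ipynb; the function name `subset_and_split`, its parameters, and the split sizes in the usage line are hypothetical.

```python
import random


def subset_and_split(n_items, n_train, n_val, n_test, seed=0):
    """Pick a random subset of indices and split it into train/val/test.

    Hypothetical helper for illustration; the notebook may define the
    subset and split differently.
    """
    total = n_train + n_val + n_test
    if total > n_items:
        raise ValueError("requested subset is larger than the dataset")
    # Seeded sampling without replacement keeps the subset reproducible.
    chosen = random.Random(seed).sample(range(n_items), total)
    train = chosen[:n_train]
    val = chosen[n_train:n_train + n_val]
    test = chosen[n_train + n_val:]
    return train, val, test


# Example: the full Oxford 102 Flowers dataset has 8189 images;
# the subset sizes below are made up for the illustration.
train_idx, val_idx, test_idx = subset_and_split(8189, 2000, 500, 500)
```

The three index lists are disjoint by construction, so no image ends up in more than one split.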

The notebook flower_classification.ipynb first preprocesses and explores the data. A Vision Transformer pretrained on ImageNet-21k (Google's ViT-small, 22M parameters) is then fine-tuned on the dataset and reaches an accuracy of around 95% on the test set. Hyperparameters were tuned loosely based on this paper.

SGD with a momentum of 0.9 is used as the optimizer, starting from a learning rate of 0.001. The learning rate is then decayed with a cosine schedule incorporating warm starts.
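The shape of such a schedule can be sketched in plain Python. This assumes that "warm starts" refers to cosine annealing with warm restarts, where the learning rate periodically jumps back to its initial value; if the notebook uses a warmup phase instead, the curve looks different. The initial learning rate of 0.001 comes from the text above, while `cycle_len`, `cycle_mult`, and `min_lr` are assumed values, not settings taken from the notebook.

```python
import math


def cosine_warm_restart_lr(step, base_lr=0.001, min_lr=0.0,
                           cycle_len=10, cycle_mult=1):
    """Learning rate at `step` under cosine annealing with warm restarts.

    cycle_len is the length of the first cycle (assumed value) and
    cycle_mult multiplies the cycle length after each restart.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    t_cur, t_i = step, cycle_len
    # Walk forward through completed cycles to find the position
    # inside the current one.
    while t_cur >= t_i:
        t_cur -= t_i
        t_i *= cycle_mult
    cosine = 1 + math.cos(math.pi * t_cur / t_i)
    return min_lr + 0.5 * (base_lr - min_lr) * cosine


# Learning rates for the first 20 steps: two full cosine cycles.
schedule = [cosine_warm_restart_lr(s) for s in range(20)]
```

In PyTorch, `torch.optim.lr_scheduler.CosineAnnealingWarmRestarts` provides a schedule of this kind for an optimizer such as `torch.optim.SGD`.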
