Actors Image Classification: a machine learning model built with scikit-learn and deployed on AWS EC2

RishabhkmrRK/Actors-Image-Classification

Important Links:

  1. Dataset - Download from Google Images
  2. Model Training - Notebook
  3. Model deployment - AWS EC2 Instance
    • Note: Because the dataset is small, the model can only classify a clear image of a single person with both eyes visible.

Languages and Tools used:

Jupyter, OpenCV, Python, scikit-learn, Flask, Nginx, HTML, CSS, JavaScript, AWS, GitHub, PyCharm, Visual Studio Code


Overview

In this project, I trained and deployed a machine learning model that classifies images of the main characters of the TV show "The Big Bang Theory": Howard, Leonard, Raj and Sheldon. Google Images was used to gather the data, scikit-learn to train the model, and an Amazon Web Services EC2 instance to deploy it.

This is a supervised machine learning classification problem with 4 labels.

Workflow

ML model development process

  1. Data Preparation

    • Data Gathering/Web Scraping: The Fatkun Google Chrome extension was used to bulk download images of each actor from Google. [Raw Dataset]
    • Data Cleaning: After the raw dataset was downloaded, a Python script using the OpenCV library rejected every image that did not show a clear face with both eyes visible. Because OpenCV only checks for a proper face with two visible eyes, images of the wrong person had to be deleted manually from each actor's directory in the dataset. [Cleaning Code] [Cleaned Dataset]
  2. Feature Engineering [Notebook]

    • Feature Extraction: Each image was resized to 32x32 pixels, and PyWavelets was used to extract features from it. The wavelet-transformed image was then vertically stacked with the original one, giving a feature vector of length 4096 (32x32x3 + 32x32 = 3072 + 1024).
    • Data Scaling: The data was scaled to make it easier for the model to learn. Each feature value was in the range 0-255 (each pixel value is stored as an 8-bit integer) and was scaled to the 0-1 range using sklearn's MinMaxScaler.
  3. Model Building [Notebook]

    • Model Training: Since this is a classification problem with 4096 features, three sklearn classifiers were trained: Logistic Regression, Random Forest and Support Vector Machine.
    • Evaluation: sklearn's GridSearchCV was used to tune the hyperparameters of each classifier with cross-validation.
    • Model Selection: The model with the highest accuracy was selected: a Support Vector Machine with parameters {'svc__C': 1, 'svc__kernel': 'linear'}.
  4. Model Deployment

    • User Interface/Client: A web page built with HTML, CSS and JavaScript handles interaction with the user. It provides a drop box where the user uploads the image to be classified and gets the prediction back. Nginx is used for web hosting and as a reverse proxy. [UI]
    • Flask: Flask is a micro web framework written in Python. The server-side code receives the image from the client (user interface) and sends the prediction back, using the saved model's pickle file. [Server]
    • AWS EC2 Instance: An Amazon Web Services EC2 instance was used to deploy the model, together with all the server-side and client-side files. [EC2 Instance]
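
The feature-extraction step in the workflow above can be sketched roughly as follows. This is a minimal illustration, not the notebook's code: it assumes the image has already been resized to 32x32x3, and it swaps PyWavelets for a NumPy-only single-level Haar detail image so the example has no extra dependencies. The function names `haar_detail_image` and `build_feature_vector` are hypothetical, and the wavelet and decomposition level used in the notebook are not stated here. With PyWavelets, the same step would typically go through `pywt.wavedec2` and `pywt.waverec2`.

```python
import numpy as np


def haar_detail_image(gray):
    """Single-level Haar detail image (approximation band set to zero).

    For the orthonormal Haar wavelet, zeroing the approximation band and
    reconstructing is the same as subtracting each 2x2 block's mean.
    """
    h, w = gray.shape
    blocks = gray.reshape(h // 2, 2, w // 2, 2)
    block_means = blocks.mean(axis=(1, 3), keepdims=True)
    return (blocks - block_means).reshape(h, w)


def build_feature_vector(rgb_32):
    """Stack raw pixels (32*32*3) on top of a wavelet detail image (32*32)."""
    rgb_32 = np.asarray(rgb_32, dtype=np.float64)
    if rgb_32.shape != (32, 32, 3):
        raise ValueError("expected an image already resized to 32x32x3")
    gray = rgb_32.mean(axis=2)  # simple grayscale; an assumption of this sketch
    detail = np.abs(haar_detail_image(gray))  # magnitudes stay within 0-255
    stacked = np.vstack((rgb_32.reshape(32 * 32 * 3, 1),
                         detail.reshape(32 * 32, 1)))
    return stacked.ravel()


# Synthetic stand-in for one resized image (pixel values 0-255).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3))
features = build_feature_vector(image)
print(features.shape)  # (4096,)
```

Each image contributes one such vector, so a dataset of N images becomes an N x 4096 matrix that can be passed to the scaler and the classifiers.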

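The scaling, tuning and selection steps can be sketched in the same spirit. This is a hedged example under stated assumptions: the data is synthetic (64 features instead of 4096, to keep it fast), and the parameter grids are hypothetical rather than the ones used in the notebook. The `svc__` prefix in the selected parameters suggests the SVM sat inside an sklearn pipeline whose step is named `svc`, which is what `make_pipeline` produces for `SVC`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic 4-class data standing in for the image feature matrix.
X, y = make_classification(n_samples=200, n_features=64, n_informative=16,
                           n_classes=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=0, stratify=y)

# Each candidate is a (pipeline, parameter grid) pair; the grids are
# illustrative assumptions. make_pipeline names steps after the class,
# hence the 'svc__', 'randomforestclassifier__', ... prefixes.
candidates = {
    "svm": (
        make_pipeline(MinMaxScaler(), SVC()),
        {"svc__C": [1, 10], "svc__kernel": ["linear", "rbf"]},
    ),
    "random_forest": (
        make_pipeline(MinMaxScaler(), RandomForestClassifier(random_state=0)),
        {"randomforestclassifier__n_estimators": [10, 50]},
    ),
    "logistic_regression": (
        make_pipeline(MinMaxScaler(), LogisticRegression(max_iter=1000)),
        {"logisticregression__C": [1, 10]},
    ),
}

# Tune every candidate with 5-fold cross-validation on the training split.
results = {}
for name, (pipeline, grid) in candidates.items():
    search = GridSearchCV(pipeline, grid, cv=5)
    search.fit(X_train, y_train)
    results[name] = search

# Pick the candidate with the best cross-validated accuracy, then check it
# once on the held-out test split.
best_name = max(results, key=lambda n: results[n].best_score_)
best_model = results[best_name].best_estimator_
test_accuracy = best_model.score(X_test, y_test)
print(best_name, results[best_name].best_params_, round(test_accuracy, 3))
```

Keeping the scaler inside the pipeline means it is refit on each cross-validation training fold, so the validation folds do not leak into the scaling.
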
Conclusion

The model was successfully trained and deployed with 84% accuracy. However, because the dataset is small, it can only classify a clear image of a single person with both eyes visible.


Future Target

The next step is to implement the same classification model using deep learning (neural networks), with a much larger dataset and the goal of higher accuracy.
