A Python (Flask), PostgreSQL, and AWS based web application that uses machine learning to recommend movies that you'll love or movies that you'll hate.

theodoremoreland/MovieIon

Movie Ion

Movie Ion (originally named Movie Matchmaker) was a group project for Washington University's Data Analytics Boot Camp (2019). For this project, we created a web application that uses machine learning to recommend movies the user is likely to enjoy or, alternatively, movies the user is least likely to enjoy.

The team (by GitHub username)

Overview:

On the Home screen, select three of your favorite movies from the dropdown menu (selecting a movie a second time removes it). Once your movies are selected, click the green "Submit" button to process your selections.

By default, the application will recommend movies that you are likely to enjoy. Alternatively, by toggling the slider above the movie posters (on the Home screen), the application will recommend movies that you are likely to dislike.
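As a rough illustration of how a "like" versus "dislike" toggle can work, here is a minimal sketch that ranks candidate movies by cosine similarity to the average of the user's three picks and simply reverses the ordering when the toggle is on. This is an assumption for illustration only, not the model this project ships: the feature vectors, the placeholder titles, and the `recommend` function are all hypothetical.

```python
import math

# Hypothetical feature vectors (e.g. action, comedy, romance weights).
# Titles and numbers are placeholders, not data from this project.
FEATURES = {
    "Action A": [1.0, 0.1, 0.0],
    "Action B": [0.9, 0.2, 0.1],
    "Action C": [0.8, 0.0, 0.2],
    "Action D": [1.0, 0.0, 0.1],
    "Comedy A": [0.1, 1.0, 0.3],
    "Romance A": [0.0, 0.2, 1.0],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(features, favorites, top_n=3, dislike=False):
    """Rank movies not in `favorites` by similarity to the averaged favorites.

    With dislike=True the ordering is reversed, so the least similar
    movies come first (the "movies you'll hate" mode).
    """
    dims = len(features[favorites[0]])
    profile = [
        sum(features[title][i] for title in favorites) / len(favorites)
        for i in range(dims)
    ]
    candidates = [title for title in features if title not in favorites]
    candidates.sort(
        key=lambda title: cosine_similarity(profile, features[title]),
        reverse=not dislike,
    )
    return candidates[:top_n]


picks = ["Action A", "Action B", "Action C"]
print(recommend(FEATURES, picks, top_n=1))                # ['Action D']
print(recommend(FEATURES, picks, top_n=1, dislike=True))  # ['Romance A']
```

The only difference between the two modes in this sketch is the sort direction; the real application may compute its "dislike" recommendations differently.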

Upon receiving your recommendations, you can click a movie poster to view information about that movie, including its title, release date, a brief synopsis, and more. The same information is available by clicking the posters of the movies you selected on the Home screen.

When viewing a movie's information, you can click the "Add to Watchlist" button to add the movie to your watchlist. You must be logged in as user1, user2, user3, user4, user5, or user6 to add movies to your watchlist. There is no password for any of these accounts as they are for demonstration purposes only.

Note: The watchlist and user accounts were implemented to exercise and demonstrate authentication, user sessions, and database functionality. They serve no practical purpose beyond that.

Technologies used:

  • Web Scraping (Python, Splinter)
  • Data Wrangling (Pandas, SQL)
  • Machine Learning (sklearn, scipy, and joblib)
  • Storage (PostgreSQL, S3 Bucket)
  • Backend (Python, Flask)
  • Frontend (JavaScript, Bootstrap 4, HTML5/CSS3, jQuery, ajax)
  • Containerization (Docker)
  • Web Host (AWS)

How to run locally

Whether you are running the app directly on a Windows OS or indirectly via Docker, there are a few things you need to do in order to set up the application:

  • You need your own PostgreSQL database instance.

  • You need to create a file called config.py in application/modules/, modeled on the template in application/modules/config.py.example, replacing the empty strings with the connection details for your PostgreSQL database instance.

  • You need to execute the SQL code found in the resources/ folder to create the tables and insert the data needed to run the app. The order of execution does not matter.

  • Download the .joblib files from here (link available soon) and place them in the application/models/ folder (you must create this folder yourself).

  • If you are trying to run this application directly on a Windows OS, you will need to install Python 3.8.

  • Otherwise, you will need to install Docker so you can run the application through Docker.
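Before starting the app, it can help to confirm that the file-based prerequisites above are in place. The following is a minimal sketch, not part of this repository: the `missing_prerequisites` helper is hypothetical, and it only checks that config.py exists and that application/models/ contains at least one .joblib file. It does not validate the database connection or the SQL data.

```python
from pathlib import Path


def missing_prerequisites(project_root):
    """Return a list of human-readable problems; an empty list means the files look ready."""
    root = Path(project_root)
    problems = []

    # config.py must be created by hand from config.py.example.
    if not (root / "application" / "modules" / "config.py").is_file():
        problems.append("application/modules/config.py is missing")

    # The models folder is not in the repository and must be created manually.
    models_dir = root / "application" / "models"
    if not models_dir.is_dir():
        problems.append("application/models/ does not exist")
    elif not any(models_dir.glob("*.joblib")):
        problems.append("application/models/ contains no .joblib files")

    return problems
```

Calling `missing_prerequisites(".")` from the root of the project returns an empty list when both checks pass.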

Run on Windows

These steps assume you are using a modern Windows client OS such as Windows 11 or Windows 10 and that Python 3.8 is installed.

Open a terminal at the root of this project, then move into the application/ directory:

cd application/

Create a venv folder in the application folder using Python 3.8:

python3.8 -m venv venv

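Activate the venv (the command below assumes a Bash-style shell such as Git Bash; in Command Prompt run venv\Scripts\activate.bat instead, and in PowerShell run venv\Scripts\Activate.ps1):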

source venv/Scripts/activate

Install the Python packages into the venv:

pip install -r requirements.txt

Start the application:

python application.py

Run on Docker

First, confirm that Docker is installed and running. Next, confirm that no other application is using port 5000, which the Flask server needs. If you need to run Flask on a different port, modify the last line of the application/application.py file.
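One way to confirm that port 5000 is available is to try binding to it. The snippet below is a small sketch using only the standard library; the `port_is_free` helper is hypothetical and not part of this codebase.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use": something else holds the port.
            return False
    return True


if __name__ == "__main__":
    print("Port 5000 is free" if port_is_free(5000) else "Port 5000 is already in use")
```

The check is only a snapshot: another process could still claim the port between this check and the moment the server starts.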

Open a terminal at the root of this project, then move into the docker/ directory:

cd docker/

Build the Docker image and start the Docker container:

docker compose up --build

Visit http://localhost:5000 to use the application.
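The container may take a moment to start serving requests after the build finishes. As a convenience, a script along these lines can poll the URL until the server answers. This is a sketch under assumptions: the `wait_for_app` helper and its default timeouts are hypothetical, and any HTTP response (even an error status) is treated as "the server is up".

```python
import time
import urllib.error
import urllib.request


def wait_for_app(url="http://localhost:5000", timeout=60.0, interval=1.0, request_timeout=5.0):
    """Poll `url` until it answers or `timeout` seconds pass. Returns True if it answered."""
    # An empty ProxyHandler bypasses any configured HTTP proxy for this local check.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with opener.open(url, timeout=request_timeout):
                return True
        except urllib.error.HTTPError:
            # The server responded, just not with a success status.
            return True
        except OSError:
            # Connection refused, reset, or timed out: not ready yet (URLError is an OSError).
            time.sleep(interval)
    return False


if __name__ == "__main__":
    print("Application is up" if wait_for_app() else "Application did not respond in time")
```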

Known bugs

  • Most movies whose titles start with an article such as "A" or "The" erroneously have that word at the end of the title, preceded by a comma (e.g. Ref, The (1994); Toy, The (1982); or Walk in the Clouds, A (1995)). Unfortunately, the fix isn't as simple as reformatting the data in the database or web server. The issue stems from the source data, which would most likely have to be transformed before being added to the model.
  • Some (relatively few) movies, such as Jurassic Park (1993), Toy Story (1995), and Monsters, Inc. (2001), don't have posters.
  • Something went wrong when processing the original Toy Story! It is unfortunately not supported!
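For the title bug, the transformation that would need to be applied to the source data before model creation could look something like the sketch below. It is not part of this codebase: `fix_title` is a hypothetical helper, and it assumes titles follow the "Name, Article (Year)" pattern shown in the examples above, handling only the English articles "A", "An", and "The".

```python
import re

# Matches "<body>, <article>" optionally followed by " (YYYY)".
_TRAILING_ARTICLE = re.compile(r"(?P<body>.+), (?P<article>The|An|A)(?P<year> \(\d{4}\))?")


def fix_title(title):
    """Move a trailing article back to the front; return other titles unchanged."""
    match = _TRAILING_ARTICLE.fullmatch(title)
    if match is None:
        return title
    year = match.group("year") or ""
    return f"{match.group('article')} {match.group('body')}{year}"


print(fix_title("Ref, The (1994)"))               # The Ref (1994)
print(fix_title("Walk in the Clouds, A (1995)"))  # A Walk in the Clouds (1995)
print(fix_title("Monsters, Inc. (2001)"))         # Monsters, Inc. (2001)
```

Titles with a comma that is not followed by an article, such as Monsters, Inc. (2001), pass through untouched.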

Note to developers:

If you intend to run this codebase locally, here are a few things to note.

  • joblib models can only be used by the same versions of joblib and scikit-learn (sklearn) that created them.
  • Certain versions of joblib and scikit-learn are not compatible with newer versions of Python.
  • Certain versions of joblib and scikit-learn are not compatible with each other or with this codebase.
  • The requirements.txt file contains the last versions of joblib and scikit-learn that are compatible with each other and with this codebase.
  • As of this writing, Python 3.9 and above are not compatible with the versions listed in requirements.txt, and thus Python 3.8 is being used.
  • The scripts/create_ML_models.py script can be used to create new models. However, much of the data needed for model creation first has to be web scraped and created via files in the notebooks/ folder, which have been deprecated for years.
  • Expect to need between 2.5 GB and 3 GB of RAM without optimizations.
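Because the models depend on exact package versions, it may be worth comparing the active environment against the pins before loading anything. The helpers below are a hypothetical sketch, not part of this repository. They assume requirements.txt uses `name==version` lines, and they rely on `importlib.metadata`, which is in the standard library from Python 3.8 onward.

```python
import sys
from importlib import metadata


def parse_pins(requirements_text):
    """Extract {package: version} from 'name==version' lines, ignoring comments and markers."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins, installed_version=metadata.version):
    """Return {package: (pinned, installed_or_None)} for every package that differs."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            found = installed_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


def python_is_too_new(version_info=sys.version_info):
    """True for Python 3.9 and above, which the pinned versions do not support."""
    return tuple(version_info[:2]) >= (3, 9)
```

For example, `find_mismatches(parse_pins(open("requirements.txt").read()))`, run from the application/ directory, lists any package whose installed version differs from its pin.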

Screenshots:

Desktop

Home Screen

Home Screen (After Toggle)

Searching for Year One

After selecting Year One

Searching for Pacific Rim

After selecting Pacific Rim and 500 Days of Summer

After clicking submit button

Results (View 1)

Results (View 2)

Results (View 3)

After selecting a movie result (Example 1)

After selecting a movie result (Example 2)

After selecting a movie result (Example 3)

User log in (Demo version)

User profile (Demo version)

Movie selections (After Toggle)

Movie submit (After Toggle)

Results (View 1) (After Toggle)

Results (View 2) (After Toggle)

Results (View 3) (After Toggle)

After selecting a movie result (After Toggle)

After adding movie to watchlist

User Profile (After adding three movies to watchlist)

User Profile (After removing two movies from watchlist)