jaygshah/CSE-573-Final-Project-Document-Clustering-and-Visualization

 
 

Document Clustering and Visualization

GitHub repository for the CSE 573 final project.

Contributors and Team Members: Kunal Suthar, Jay Shah, Leroy Vargis, Abhay Mathur, Vatsal Sodha.

Selenium setup instructions:

  1. Install the Selenium Python package:
    pip install selenium
  2. Install a Selenium browser driver. This project uses the Firefox driver (geckodriver); follow the driver's install instructions.
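
To check that the Selenium package and the Firefox driver work together before scraping, a small smoke test can help. The sketch below is not part of this repository: the function name `firefox_smoke_test` and the default URL are hypothetical, and it assumes Firefox and geckodriver are both installed and discoverable.

```python
def firefox_smoke_test(url="https://example.com"):
    """Open `url` in headless Firefox and return the page title.

    Hypothetical helper for checking the setup; not part of this project.
    """
    # Imported here so a missing selenium package only fails when the
    # function is called.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    options.add_argument("-headless")  # run Firefox without opening a window

    driver = webdriver.Firefox(options=options)
    try:
        driver.get(url)
        return driver.title
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()
```

Calling `print(firefox_smoke_test())` should print a page title if the setup is correct. An exception raised while creating the driver usually points to a missing or undiscoverable Firefox or geckodriver install.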

Dependencies setup instructions:

  1. Install all dependencies:
    bash demo.sh install-dep

Project run instructions:

To run the project, run demo.sh with one of the following arguments:

  1. Scrape data and save it to a pickle file:
    bash demo.sh scrape
  2. Create the LDA model:
    bash demo.sh create-lda
  3. Apply t-SNE and generate the dependencies:
    bash demo.sh apply-tsne
  4. Apply PCA and generate the dependencies:
    bash demo.sh apply-pca
  5. Generate the 3D visualization:
    bash demo.sh visualize-3d
  6. Run the above steps sequentially from scratch, using t-SNE:
    bash demo.sh run-project-tsne
  7. Run the above steps sequentially from scratch, using PCA:
    bash demo.sh run-project-pca
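
The individual steps can also be chained from Python. The sketch below is not part of this repository. It assumes the order scrape, create-lda, one reduction step, then visualize-3d, which is inferred from the list above and not confirmed against demo.sh. The names `pipeline_commands` and `run_pipeline` are hypothetical.

```python
import subprocess

# Maps a reduction method to its demo.sh argument (taken from the list above).
REDUCERS = {"tsne": "apply-tsne", "pca": "apply-pca"}


def pipeline_commands(method):
    """Return the demo.sh commands for one full run, in the assumed order."""
    if method not in REDUCERS:
        raise ValueError(f"unknown method {method!r}; expected 'tsne' or 'pca'")
    # Assumed order: scrape, build the LDA model, reduce, then visualize.
    steps = ["scrape", "create-lda", REDUCERS[method], "visualize-3d"]
    return [["bash", "demo.sh", step] for step in steps]


def run_pipeline(method, dry_run=True):
    """Run each step in order; with dry_run=True, only return the commands."""
    commands = pipeline_commands(method)
    if not dry_run:
        for command in commands:
            # check=True stops the run at the first failing step.
            subprocess.run(command, check=True)
    return commands
```

With the default `dry_run=True`, `run_pipeline("tsne")` only returns the command list, so the sequence can be inspected before anything is executed. Pass `dry_run=False` from the repository root to run the steps.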
