lucasxlu/DataHouse

DataHouse

Data Mining Workspace

Introduction

This repository is a workspace for data scraping, data mining, and data visualization, applied to house prices, job interviews, and SNS (social network) data. Data is collected with scrapy and requests, pre-processed and modeled with pandas and scikit-learn, and stored in MongoDB.

The repository contains several modules, each belonging to a different application.

Prerequisites

Python version >= 3.5
requests
scrapy
pandas
scikit-learn
pymongo
TensorFlow >= 1.6
PyTorch >= 0.3.1
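
To confirm that an environment meets the list above, a small check script can help. The following is a minimal sketch, not part of the repository: the file name `check_env.py`, the helper names `missing_modules` and `python_ok`, and the list `REQUIRED_MODULES` are all hypothetical. It only tests whether each package can be found by the interpreter and does not verify the TensorFlow or PyTorch version numbers.

```python
# check_env.py (hypothetical helper, not shipped with this repository)
import importlib.util
import sys

# Import names for the packages listed above. Note that scikit-learn is
# imported as "sklearn" and PyTorch as "torch".
REQUIRED_MODULES = [
    "requests", "scrapy", "pandas", "sklearn",
    "pymongo", "tensorflow", "torch",
]
MIN_PYTHON = (3, 5)


def missing_modules(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not python_ok():
        print("Python {}.{} or newer is required".format(*MIN_PYTHON))
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All listed packages were found")
```

Running `python3 check_env.py` prints any package from the list that is not installed. The script avoids f-strings so that it still runs on Python 3.5.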

  • Installation

    sudo pip3 install -r requirements.txt

  • Start MongoDB Service

    sudo service mongod start
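
After starting the service, it can be useful to confirm that MongoDB is accepting connections before running any module. The sketch below is an illustration only and uses just the standard library. It assumes MongoDB listens on `localhost` at its default port 27017, which may not match your configuration, and the function name `mongod_is_listening` is hypothetical. A successful TCP connection shows that something is listening on the port, not that it is a healthy MongoDB instance.

```python
import socket


def mongod_is_listening(host="localhost", port=27017, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    Assumption: 27017 is MongoDB's default port; change it if your
    mongod is configured differently.
    """
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False
    conn.close()
    return True


if __name__ == "__main__":
    if mongod_is_listening():
        print("A service is listening on localhost:27017")
    else:
        print("Nothing reachable on localhost:27017; is mongod running?")
```

For a stronger check, pymongo's `MongoClient` can issue a `ping` command against the server, at the cost of requiring the driver to be installed first.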

Report

Note

  • This repository may be used only for research and non-commercial applications.