This repository has been archived by the owner on Jun 1, 2022. It is now read-only.

thomaspaulin/snc-scraper


snc-scraper

Scraping the Auckland SNC Hockey website one symbol at a time. (http://www.aucklandsnchockey.com)

Foreword

At the time of writing, all the scraping is done with Beautiful Soup 4. There are plans to move it to Scrapy later down the line.

Setup

This repository uses Scrapy and Python 3. To get set up, do the following:

  1. Install Python 3
  2. Set up a virtual environment with virtualenv venv, or virtualenv -p python3 venv to select the Python 3 interpreter explicitly
  3. Activate your virtual environment with source venv/bin/activate
  4. cd into this repo
  5. Run pip install -r requirements.txt
  6. Write awesome code

Running

python src/main.py (builds not yet implemented)

Scraping in the REPL

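import requests
from bs4 import BeautifulSoup

# Replace the placeholder with the address of the page to scrape.
url = "[url goes here]"
r = requests.get(url)

# Parse the response body with the lxml parser.
soup = BeautifulSoup(r.text, 'lxml')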
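
Once a soup object exists, pulling data out of it might look like the sketch below. This is only an illustration: the sample markup, the table layout, and the parse_rows helper are hypothetical and are not taken from this repository or from the real site. It uses Python's built-in html.parser rather than lxml so that it runs without the extra dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration; the real site's structure may differ.
SAMPLE_HTML = """
<table id="fixtures">
  <tr><th>Home</th><th>Away</th><th>Score</th></tr>
  <tr><td>Team A</td><td>Team B</td><td>3-2</td></tr>
  <tr><td>Team C</td><td>Team D</td><td>1-4</td></tr>
</table>
"""


def parse_rows(html):
    """Return each data row of the first table as a list of cell strings."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    if table is None:
        return []
    rows = []
    for tr in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # header rows contain only <th> cells, so they are skipped
            rows.append(cells)
    return rows


rows = parse_rows(SAMPLE_HTML)
print(rows)
```

In the REPL, the same find and find_all calls can be tried directly on the soup object to explore a page before committing anything to code.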
