Web Scraping with Python

While working through the book Web Scraping with Python (《用Python写网络爬虫》), I fix the bugs in its examples that were caused by changes to the example website.
This repository contains the fixed source code of the examples from the book.

Author: Siyao Chen

E-mail: siyao.chen92@gmail.com

Bug_Fixing

The first bug comes from an update to the example website. The URL arguments passed to link_crawler should now be as follows:

link_crawler('http://example.webscraping.com/places', '/places/default/(index|view)', delay=0, num_retries=1, user_agent='BadCrawler')
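To see which links the regular expression in that call would accept, here is a minimal sketch. It assumes the crawler tests each link with re.match, which anchors the pattern at the start of the link. The helper filter_links and the sample links are hypothetical and exist only for illustration; they are not taken from the repository.

```python
import re

# Link pattern taken from the corrected link_crawler call.
LINK_REGEX = '/places/default/(index|view)'


def filter_links(links, link_regex=LINK_REGEX):
    """Hypothetical helper: keep only the links the pattern accepts.

    Assumes the crawler filters with re.match, so the pattern must
    match from the first character of the link.
    """
    return [link for link in links if re.match(link_regex, link)]


# Illustrative links only; they were not fetched from the live site.
sample_links = [
    '/places/default/index/1',
    '/places/default/view/Afghanistan-1',
    '/places/default/user/login',
    '/index/1',
]

print(filter_links(sample_links))
```

Only the first two sample links pass the filter. A pattern without the /places/default prefix would match neither of them, which is consistent with the old arguments no longer working after the site update.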

Original_Readme

This repository contains source code of examples from the book Web Scraping with Python, published by Packt Publishing. Examples have been tested with Python 2.7 and depend on:

BeautifulSoup (Ch 2)
lxml (Ch 2-9)
pymongo (Ch 3-5, 9)
PyQt / PySide (Ch 5)
ghost (Ch 5)
Selenium WebDriver (Ch 5, 9)
mechanize (Ch 6)
PIL / Pillow (Ch 7)
pytesseract (Ch 7)
scrapy (Ch 8)
portia (Ch 8)
scrapely (Ch 8)

These examples will break in the future as websites change and dependencies are updated, so bug reports and patches are welcome.

Repository: Clark934/wswp