In this repository, we use Python's Beautiful Soup library to scrape a website. Web scraping is the technique of loading the HTML file sent by a server into Python and extracting data from it, instead of handing it to a browser for display.

diwamishra21/web-Scraping-beautifulSoup

web-Scraping-beautifulSoup

What is web scraping?

  • Web scraping is the technique of loading the HTML file sent by the server into Python and extracting data from it, instead of handing it to the browser to display.
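
As a minimal sketch of that idea, the snippet below parses an HTML string in Python and pulls out the page title and link targets. The sample HTML and the function name extract_title_and_links are hypothetical, made up for illustration; the string stands in for what a server would send, and Python's built-in "html.parser" is used here only to keep the sketch dependency-light.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a server response (not from a real site).
SAMPLE_HTML = """
<html>
  <head><title>Sample page</title></head>
  <body>
    <h1>Hello</h1>
    <a href="/first">First link</a>
    <a href="/second">Second link</a>
  </body>
</html>
"""


def extract_title_and_links(html):
    """Return the page title and the href of every link in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    # soup.title is None when the document has no <title> tag.
    title = soup.title.string if soup.title else None
    # href=True skips <a> tags that have no href attribute.
    links = [a["href"] for a in soup.find_all("a", href=True)]
    return title, links


title, links = extract_title_and_links(SAMPLE_HTML)
print(title)
print(links)
```

Instead of rendering the markup, the script treats it as data: the same string a browser would display becomes a tree that can be searched by tag name.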

Two ways of getting data from a website:

  • Using an API
  • Scraping the HTML with a tool such as bs4 (Beautiful Soup)
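
To illustrate the difference between the two approaches, here is a small sketch that extracts the same list of titles from a JSON body (as an API might return) and from an HTML fragment (as a scraper would see). Both payloads, the "articles" key, the "article" class, and the function names are assumptions invented for this example, not taken from any real service.

```python
import json

from bs4 import BeautifulSoup

# Hypothetical API response: structured data, ready to load.
API_RESPONSE = '{"articles": [{"title": "First post"}, {"title": "Second post"}]}'

# Hypothetical HTML response: the same data embedded in markup.
HTML_RESPONSE = """
<ul>
  <li class="article">First post</li>
  <li class="article">Second post</li>
</ul>
"""


def titles_from_api(body):
    """Read titles straight out of a JSON document."""
    return [item["title"] for item in json.loads(body)["articles"]]


def titles_from_html(body):
    """Locate the matching tags and read their text."""
    soup = BeautifulSoup(body, "html.parser")
    return [li.get_text(strip=True) for li in soup.find_all("li", class_="article")]


print(titles_from_api(API_RESPONSE))
print(titles_from_html(HTML_RESPONSE))
```

With an API, the structure is given; with scraping, the code has to know which tags and classes hold the data, so it may need updating when the page layout changes.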

Installing modules:

Why use modules?

  • To scrape websites with Python, we don’t have to write everything from scratch. We can reuse existing, well-tested code written by experts and get the same result in a few lines of code and far less time.

How to install modules?

Modules are easy to install. Open a command prompt and run these three commands one at a time:

  • pip install requests
  • pip install html5lib
  • pip install bs4
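
Once the three packages are installed, a quick way to check that they work together is a sketch like the following. The helper names make_soup and fetch_soup are hypothetical, and the 10-second timeout is an arbitrary choice; the repository's own scripts may be organized differently.

```python
import requests
from bs4 import BeautifulSoup


def make_soup(html):
    """Parse an HTML string with the html5lib parser installed above."""
    return BeautifulSoup(html, "html5lib")


def fetch_soup(url):
    """Download a page with requests and return it as a parsed tree."""
    response = requests.get(url, timeout=10)
    # Raise an exception for 4xx/5xx responses instead of parsing an error page.
    response.raise_for_status()
    return make_soup(response.text)


# Offline check: html5lib fills in the missing <html> and <body> tags itself.
soup = make_soup("<p>hi</p>")
print(soup.p.get_text())

# Online usage (needs network access), left commented out:
# soup = fetch_soup("https://example.com")
# print(soup.title.string)
```

If any of the imports fail, the corresponding pip command did not complete successfully.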

This repository contains three files:

  • scrape.py: scrapes a simple website
  • blog_scrape.py: scrapes data from a blog
  • wikipedia_scrape.py: scrapes data from Wikipedia
