GigaHertzLegacy-SpiderX/Py_Web_Scrape

Web Scraper

Web scraping is the process of using bots to extract content and data from a website. Unlike screen scraping, which only copies pixels displayed onscreen, web scraping extracts underlying HTML code and, with it, data stored in a database. The scraper can then replicate entire website content elsewhere.

Features

  • Get the website's whole source code, pretty-printed
  • Look up 20 predefined tags
  • Look up custom tags of your choice
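
As a rough illustration of the features above, here is a minimal sketch using BeautifulSoup4 (listed under Plugins). It is not taken from main.py: the function names `pretty_source` and `lookup_tag` and the `SAMPLE_HTML` page are hypothetical, and the sketch parses an inline string so it runs without network access.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page, so the sketch runs offline.
SAMPLE_HTML = (
    "<html><head><title>Demo</title></head>"
    "<body><h1>Hello</h1><p>First</p><p>Second</p>"
    "<a href='https://example.com'>link</a></body></html>"
)


def pretty_source(html):
    """Return the whole HTML source re-indented for reading (hypothetical helper)."""
    # "html.parser" is Python's built-in parser, so no extra dependency is needed.
    return BeautifulSoup(html, "html.parser").prettify()


def lookup_tag(html, tag):
    """Return the text of every element with the given tag name (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    return [element.get_text(strip=True) for element in soup.find_all(tag)]


if __name__ == "__main__":
    print(pretty_source(SAMPLE_HTML))
    # Any tag name works here, whether predefined or custom.
    print(lookup_tag(SAMPLE_HTML, "p"))
```

For a live page, the HTML string could instead come from `requests.get(url, timeout=10).text`, with Requests being the other listed dependency.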

You can add more functions if you want; everything you need is available in the source.

Installation

Without using cmd/terminal - - -

1. Download the ZIP file from the Code button
2. Extract it
3. Go to the extracted folder
4. Open cmd inside that folder
5. Run pip install -r requirements.txt
6. Run python main.py

Using cmd/terminal - - -

 git clone https://github.com/GigaHertzLegacy-SpiderX/Py_Web_Scrape.git
 cd Py_Web_Scrape
 pip install -r requirements.txt
 python main.py

Plugins

Plugin Modules
BeautifulSoup4 Version == 4.11.1
Requests Version == 2.26.0

Development

Want to contribute? Great!

Simply fork the repo and start updating. It's open source ;)

© Gigahertz Legacy -SpiderX