Web scraping, also known as web data extraction or web harvesting, is the process of extracting data from websites. It involves the use of automated software tools to retrieve and analyze information from web pages, including HTML pages and other web resources such as images and documents. The extracted data can be used for a variety of purposes.

Cargand0/basic-Webscrapping

Basic Web Scraping Project

This repository contains a basic implementation of web scraping using Python, with the goal of extracting data from websites and storing it in a structured format. The project is designed to be accessible to beginners, while also providing useful insights and code snippets for more experienced developers.

Features

  • No csv library: Although importing Python's csv library would be easier, this project deliberately writes its output without it.

  • Easy-to-follow code: The code is written in Python and is designed to be easy to read and understand, even for beginners who are new to the language.

  • Well-organized structure: The project is structured in a way that is easy to navigate and follow, with separate directories for input files, output files, and Python scripts.

  • Customizable scraping settings: The project includes a settings file that allows users to customize various parameters, such as the website URL, the HTML tags to scrape, and the output format.

  • Useful data output: The data extracted from the website is stored in a structured format, such as a CSV file, making it easy to analyze and manipulate.
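
As an illustration of the "No csv library" feature above, here is a minimal sketch of how CSV text could be produced with plain string handling. It is not the repository's actual code: the helper names `csv_field`, `to_csv`, and `write_csv` and the sample rows are hypothetical, and the quoting follows the common convention of wrapping fields that contain commas, quotes, or line breaks in double quotes.

```python
def csv_field(value):
    """Format one value as a CSV field, quoting it only when needed."""
    text = str(value)
    # A field containing a comma, a double quote, or a line break must be
    # wrapped in double quotes, with any embedded quote doubled.
    if any(ch in text for ch in ',"\r\n'):
        return '"' + text.replace('"', '""') + '"'
    return text


def to_csv(rows):
    """Turn an iterable of rows into CSV text, one line per row."""
    return "".join(",".join(csv_field(v) for v in row) + "\n" for row in rows)


def write_csv(path, rows):
    """Write the rows to a file without using the csv module."""
    # newline="" keeps Python from translating the line endings we emit.
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(to_csv(rows))


# Hypothetical sample rows; real column names depend on what is scraped.
rows = [
    ["name", "price"],
    ['Baju "Classic", Navy', "RM 159.00"],
]
csv_text = to_csv(rows)
print(csv_text, end="")
```

Keeping the quoting logic in one small function makes it easy to check the edge cases (commas and quotes inside product names) that the csv library would otherwise handle.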

Website Used

The website used for this project is the Elrah Exclusive e-commerce website: https://elrahexclusive.my/product-category/baju-melayu/magnificent-4-0/?products-per-page=all
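
To give a rough idea of the extraction step, the sketch below pulls product titles out of an HTML fragment using only Python's standard `html.parser` module. It is an assumption-laden example, not the repository's method: the `ProductTitleParser` class, the `product-title` class name, and the sample markup are all made up for illustration, and the real page's markup may differ. In practice the HTML would first be downloaded from the URL above, for example with `urllib.request.urlopen`.

```python
from html.parser import HTMLParser


class ProductTitleParser(HTMLParser):
    """Collect the text of every <h2 class="product-title"> element.

    The tag and class name are hypothetical; adjust them to the real page.
    """

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value can be None.
        classes = (dict(attrs).get("class") or "").split()
        if tag == "h2" and "product-title" in classes:
            self._in_title = True
            self._buffer = []

    def handle_data(self, data):
        # Only keep text that appears inside a matching <h2>.
        if self._in_title:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "h2" and self._in_title:
            self.titles.append("".join(self._buffer).strip())
            self._in_title = False


# Made-up markup standing in for a downloaded product listing page.
SAMPLE_HTML = """
<ul class="products">
  <li><h2 class="product-title">Sample Item A</h2><span class="price">RM 10.00</span></li>
  <li><h2 class="product-title">Sample Item B</h2><span class="price">RM 20.00</span></li>
</ul>
"""

parser = ProductTitleParser()
parser.feed(SAMPLE_HTML)
parser.close()
titles = parser.titles
print(titles)
```

The collected titles can then be combined with other fields, such as prices, into rows for the CSV output.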
