A Scrapy spider that crawls timesjobs.com for job listings and stores them in a database.
Run the following command to install the required packages: `pip install -r requirements.txt`
The spider scrapes the following fields:
| Field | Datatype | Description |
|---|---|---|
| jobType | string | Type of job |
| moreDetails | string | href linking to more details about the job listing |
| companyName | string | Name of the company |
| reqExp | string | Required experience |
| location | string | Location of the office |
| compensation | string | Compensation for the job |
| jobDescription | string | Description of the job |
| skillSet | string | Skill set required for the job |
| postedTime | string | When the job was listed |
| isWFHAvailable | string | Whether a work-from-home option is available |
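These fields map naturally onto one database column each. The sketch below shows a hypothetical table layout mirroring the field names above; the actual schema inside the project's database may differ (all columns are assumed to be TEXT, matching the string datatypes in the table):

```python
import sqlite3

# Hypothetical schema mirroring the scraped fields; the project's real
# schema may differ. All columns are TEXT per the datatypes above.
FIELDS = [
    "jobType", "moreDetails", "companyName", "reqExp", "location",
    "compensation", "jobDescription", "skillSet", "postedTime",
    "isWFHAvailable",
]

conn = sqlite3.connect(":memory:")  # in-memory db for illustration
columns = ", ".join(f"{name} TEXT" for name in FIELDS)
conn.execute(f"CREATE TABLE job_listing_tb ({columns})")

# Confirm the table has one column per scraped field.
info = conn.execute("PRAGMA table_info(job_listing_tb)").fetchall()
print(len(info))  # -> 10
```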
Example command: `scrapy crawl timesjob -a keywords="Data science" -a location="Mumbai" -a workexp="1" -a maxpages="100"`
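Scrapy forwards each `-a name=value` pair to the spider's constructor as a string keyword argument. The sketch below illustrates how the arguments from the command above might be received; the class name, defaults, and int conversion are assumptions for illustration, not the project's exact code:

```python
# Sketch of how -a arguments reach a spider: Scrapy passes each
# -a name=value pair as a keyword argument to __init__. The defaults
# and int() conversions here are illustrative assumptions.
class TimesJobSpiderArgs:
    def __init__(self, keywords="", location="", workexp="0", maxpages="1"):
        self.keywords = keywords
        self.location = location
        self.workexp = int(workexp)    # -a values always arrive as strings
        self.maxpages = int(maxpages)

args = TimesJobSpiderArgs(keywords="Data science", location="Mumbai",
                          workexp="1", maxpages="100")
print(args.maxpages)  # -> 100
```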
The scraped details are stored in an SQLite database named `JobListing.db`. You can use sqliteonline for quick viewing of the database.
To view all entries: `SELECT * FROM job_listing_tb;`
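The same query can be run from Python with the standard-library `sqlite3` module. The snippet below inserts a sample row so it is self-contained; the column subset and sample values are illustrative assumptions (real data comes from running the spider against `JobListing.db`):

```python
import sqlite3

# Query listings from the database. An in-memory db with a sample row is
# used here so the snippet runs standalone; swap ":memory:" for
# "JobListing.db" to read the real file produced by the spider.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_listing_tb (companyName TEXT, location TEXT)")
conn.execute("INSERT INTO job_listing_tb VALUES ('Acme Corp', 'Mumbai')")

rows = conn.execute("SELECT * FROM job_listing_tb").fetchall()
print(rows)  # -> [('Acme Corp', 'Mumbai')]
```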