GoTrained/Scrapy-Nordstrom

Scrapy-Nordstrom

Web Scraping products from Nordstrom search results using Scrapy

Problem

When you search Nordstrom for a word, it gives you a URL like this: http://shop.nordstrom.com/sr?origin=keywordsearch&keyword=suitcase

When you try to scrape it, you get only the first 12 items.

Some resort to Selenium for this kind of problem. This repository offers a solution that uses Scrapy instead.

Solution

If you click through to the next page, you will see how the URL is formed; notice the ending "top=72": http://shop.nordstrom.com/sr?origin=keywordsearch&keyword=suitcase&page=1&top=72

The trick is to request only 12 items per page by setting the "top" parameter to 12 instead of 72. This means more pages to crawl: for example, 63 pages instead of 11.

So you should use this URL instead: http://shop.nordstrom.com/sr?origin=keywordsearch&keyword=suitcase&page=1&top=12
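The URL pattern and the page arithmetic above can be sketched in plain Python. This is only an illustration: the helper names `build_search_url` and `pages_needed` are hypothetical and not part of this repository, and the total of 756 results is an assumed figure chosen to match the 11 vs. 63 pages mentioned above.

```python
import math
from urllib.parse import urlencode

BASE_URL = "http://shop.nordstrom.com/sr"


def build_search_url(keyword, page, top=12):
    """Build a search URL that asks for `top` items on the given page."""
    params = {
        "origin": "keywordsearch",
        "keyword": keyword,
        "page": page,
        "top": top,  # 12 instead of the site's default 72
    }
    return f"{BASE_URL}?{urlencode(params)}"


def pages_needed(total_items, per_page):
    """Number of pages required to cover all items (rounded up)."""
    return math.ceil(total_items / per_page)


print(build_search_url("suitcase", 1))
# Assumed total of 756 results, for illustration only
print(pages_needed(756, 72), pages_needed(756, 12))
```

Smaller pages mean more requests, but each response then contains every item it lists.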

There are two Scrapy spiders, one of which uses LinkExtractor.

Usage

  • Change url to reflect the keyword you are searching for
  • Change range to reflect the number of pages + 1 (Python's range excludes its end value)
  • In your Terminal, navigate to the nordstrom folder
  • To run the "nord" spider, use: scrapy crawl nord -o nord.csv
  • To run the "nordrules" spider, use: scrapy crawl nordrules -o nord-rules.csv

Scrapy Online Course

Check this Scrapy tutorial to learn much more:
