
Question about follow Requests #129

Open

KauSaal opened this issue Jun 28, 2023 · 0 comments

KauSaal commented Jun 28, 2023

Hello,
I have a short question about follow-up Requests in scrapy-selenium. In my code, I loop through all options in nested select elements. When an option in the last select element is chosen, the page reloads, and I want to build a Scrapy Request out of a subpage/div ('canvas_post') to parse it:

import bs4
from scrapy import Request, Spider
from scrapy_selenium import SeleniumRequest
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import Select
from time import sleep

from firmware.items import FirmwareImage
from firmware.loader import FirmwareLoader

class EdimaxDESpider(Spider):
    name = "edimax_de"
    vendor = "Edimax"

    start_urls = ["https://www.edimax.com/edimax/download/download/data/edimax/de/download/"]

    def start_requests(self):
        url = "https://www.edimax.com/edimax/download/download/data/edimax/de/download/"
        yield SeleniumRequest(url=url, callback=self.parse, wait_time=10)

    def parse(self, response):
        driver = response.request.meta['driver']
        # Find the select element by its class name
        solution_box = driver.find_element(By.CLASS_NAME, 'step1_select_cb')
        solution_select = Select(solution_box)
        # Get all option elements within the select element
        option_elements = solution_box.find_elements(By.TAG_NAME, 'option')
        # Extract the value attribute from each option element
        options = [option_element.get_attribute('value') for option_element in option_elements]
        for option in options:
            if option != '':
                solution_select.select_by_value(option)
                sleep(1)
                # find the category box and select an option
                category_element = driver.find_element(By.CLASS_NAME, 'step2_select_cb')
                category_box = Select(category_element)
                # Get all option elements within the category element
                option_elements = category_element.find_elements(By.TAG_NAME, 'option')
                # Extract the value attribute from each option element
                options = [option_element.get_attribute('value') for option_element in option_elements]
                # loop through option
                for option in options:
                    if option != "":
                        category_box.select_by_value(option)
                        sleep(1)
                        # find the modelNo box and select an option
                        modelNo_element = driver.find_element(By.CLASS_NAME, 'step3_select_cb')
                        modelNo_box = Select(modelNo_element)
                        # Get all option elements within the modelNo element
                        option_elements = modelNo_element.find_elements(By.TAG_NAME, 'option')
                        # Extract the value attribute from each option element
                        options = [option_element.get_attribute('value') for option_element in option_elements]
                        # loop through options
                        for option in options:
                            if option != '':
                                modelNo_box.select_by_value(option)
                                sleep(5)
                                html_from_page = driver.page_source
                                soup = bs4.BeautifulSoup(html_from_page, 'html.parser')
                                yield Request(soup, callback=self.parse_product)

    def parse_product(self, response):
        print("IM HERE")
        canvas = response.css("#side2 > div.canvas_post").getall()
        print("ELEMENT CANVAS POST")

The print statements in parse_product are never executed, and I also don't get Scrapy request log messages like I do when using Scrapy without Selenium.
Hope someone can give me a hint, and thanks in advance
KauSaal
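
For context: scrapy.Request expects a URL string as its first argument, so passing a BeautifulSoup object is rejected with a TypeError before any request is scheduled, which matches the symptom that parse_product never runs and no request log lines appear. Below is a minimal sketch of one possible workaround, assuming the goal is to parse the HTML Selenium has already rendered rather than fetch a new URL; wrapping driver.page_source in an HtmlResponse and delegating with yield from is an illustrative choice, not the project's confirmed approach:

from scrapy.http import HtmlResponse

# Inside the innermost loop, instead of `yield Request(soup, ...)`:
rendered = HtmlResponse(
    url=driver.current_url,   # URL of the page Selenium is currently on
    body=driver.page_source,  # HTML after the select options were applied
    encoding='utf-8',
)
# No new request is scheduled; the already-rendered HTML is parsed
# directly, and any items or requests from parse_product are re-yielded.
yield from self.parse_product(rendered)

With this approach the CSS selector in parse_product runs against the page state Selenium produced. If the 'canvas_post' div really lives at a separate URL, the alternative would be to extract that URL from the rendered page and yield a SeleniumRequest to it with callback=self.parse_product.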
