Follow all URLs on page with dynamic pagination #130

Open: wants to merge 1 commit into base: master

albert-nagy

Hi!
Thank you for the great work!
My small modification makes dynamic pagination follow multiple links on a page, not only the first one. If several links match the given XPath expression, all of them are followed.
This also makes it possible to scrape sites with the following structure:

page with "category" links > several main pages with article list (> detail pages)
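
To illustrate the behaviour described above, here is a minimal, standalone sketch. It is not the code from this commit and not the project's API. The function name `follow_urls`, the `first_only` flag, the sample page and the `example.com` base URL are all hypothetical, and the standard library's limited XPath support stands in for whatever selector engine the scraper actually uses.

```python
from urllib.parse import urljoin
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample of a page with "category" links.
SAMPLE_PAGE = """
<html><body>
  <a class="category" href="/news/">News</a>
  <a class="category" href="/sport/">Sport</a>
  <a class="category" href="/culture/">Culture</a>
  <a class="other" href="/about/">About</a>
</body></html>
"""


def follow_urls(page_source, base_url, xpath, first_only=False):
    """Return absolute URLs of the links matching `xpath`.

    first_only=True mimics following only the first match;
    first_only=False follows every match, as proposed in this PR.
    """
    root = ET.fromstring(page_source)
    # Collect the href of every matching element, skipping links without one.
    hrefs = [a.get("href") for a in root.findall(xpath) if a.get("href")]
    if first_only:
        hrefs = hrefs[:1]
    # Resolve relative links against the URL of the page they came from.
    return [urljoin(base_url, href) for href in hrefs]


if __name__ == "__main__":
    xpath = ".//a[@class='category']"
    for url in follow_urls(SAMPLE_PAGE, "http://example.com/start", xpath):
        print(url)
```

With `first_only=True` only the first category page would be queued. With the default, each of the three category pages becomes a main page whose article list (and detail pages) can then be scraped.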
