Replies: 2 comments
-
We likely will not be adding extra custom properties to the
-
Generic scraping patterns are hard to get right for a vast array of different pages. Crawlee serves as a building block where you can add custom parsers on top.
-
Which package is the feature request for? If unsure which one to select, leave blank
No response
Feature
Parsing the DOM is a tedious task for crawlers, so I would like to add a rules API to the page object.
Example:
const data: any = await page.content().rules({
  title: ['h1', 'text'],
  src: ['img', 'src', map], // map is a callback applied to this field's data
});
…and other similar APIs.
In short, I hope to parse the DOM into an object or an array through a fixed set of rules.
Thanks for all the development work!
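The proposed rules API could be prototyped outside of Crawlee itself. The sketch below is hypothetical: the `applyRules` function, the `PageNode` type, and the toy tag-only selector matching are all invented for illustration and are not part of Crawlee's API; a real implementation would run CSS selectors against an actual DOM.

```typescript
// A node in a simplified, in-memory stand-in for a parsed DOM.
type PageNode = { tag: string; text?: string; attrs?: Record<string, string> };

// A rule: [selector, field to read ("text" or an attribute name), optional mapper].
type Rule = [string, string, ((value: string) => unknown)?];

function applyRules(
  nodes: PageNode[],
  rules: Record<string, Rule>,
): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const [key, [selector, field, map]] of Object.entries(rules)) {
    // Toy selector matching: match on tag name only (a real version
    // would evaluate a CSS selector against the document).
    const node = nodes.find((n) => n.tag === selector);
    if (!node) continue;
    const raw = field === "text" ? node.text ?? "" : node.attrs?.[field] ?? "";
    result[key] = map ? map(raw) : raw;
  }
  return result;
}

const pageNodes: PageNode[] = [
  { tag: "h1", text: "Hello crawlers" },
  { tag: "img", attrs: { src: "/logo.png" } },
];

const data = applyRules(pageNodes, {
  title: ["h1", "text"],
  // The mapper resolves a relative src against a base URL.
  src: ["img", "src", (src) => new URL(src, "https://example.com").href],
});
// data.title === "Hello crawlers"
// data.src === "https://example.com/logo.png"
```

The key design point is that each rule is declarative data rather than imperative DOM traversal, so the same rules map can be reused across pages with the same structure.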
Motivation
Same as the feature description above.
Ideal solution or implementation, and any additional constraints
Same as the feature description above.
Alternative solutions or implementations
No response
Other context
No response