Option to exclude certain resource types #1814

Answered by LeMoussel
mbledkowski asked this question in Q&A
Or see "Skipping navigations for certain requests".
Alternatively, you can do it like this:

import {
  PlaywrightCrawler,  // https://crawlee.dev/docs/examples/playwright-crawler
  log
} from 'crawlee';


// https://playwright.dev/docs/api/class-request#request-resource-type
const RESOURCE_EXCLUSTIONS = ['image', 'stylesheet', 'media', 'font', 'other'];
/* More resource types.
For a new site, I first try to block everything in BLOCKED_IMG_CSS_JS.
If the site does not work (pages are not rendered properly), I try BLOCKED_IMG_CSS.
If it still does not work, I try BLOCKED_IMG, and then no blocking at all.

const BLOCKED_IMG = ['image', 'imageset', 'object', 'object_subrequest', 'ping', 'web_manifest', 'xslt',  'media', 'font', 
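
The snippet above breaks off before showing how the list is applied. As a minimal sketch, one way to use a list like `RESOURCE_EXCLUSTIONS` is to abort matching requests in a Playwright route handler. The names `BLOCKED_TYPES`, `shouldBlock`, `handleRoute`, `RouteLike` and `RequestLike` are hypothetical and not from the original answer. The two interfaces are stand-ins for the parts of Playwright's `Route` and `Request` that the sketch uses, so it runs without importing anything.

```typescript
// Structural stand-ins for the parts of Playwright's Request and Route used here,
// so the sketch type-checks and runs without importing playwright or crawlee.
interface RequestLike {
  resourceType(): string;
}

interface RouteLike {
  request(): RequestLike;
  abort(): Promise<void>;
  continue(): Promise<void>;
}

// Hypothetical name; the values mirror RESOURCE_EXCLUSTIONS from the answer.
const BLOCKED_TYPES: readonly string[] = ['image', 'stylesheet', 'media', 'font', 'other'];

// Returns true when a request of this resource type should be dropped.
function shouldBlock(resourceType: string, blocked: readonly string[] = BLOCKED_TYPES): boolean {
  return blocked.includes(resourceType);
}

// Aborts blocked requests and lets every other request through.
function handleRoute(route: RouteLike, blocked: readonly string[] = BLOCKED_TYPES): Promise<void> {
  return shouldBlock(route.request().resourceType(), blocked)
    ? route.abort()
    : route.continue();
}

// Assumed wiring into Crawlee (not shown in the original answer), left as a comment
// because it needs the crawlee and playwright packages installed:
//
// const crawler = new PlaywrightCrawler({
//   preNavigationHooks: [
//     async ({ page }) => {
//       await page.route('**/*', (route) => handleRoute(route));
//     },
//   ],
//   requestHandler: async ({ request }) => log.info(`Loaded ${request.url}`),
// });
```

Keeping the decision in a small pure function makes it easy to swap in a less aggressive list when a site stops rendering properly, as the comment in the answer suggests.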

Answer selected by mbledkowski
Category
Q&A
Labels
feature Issues that represent new features or improvements to existing features.
3 participants
This discussion was converted from issue #1812 on March 06, 2023 08:13.