Which package is this bug report for? If unsure which one to select, leave blank

@crawlee/core

Issue description

I dynamically generated a batch of URLs for the Puppeteer crawler, but the crawler ended the run immediately, reporting that only 1 request had been processed. This was very unusual, and it took me a while to figure out that the default uniqueKey generator had produced the same key for the entire batch of URLs.
I found where the key is generated, but I haven't had a chance to debug where the problem comes from:

```js
this.uniqueKey = uniqueKey || this._computeUniqueKey({ url, method, payload, keepUrlFragment, useExtendedUniqueKey });
```

Code sample

```ts
const links: RequestOptions[] = [];
pageHrefList.forEach((href) => {
    const link = new URL(href, loadedUrl);
    if (link.hostname === hostname) {
        if (!findCrawledURL(encodeEntryURL, link.href)) {
            links.push({
                url: link.href,
                userData: {
                    encodeEntryURL,
                },
                uniqueKey: link.href,
            });
        }
    }
});
await crawler.addRequests(links);
```

I worked around the problem in my project by passing an explicit uniqueKey, and felt it was worth raising an issue.

Package version

3.3.3

Node.js version

v19.8.1

Operating system

macOS

Apify platform
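To illustrate what I suspect was happening: if the default key computation strips the URL fragment (hash), then URLs that differ only by fragment all map to the same key and get deduplicated down to a single request. This is a hypothetical sketch of that behavior; `computeUniqueKeySketch` is an illustrative stand-in, not Crawlee's actual `_computeUniqueKey`.

```typescript
// Hypothetical sketch: a key derivation that drops the URL fragment,
// as the default uniqueKey computation appears to do.
function computeUniqueKeySketch(rawUrl: string): string {
    const u = new URL(rawUrl);
    u.hash = ''; // fragment is discarded by default
    return u.href.toLowerCase();
}

// A batch of URLs that differ only in their fragment.
const batch = [
    'https://example.com/page#section-1',
    'https://example.com/page#section-2',
    'https://example.com/page#section-3',
];

// All three collapse to one key, so the queue keeps only one request.
const keys = new Set(batch.map(computeUniqueKeySketch));
console.log(keys.size); // → 1
```

Passing `uniqueKey: link.href` explicitly, as in my code sample above, sidesteps this because the full URL (fragment included) becomes the key.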
I have tested this on the
Replies: 1 comment
This is on purpose. You can use Request.keepUrlFragment (https://crawlee.dev/api/core/interface/RequestOptions#keepUrlFragment) to make the uniqueKey respect URL fragments.