Is there a way to override the source when using the new document store feature (and/or document loaders in general)?
Take, for example, the JSON lines loader.
It would be great if there were a way to use a field from the JSON data to set the source.
I tried this…
But it just comes through as a hardcoded string…
If this isn't possible, I wonder what the best alternative is.
In this specific use case I'm basically trying to get a load of HTML pages, scraped from a site which requires authentication, uploaded as documents with the source set to their URL.
I figured I could save the HTML to a JSON file and upload it that way, but would need to set the source.
I believe I can't use Cheerio etc. because I need to log in to the website before scraping it (it's my own site).
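To make the request concrete, here is a minimal sketch of the behaviour I'm after, written as plain TypeScript rather than against any loader's real API. The `LoadedDocument` shape, the `documentsFromJsonLines` helper, and the `html` / `url` field names are all assumptions made up for illustration; the point is only that the source comes from a field in each JSON line instead of a hardcoded string.

```typescript
// Hypothetical document shape: page text plus a metadata object carrying the source.
interface LoadedDocument {
  pageContent: string;
  metadata: { source: string };
}

// Parse JSON Lines text and read both the content and the source from named fields.
// `contentField` and `sourceField` are illustrative parameters, not real loader options.
function documentsFromJsonLines(
  jsonLines: string,
  contentField: string,
  sourceField: string
): LoadedDocument[] {
  const documents: LoadedDocument[] = [];
  for (const line of jsonLines.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // skip blank lines
    const record = JSON.parse(trimmed) as Record<string, unknown>;
    const content = record[contentField];
    const source = record[sourceField];
    if (typeof content !== "string" || typeof source !== "string") {
      throw new Error(`Missing "${contentField}" or "${sourceField}" in: ${trimmed}`);
    }
    // The source is taken per record, so each document keeps its own URL.
    documents.push({ pageContent: content, metadata: { source } });
  }
  return documents;
}

// Example input: one scraped page per line, with its URL stored alongside the HTML.
const sample = [
  '{"url": "https://example.com/a", "html": "<p>Page A</p>"}',
  '{"url": "https://example.com/b", "html": "<p>Page B</p>"}',
].join("\n");

const docs = documentsFromJsonLines(sample, "html", "url");
```

With something like this, each uploaded document would carry the URL it was scraped from as its source, which is what I'd like the loader to do directly.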