New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add support for custom exporter class #6273
base: master
Are you sure you want to change the base?
add support for custom exporter class #6273
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #6273 +/- ##
==========================================
- Coverage 85.00% 84.96% -0.05%
==========================================
Files 161 162 +1
Lines 11962 12115 +153
Branches 1872 1840 -32
==========================================
+ Hits 10168 10293 +125
- Misses 1512 1534 +22
- Partials 282 288 +6
|
Hello, I like the idea of using the Feed Exporters rather than Feed Storages to do the actual exporting. You don't actually need to write any additional code to support this right now, this should be similiar to your solution: FEED = {
"stdout://": {
"format": "StreamingAPI"
}
}
FEED_EXPORTERS = {
"StreamingAPI": "StreamingAPIExporter"
} The methods However, there are several downsides that we would need to figure out:
|
Wouldn't this be something that the developer needs to take care of on their customized exporter code? If we're allowing the customization of
Couldn't all this be managed by the custom exporter class itself? If we assume that there's only a limited amount of exceptions that are covered by the engine, wouldn't this be something managed by the code itself? I think those questions can be answered better if we have an example of a customized exporter that matches those requirements. @VMRuiz what about the Airtable exporter we've previously discussed? |
70e95cc
to
e208f82
Compare
Reopening this since it closed when I reset the branch, because the changes were not necessary. I tried @VMRuiz proposal for custom exporters and runs just fine.
The failing tests confuse me, because from the error they seem to be related to |
Turns out without changing any Scrapy code I get the same behavior by calling |
…om the custom exporter side
@Gallaecio @wRAR , Do you think using deferToThread is appropriate here, or could there be performance issues if there are too many calls? |
I suggest using |
scrapy/extensions/feedexport.py
Outdated
@@ -481,11 +482,11 @@ def get_file(slot_): | |||
|
|||
if slot.itemcount: | |||
# Normal case | |||
slot.finish_exporting() | |||
finish_exporting_maybe_deferred = defer.maybeDeferred(slot.finish_exporting) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd remove maybe from the name, it's an actual deferred.
Note that this has conflicts with master. |
Opening this PR to discuss the implementation of adding support for custom exporter classes. This comes from the idea of adding stream exporters that don't rely on local storage. The usage I had in mind would be something like:
The batching would work the same, since a new instance would be created for every new slot and the custom exporter can handle the queuing, etc.
After this, reusable exporters for widely used services can be implemented.
Missing tests and docs.