Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

is_item can be costly #59

Open
Gallaecio opened this issue Mar 12, 2022 · 4 comments
Open

is_item can be costly #59

Gallaecio opened this issue Mar 12, 2022 · 4 comments
Labels
enhancement New feature or request

Comments

@Gallaecio
Copy link
Member

See scrapy/itemloaders#50

While that issue can be addressed in a different way, I am thinking that we should make some work to minimize the CPU cost of this function.

I am actually starting to consider making different item types opt-in; i.e. instead of having all supported item types enabled by default, requiring users to enable them for them to work.

I am just thinking out loud, though. I am just worried about the CPU-consumption of this function going forward, seeing as it is already able to cause a 50% additional CPU usage for a process when overused.

@Gallaecio Gallaecio added the enhancement New feature or request label Mar 12, 2022
@elacuesta
Copy link
Member

elacuesta commented Mar 12, 2022

I remember a discussion about imports inside functions being expensive. Let's see if #60 helps in that regard.
Additionally, it might be slightly better to use itemadapter.ItemAdapter.is_item instead of itemadapter.is_item in itemloaders, since the latter actually imports the former.

@elacuesta
Copy link
Member

I am actually starting to consider making different item types opt-in; i.e. instead of having all supported item types enabled by default, requiring users to enable them for them to work.

I just realized this is actually possible at the moment, by modifying the ItemAdapter.ADAPTER_CLASSES class attribute, e.g. removing the ones you are sure you don't need and/or adjusting the order to break early if you know the majority of your items are of a certain kind.

@Gallaecio
Copy link
Member Author

I am actually starting to consider making different item types opt-in; i.e. instead of having all supported item types enabled by default, requiring users to enable them for them to work.

I just realized this is actually possible at the moment, by modifying the ItemAdapter.ADAPTER_CLASSES class attribute, e.g. removing the ones you are sure you don't need and/or adjusting the order to break early if you know the majority of your items are of a certain kind.

Indeed, they are currently opt-out. Making them opt-in would mean making ItemAdapter.ADAPTER_CLASSES empty or at least less crowded by default. But it could be argued that the user experience would be worse, not to mention that making such a change in a backward-compatible way could be a pain.

@elacuesta
Copy link
Member

Released v0.5.0 to get advantage of the performance improvements. Let's keep this issue open to investigate about further improvements.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants