Calculate candidate string versions only once in get_applicable_candidates #12664

Open: notatallshaw wants to merge 2 commits into main

notatallshaw
Contributor

@notatallshaw notatallshaw commented Apr 30, 2024

This is a minor performance improvement; I measure it at about 1% fairly consistently across the different resolutions I've tried.

I was looking at the "After call graph" in #12663 and noticed that get_applicable_candidates was taking 16% of run time, and even though it was called only 921 times, it calls other methods hundreds of thousands of times.

The only obvious thing I spotted, though, is that it is effectively calculating [c.version for c in candidates] twice, which can be seen in this part of the call graph:

[Image: highlighted call graph]
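
The duplicated work described above can be sketched in isolation. This is only an illustration of the pattern, not pip's actual code: CountingVersion, Candidate, keep_allowed, applicable_before and applicable_after are all hypothetical names, and keep_allowed is a simple stand-in for a real specifier filter.

```python
from dataclasses import dataclass


class CountingVersion:
    """Hypothetical stand-in for a parsed version; counts str() conversions."""

    str_calls = 0

    def __init__(self, text: str) -> None:
        self.text = text

    def __str__(self) -> str:
        CountingVersion.str_calls += 1
        return self.text


@dataclass
class Candidate:
    name: str
    version: CountingVersion


def keep_allowed(versions, allowed):
    """Hypothetical stand-in for a specifier filter: yields matching items unchanged."""
    return (v for v in versions if v in allowed)


def applicable_before(candidates, allowed):
    # Each candidate's version is stringified once here...
    versions = set(keep_allowed((str(c.version) for c in candidates), allowed))
    # ...and a second time here.
    return [c for c in candidates if str(c.version) in versions]


def applicable_after(candidates, allowed):
    # Stringify each version exactly once and carry the result alongside the candidate.
    candidates_and_versions = [(c, str(c.version)) for c in candidates]
    versions = set(keep_allowed((v for _, v in candidates_and_versions), allowed))
    return [c for c, v in candidates_and_versions if v in versions]
```

Both functions return the same candidates; the second one simply performs half as many str() conversions, at the cost of holding one extra list of pairs.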

I suspect, though, that this function has further significant optimization potential, so I will leave this as a draft for now, think on it, and take any suggestions before marking it ready for review.

@ichard26 ichard26 added the type: performance Commands take too long to run label Apr 30, 2024
@notatallshaw notatallshaw force-pushed the Calculate-candidate-versions-once-in-`get_applicable_candidates` branch from eb7c1eb to 3ddec7f Compare May 3, 2024 22:57
@notatallshaw
Contributor Author

Found one further minor improvement: specifier.filter returns an Iterable of the same type as the Iterable you give it, so there's no need to stringify the output; it's already a string.
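
A small check of that behaviour, as a sketch: it assumes the packaging library is importable (falling back to the copy vendored inside pip), and the specifier and version strings are made up for illustration.

```python
try:
    from packaging.specifiers import SpecifierSet
except ImportError:
    # Assumption: pip is installed, so its vendored copy of packaging is available.
    from pip._vendor.packaging.specifiers import SpecifierSet

# Illustrative inputs only; no prereleases are involved here.
specifier = SpecifierSet(">=1.0,<2.0")
version_strings = ["0.9", "1.0", "1.5", "2.0"]

# filter() yields the matching items as they were passed in, so strings stay strings.
matched = list(specifier.filter(version_strings))

# Wrapping each result in str() again would therefore be redundant.
all_strings = all(isinstance(v, str) for v in matched)
```

Here matched holds the strings "1.0" and "1.5", and all_strings is True.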

Other than that, this logic is quite sensitive because of the complicated way prereleases work in the version spec, so I wasn't able to find any other big improvements. The one exception is when allow_prereleases = True, but it didn't seem worth adding a special path for that use case.

Marking ready for review.

@notatallshaw notatallshaw marked this pull request as ready for review May 3, 2024 23:09
@notatallshaw
Contributor Author

notatallshaw commented May 4, 2024

Updated now that Python 3.13 is passing CI.

src/pip/_internal/index/package_finder.py (outdated, resolved)
news/12664.feature.rst (outdated, resolved)
@notatallshaw notatallshaw force-pushed the Calculate-candidate-versions-once-in-`get_applicable_candidates` branch from e0bcee2 to d845ad9 Compare June 2, 2024 20:58
# types. This way we'll use a str as a common data interchange
# format. If we stop using the pkg_resources provided specifier
# and start using our own, we can drop the cast to str().
candidates_and_versions = [(c, str(c.version)) for c in candidates]
Member

Do we know how much memory overhead this may induce in a large install? I agree this block can likely be further optimised since it is basically filtering on one list.
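
One rough way to put a number on that question is to measure the allocation directly with tracemalloc. The sketch below is only an approximation under stated assumptions: the Candidate class and measure_pairs_overhead are hypothetical, and a plain tuple stands in for a parsed version, so real figures inside pip would differ.

```python
import tracemalloc


class Candidate:
    """Hypothetical candidate; a tuple stands in for a parsed version object."""

    def __init__(self, version_tuple):
        self.version = version_tuple


def measure_pairs_overhead(n: int) -> int:
    """Return the bytes allocated by building n (candidate, version string) pairs."""
    candidates = [Candidate((1, i, 0)) for i in range(n)]

    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    # The extra list under discussion: one tuple and one string per candidate.
    pairs = [(c, ".".join(map(str, c.version))) for c in candidates]
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    # pairs is still alive at this point, so the difference reflects its footprint.
    return after - before
```

Calling measure_pairs_overhead with a candidate count typical of a large install gives an order-of-magnitude estimate in bytes.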
