Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Install files in parallel #1256

Open
wants to merge 37 commits into
base: main
Choose a base branch
from

Conversation

Thomas1664
Copy link
Contributor

@Thomas1664 Thomas1664 commented Oct 29, 2023

Depending on how many files to install, this change results in a >= 4x speed up. I only tested relatively small ports but I expect even higher speed ups for bigger ports.

  1. Get file status in parallel
  2. Write listfile (async)
  3. Create all directories
  4. Copy files in parallel

Comment on lines 205 to 209
#ifdef _WIN32
std::sort(std::execution::par_unseq, output.begin(), output.end());
#else
std::sort(output.begin(), output.end());
#endif
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm open for suggestions on how to implement parallel sorting for all platforms

Comment on lines 120 to 161
fs.remove_all(target, IgnoreErrors{});
fs.remove(target, IgnoreErrors{});
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no difference for regular files between remove_all and remove (provided that vcpkg's implementation is identical to the STL): https://en.cppreference.com/w/cpp/filesystem/remove

@Thomas1664 Thomas1664 marked this pull request as draft October 30, 2023 11:05
@Thomas1664
Copy link
Contributor Author

Thomas1664 commented Oct 30, 2023

@BillyONeal Unit tests on OSX failed here. It seems to work again but I can consistently reproduce the failure on my local machine: The test case captures-output fails with error: calling CreateProcessW failed with 2 (unknown error). I'm not sure if this is the same test that failed in CI. I can confirm that this is most likely not caused by changes in this PR because the changed code inside commands.install.cpp is never hit.

@Thomas1664 Thomas1664 marked this pull request as ready for review October 30, 2023 15:22
@autoantwort
Copy link
Contributor

Btw you can build the format target locally to format all code.

@Thomas1664 Thomas1664 mentioned this pull request Oct 30, 2023
@BillyONeal BillyONeal reopened this Oct 30, 2023
@BillyONeal
Copy link
Member

Sorry!

Copy link
Member

@BillyONeal BillyONeal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in general happy with attempts to parallelize things but overall I think you need to be much more explicit in proving that everything you call in a parallel context is actually safe to do. Given that the first run of e2e tests experienced a failure and I found races in a cursory review makes me fear that unknown races remain without such a proof.

This means if you want to proceed here I think you need to show strong thread safety requirements of every API you're calling rather than taking large blocks of existing code and wrapping them in parallel constructs.

}
}

fs.create_directories(listfile.parent_path(), VCPKG_LINE_INFO);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This races with other file operations you're doing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? The list file is not in the same folder where the installed files are. And the "destination directory" is created beforehand. There is no race condition for creating /installed because it already exists (line 163). The listfiles go in /installed/vcpkg/info and the installed files will go in installed/triplet.

src/vcpkg/commands.install.cpp Outdated Show resolved Hide resolved
src/vcpkg/commands.install.cpp Outdated Show resolved Hide resolved

parallel_transform(files, output.begin(), [&](auto&& file) -> PathAndType {
std::error_code ec;
auto status = fs.symlink_status(file, ec);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I strongly suspect the only thing you're gaining from parallelism here is parallelizing this symlink_status and this would be much clearer if you only parallelized that rather than trying to make invalid states that need to be chopped off etc.

If you parallelize just this part you could use plain Util::erase_remove_if for the invalid ones. It would also have the advantage of not trying to parallelize println_warning.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I want to get rid of invalid states I'd have to use parallel_for_each but this would require a lock on when inserting things into the destination vector. The only thing I can move out of this is the validation of the file type.

src/vcpkg/commands.install.cpp Show resolved Hide resolved
src/vcpkg/commands.install.cpp Outdated Show resolved Hide resolved
if (ec)
{
msg::println_error(msgInstallFailed, msg::path = target, msg::error_msg = ec.message());
Debug::println("Install from packages to installed: Fallback to copy "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debug::println is not thread safe.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is other parallelized code where this also happens: in cmd_execute_and_capture_output_parallel we call cmd_execute_and_stream_data which calls Debug::print().

src/vcpkg/commands.install.cpp Outdated Show resolved Hide resolved
@Thomas1664
Copy link
Contributor Author

Thomas1664 commented Nov 28, 2023

I'm in general happy with attempts to parallelize things but overall I think you need to be much more explicit in proving that everything you call in a parallel context is actually safe to do. Given that the first run of e2e tests experienced a failure and I found races in a cursory review makes me fear that unknown races remain without such a proof.

I think the main issue with parallelizing here is printing. Basically, whenever the program can print anything, needs to be guarded by a mutex which is IMO not possible from the outside. Therefore, we need to make printing inherently thread safe.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants