Tests fail when run with pytest --forked #1132

Open
segyges opened this issue Jan 25, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@segyges
Contributor

segyges commented Jan 25, 2024

Describe the bug
When tests are run with pytest --forked per the instructions in /test/README.md, a large number of tests fail with the error:

RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method

This appears to be a problem with the way tests are run in subprocesses. It makes testing, and therefore development on the library, rather difficult.
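For context, the 'spawn' start method named in the error message can be sketched with the standard library alone. This is a minimal illustration, not code from the neox test suite: `square` and `run_in_spawned_process` are hypothetical names, and `square` stands in for a test body that would use CUDA.

```python
import multiprocessing as mp


def square(x: int) -> int:
    # Hypothetical stand-in for a test body; a real one would touch CUDA here.
    return x * x


def run_in_spawned_process(fn, arg):
    # "spawn" starts a fresh interpreter, so the child does not inherit
    # the parent's already-initialized CUDA state the way a forked child does.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=1) as pool:
        # fn and arg are pickled and sent to the child, so fn must be
        # importable by name (a module-level function, not a lambda).
        return pool.apply(fn, (arg,))
```

Call `run_in_spawned_process(square, 3)` from under an `if __name__ == "__main__":` guard, since spawn re-imports the main module in the child. Whether this approach fits with pytest --forked, which relies on fork, is a separate question that this sketch does not answer.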

To Reproduce
Steps to reproduce the behavior:

  1. Install neox in your environment however you normally do
  2. Optionally start a training run to confirm that your environment works
  3. Exit out of that run
  4. cd /tests
  5. pytest --forked

Expected behavior
Tests pass, or fail for reasons to do with the code in the tests themselves.

Proposed solution
I have no idea.

Environment (please complete the following information):

  • GPUs: 2x 3090s
  • Configs: N/A

Additional context
forked-report.zip
Attached is an HTML report of the test failures.

@segyges segyges added the bug Something isn't working label Jan 25, 2024
@Quentin-Anthony
Member

Currently sidestepping this with #1149 until we have time to properly resolve the problem of launching CUDA in forked processes.

Some tests are re-enabled, all have been cleaned up a bit, and the model training tests are skipped for now.
