Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cupy.sign(NaN) == 0, unlike numpy #8327

Open
prutschman-iv opened this issue May 16, 2024 · 9 comments · May be fixed by #8330
Open

cupy.sign(NaN) == 0, unlike numpy #8327

prutschman-iv opened this issue May 16, 2024 · 9 comments · May be fixed by #8330

Comments

@prutschman-iv
Copy link

prutschman-iv commented May 16, 2024

Description

numpy.sign(numpy.nan) returns NaN, but cupy.sign(cupy.nan) returns 0.

I expected either the behavior to be the same, or the difference to be documented. (Edit: I see now that the different behavior is documented, but not explicitly contrasted with numpy in the documentation. Apologies if this is better suited as a documentation enhancement, although I would prefer the matching behavior.)

To Reproduce

import numpy as np
import cupy as cp
# Second assertion fails
assert np.isnan(np.sign(np.nan))
assert cp.isnan(cp.sign(cp.nan))

Installation

Wheel (pip install cupy-***)

Environment

OS                           : Windows-10-10.0.19045-SP0
Python Version               : 3.11.6
CuPy Version                 : 13.0.0
CuPy Platform                : NVIDIA CUDA
NumPy Version                : 1.24.0
SciPy Version                : 1.11.2
Cython Build Version         : 0.29.36
Cython Runtime Version       : 3.0.2
CUDA Root                    : C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2
nvcc PATH                    : C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\bin\nvcc.EXE
CUDA Build Version           : 12020
CUDA Driver Version          : 12020
CUDA Runtime Version         : 12020 (linked to CuPy) / 12020 (locally installed)
cuBLAS Version               : (available)
cuFFT Version                : 11008
cuRAND Version               : 10303
cuSOLVER Version             : (11, 5, 0)
cuSPARSE Version             : (available)
NVRTC Version                : (12, 2)
Thrust Version               : 200200
CUB Build Version            : 200200
Jitify Build Version         : b0269c8
cuDNN Build Version          : (not loaded; try `import cupy.cuda.cudnn` first)
cuDNN Version                : (not loaded; try `import cupy.cuda.cudnn` first)
NCCL Build Version           : None
NCCL Runtime Version         : None
cuTENSOR Version             : None
cuSPARSELt Build Version     : None
Device 0 Name                : NVIDIA GeForce RTX 4090
Device 0 Compute Capability  : 89
Device 0 PCI Bus ID          : 0000:C1:00.0

Additional Information

No response

@prutschman-iv prutschman-iv added the cat:bug Bugs label May 16, 2024
@prutschman-iv
Copy link
Author

It looks like the kernel is defined as
((x > 0) - (x < 0))

numpy uses the expression in1 > 0 ? 1 : (in1 < 0 ? -1 : (in1 == 0 ? 0 : in1)) code here

Some brief testing in godbolt shows that with nvcc 11.7, the numpy expression compiles into the same number or fewer of PTX instructions, without any branches. (Even fewer with floating point if the (in1 == 0 ? 0 : in1) is replaced with the equivalent in1.)

@kmaehashi
Copy link
Member

@prutschman-iv Thanks for reporting this one and also checking the performance impact! Indeed it's better to fix this one. Would you be interested in submitting a pull request to address this bug?

@prutschman-iv
Copy link
Author

I'll attempt to prepare a pull request with appropriate tests. For the tests, it looks like one approach would be adding something like this to test_misc.py. Does that seem right?

def test_sign_inf_nan(self):
    self.check_unary_inf_nan("sign")

(Also, I just realized that (in1 == 0 ? 0 : in1) isn't equivalent to in1 in the case of -0. So, the longer form of the expression is necessary to match numpy.)

@prutschman-iv
Copy link
Author

I must be misunderstanding some aspect of the test suite. I tried adding the test_sign_inf_nan and running the test suite with python setup.py test without fixing the bug, but I didn't get a test failure. I defined the test as assert False and didn't get a failure. So, I think my new test isn't getting run.

(I'm currently using the v12 branch because I couldn't get the main branch to build due to a problem with dlpack)

@kmaehashi
Copy link
Member

Thanks @prutschman-iv! The test looks legitimate to me. To run the test please use python -m pytest tests/cupy_tests/math_tests/test_misc.py. As for the dlpack issue, since the git submodule path has been changed between these branches, I'd suggest starting over from the fresh clone.

@prutschman-iv
Copy link
Author

I was successful in running the test suite for v12, thank you. From there I also observed a divergence between cupy and numpy for complex numbers, so I've fixed that as well. From the discussion linking to this one it sounds like there is maybe some controversy about the "correct" behavior? But my fix causes the behavior to match numpy's behavior as of 1.26.

The efficiency story isn't as rosy for the complex numbers case; it appears that the existing code was complicated enough to cause branches rather than predicated execution already and the extra test for nan does not improve the situation.

Unfortunately, I'm still having trouble building on main, even on a completely fresh clone into an new directory. For some reason the file cupy/_core/include\cupy/_dlpack/dlpack.h contains only the literal string ../../../../../third_party/dlpack/include/dlpack/dlpack.h I am hesitant to open the PR until I've verified correct operation on the main branch.

@leofang
Copy link
Member

leofang commented May 20, 2024

@prutschman-iv feel free to file a draft PR so that one of us can look at it and help you verify the issues :)

@prutschman-iv prutschman-iv linked a pull request May 20, 2024 that will close this issue
@kmaehashi
Copy link
Member

Unfortunately, I'm still having trouble building on main, even on a completely fresh clone into an new directory. For some reason the file cupy/_core/include\cupy/_dlpack/dlpack.h contains only the literal string ../../../../../third_party/dlpack/include/dlpack/dlpack.h I am hesitant to open the PR until I've verified correct operation on the main branch.

On Windows, symlink support needs to be enabled manually:
#7989 (comment)

@prutschman-iv
Copy link
Author

I'm not trying to pester, but I wanted to point out that I've removed the draft status from my PR, just in case that notification isn't automatic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants