examples/matmul: fix host matrix printing & verification code #1480
…ise fails with the fixed verification
This also contains the code for the stochastic verification, but it isn't used anywhere. I copied the entire file over from my branch, which is why it's in there too.
One last note: I reduced the relative tolerance for float comparisons from 0.5 to 0.1. I think 0.5 was far too high; completely wrong results still passed. I suspect that somewhere along the line the vectorized matvec kernel started to fail silently, because verification with this huge tolerance let it slip through. So I have temporarily swapped in the scalar kernel.
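To illustrate why a 0.5 relative tolerance lets badly wrong values through, here is a minimal sketch in Python. It is not the verification code from this PR; `nearly_equal` and its `abs_tol` parameter are hypothetical names chosen for the example.

```python
def nearly_equal(actual, expected, rel_tol=0.1, abs_tol=1e-6):
    """Hypothetical helper: True if actual is within rel_tol of expected.

    The allowed difference scales with the larger magnitude of the two
    values; abs_tol keeps comparisons near zero from being overly strict.
    """
    diff = abs(actual - expected)
    scale = max(abs(actual), abs(expected))
    return diff <= max(abs_tol, rel_tol * scale)


expected = 100.0
wrong = 60.0  # 40% away from the reference value

print(nearly_equal(wrong, expected, rel_tol=0.5))  # True: the wrong value passes
print(nearly_equal(wrong, expected, rel_tol=0.1))  # False: the wrong value is caught
```

With the looser setting, anything within 50% of the reference counts as a match, so a kernel can be substantially broken and still verify.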
I had numerous segfaults last time I tried to run the sweep script. Was this the cause?
@fifield I would say this is the likely culprit, yes, since a sweep covers many non-square matrix sizes. Other mistakes likely remain; I haven't run the sweep in a while and have made some silly mistakes lately... I also just figured out why the vectorized matrix-vector kernel didn't verify. I'll push the fix for that into this branch as well in a minute. Edit: I pushed the fix. I still had to raise the relative tolerance to 15% for it to pass, which still seems high. But maybe that's just the precision you get for bf16?
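For a rough sense of scale on the bf16 question: bfloat16 keeps 8 bits of significand precision, so rounding a single normal value to bf16 introduces a relative error of at most 2^-8 (about 0.4%). The sketch below emulates that rounding in Python. It is an illustration only, not code from this PR; `to_bf16` is a hypothetical helper, it assumes round-to-nearest-even, and it does not handle NaN or infinity.

```python
import struct


def to_bf16(x):
    """Hypothetical helper: round x to the nearest bfloat16 value.

    bfloat16 is the upper 16 bits of an IEEE-754 float32, so we round
    the float32 bit pattern to nearest-even and clear the low 16 bits.
    NaN and infinity are not handled.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)  # ties go to the even mantissa
    bits = (bits + rounding_bias) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


for value in (3.14159, 0.1, 1234.5):
    rounded = to_bf16(value)
    rel_err = abs(rounded - value) / abs(value)
    # Each relative error stays at or below 2**-8 for these inputs.
    print(value, rounded, rel_err)
```

A single rounding is therefore far below 15%. Errors can still grow over a long accumulation or when terms cancel, so this alone does not settle whether 15% is expected here, but it suggests the figure is worth a closer look.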
Co-authored-by: Joseph Melber <jgmelber@gmail.com>
Joe, I just noticed that the verification issues you caught a couple of weeks ago are still there. I think I pointed to the fix over e-mail, but it looks like it never made it in.
I think it would be good to merge this quickly: currently, printing non-square matrices can segfault because of the errors this PR fixes.
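The thread does not spell out the exact indexing error, so the following is only a sketch of one common way that printing a non-square matrix goes wrong: using the row count as the stride of a row-major buffer instead of the column count. The names `format_matrix` and `max_index` are hypothetical and not taken from the example's host code.

```python
def format_matrix(flat, rows, cols):
    """Format a row-major flat buffer as a rows x cols matrix.

    Element (r, c) lives at index r * cols + c: the stride between
    rows is the number of columns, not the number of rows.
    """
    if len(flat) != rows * cols:
        raise ValueError("buffer size does not match rows * cols")
    lines = []
    for r in range(rows):
        lines.append(" ".join(str(flat[r * cols + c]) for c in range(cols)))
    return "\n".join(lines)


def max_index(rows, cols, stride):
    """Largest flat index touched when walking the matrix with a given stride."""
    return (rows - 1) * stride + (cols - 1)


print(format_matrix(list(range(6)), 2, 3))

# For a 3 x 2 matrix the buffer holds 6 elements (valid indices 0..5).
print(max_index(3, 2, stride=2))  # correct stride (cols): stays in bounds
print(max_index(3, 2, stride=3))  # wrong stride (rows): reads past the end
```

For square matrices the two strides coincide, which would explain how a mistake of this kind could go unnoticed until a sweep exercises non-square sizes. In C++ the out-of-bounds read is undefined behavior and may show up as a segfault.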