OP Benchmark: unable to use Nsight Systems when testing operator performance #1740

Open
bapijun opened this issue Sep 25, 2023 · 2 comments

bapijun commented Sep 25, 2023

[PaddlePaddle OP Benchmark](https://github.com/PaddlePaddle/benchmark/tree/master/api)
When testing op performance and accuracy, Nsight Systems fails to start properly. My GPU is a 3060, and the environment is the Docker image latest-dev-cuda12.0-cudnn8.9-trt8.6-gcc12.2.

Author

bapijun commented Sep 25, 2023

The error reported is:
Running Error:
======== Warning: nvprof is not supported on devices with compute capability 8.0 and higher.
Use NVIDIA Nsight Systems for GPU tracing and CPU sampling and NVIDIA Nsight Compute for GPU profiling.
Refer https://developer.nvidia.com/tools-overview for more details.

I suspect the GPU check is classifying this card as an older-generation GPU, so the benchmark falls back to nvprof.

Collaborator

Xreki commented Sep 25, 2023

    def is_ampere_gpu():
        stdout, exit_code = system.run_command("nvidia-smi -L")
        if exit_code == 0:
            gpu_list = stdout.split("\n")
            if len(gpu_list) >= 1:
                #print(gpu_list[0])
                # GPU 0: NVIDIA A100-SXM4-40GB (UUID: xxxx)
                return gpu_list[0].find("A100") > 0
        return False

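As the code above shows, op benchmark already supports using nsys to obtain GPU time, but the current check for whether the GPU is Ampere architecture is rather naive: it only matches "A100" in the name of the first GPU. With a simple modification it can be used on the 3060.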
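
One possible modification, as a minimal sketch rather than the repository's actual fix: instead of matching the GPU name, check the compute capability, since the nvprof warning itself is keyed on compute capability 8.0 and higher. The names `parse_compute_caps`, `needs_nsys`, and `is_ampere_or_newer_gpu` are hypothetical and not part of op benchmark. The sketch also assumes the installed driver's `nvidia-smi` supports the `compute_cap` query field; older drivers may not, in which case the function returns `False`.

```python
import subprocess


def parse_compute_caps(stdout):
    """Parse lines such as "8.6" into (major, minor) tuples, skipping malformed lines."""
    caps = []
    for line in stdout.splitlines():
        parts = line.strip().split(".")
        if len(parts) == 2 and parts[0].isdigit() and parts[1].isdigit():
            caps.append((int(parts[0]), int(parts[1])))
    return caps


def needs_nsys(stdout):
    """Return True if the first listed GPU has compute capability >= 8.0."""
    caps = parse_compute_caps(stdout)
    return bool(caps) and caps[0] >= (8, 0)


def is_ampere_or_newer_gpu():
    # Assumption: this nvidia-smi build supports the compute_cap query field.
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        # nvidia-smi is missing or could not be executed.
        return False
    if result.returncode != 0:
        return False
    return needs_nsys(result.stdout)
```

Like the original `is_ampere_gpu`, this only inspects the first GPU listed. A 3060 reports compute capability 8.6, so `needs_nsys("8.6\n")` evaluates to `True`, while a 7.x card evaluates to `False`.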
