
Inference with CUDA #969

Open
hieuhv94 opened this issue Apr 28, 2021 · 4 comments
@hieuhv94

hieuhv94 commented Apr 28, 2021

I ran the streaming inference code in a CUDA environment (flashlight and wav2letter built with CUDA), but the results were the same between CPU and CUDA.
So my question is: how do I run inference with CUDA?
Thanks!

@tlikhomanenko
Contributor

cc @vineelpratap

tlikhomanenko self-assigned this Apr 28, 2021
@hieuhv94
Author

hieuhv94 commented May 6, 2021

Any suggestions on this?

@vineelpratap
Contributor

Hey @hieuhv94, sorry, I didn't get the question.

For running inference with CUDA, you can use the fl_asr_decode binary, which does beam-search decoding with an LM.

If you meant the streaming inference from w2l@anywhere, we currently support only the CPU version.
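For reference, a decode invocation typically looks something like the sketch below. The paths and flag values here are placeholders, and the exact flag names can differ between flashlight/wav2letter versions, so please check the flashlight ASR app documentation for the options your build supports.

```sh
# Illustrative sketch only: paths/values are placeholders and flag names
# may differ across flashlight/wav2letter versions.
fl_asr_decode \
    --am=/path/to/acoustic_model.bin \
    --tokens=tokens.txt \
    --lexicon=lexicon.txt \
    --lm=/path/to/lm.bin \
    --datadir=/path/to/data \
    --test=test.lst \
    --uselexicon=true \
    --lmweight=2.0 \
    --wordscore=1.0 \
    --beamsize=500 \
    --show
```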

@hieuhv94
Author


Thanks for your comment, @vineelpratap.
Do you have any ideas for streaming inference with CUDA?
And would the performance of streaming inference with CUDA actually be better than on CPU? I ask because we have to copy many chunks to VRAM during streaming.
