qwen-audio + vad: error when running

🐛 Bug
Running Qwen-Audio together with a VAD model in the FunASR pipeline raises an error.

To Reproduce
python qwen_demo.py
2024-05-14 11:09:35,110 - modelscope - INFO - PyTorch version 2.3.0 Found.
2024-05-14 11:09:35,110 - modelscope - INFO - Loading ast index from /root/.cache/modelscope/ast_indexer
2024-05-14 11:09:35,135 - modelscope - INFO - Loading done! Current index file version is 1.14.0, with md5 7f17021ca099dd6760d43c7a9e69c36a and a total number of 976 components indexed
Detect model requirements, begin to install it: /root/.cache/modelscope/hub/Qwen/Qwen-Audio/requirements.txt
install model requirements successfully
WARNING:transformers_modules.Qwen-Audio.modeling_qwen:The model is automatically converting to bf16 for faster inference. If you want to disable the automatic precision, please manually add bf16/fp16/fp32=True to "AutoModelForCausalLM.from_pretrained".
WARNING:transformers_modules.Qwen-Audio.modeling_qwen:Try importing flash-attention for faster inference...
WARNING:transformers_modules.Qwen-Audio.modeling_qwen:Warning: import flash_attn rotary fail, please install FlashAttention rotary to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/rotary
WARNING:transformers_modules.Qwen-Audio.modeling_qwen:Warning: import flash_attn rms_norm fail, please install FlashAttention layer_norm to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/layer_norm
WARNING:transformers_modules.Qwen-Audio.modeling_qwen:Warning: import flash_attn fail, please install FlashAttention to get higher efficiency https://github.com/Dao-AILab/flash-attention
Loading checkpoint shards: 100%|██████████| 9/9 [00:00<00:00, 13.09it/s]
audio_start_id: 155163, audio_end_id: 155164, audio_pad_id: 151851.
2024-05-14 11:09:42,213 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-05-14 11:09:42,213 - modelscope - INFO - Use user-specified model revision: master
ckpt: /root/.cache/modelscope/hub/iic/speech_fsmn_vad_zh-cn-16k-common-pytorch/model.pt
rtf_avg: 0.019: 100%|██████████| 1/1 [00:00<00:00, 9.60it/s]
Traceback (most recent call last):
  File "/root/.cache/huggingface/modules/transformers_modules/Qwen-Audio/audio.py", line 91, in load_audio
    out = run(cmd, capture_output=True, check=True).stdout
  File "/root/miniconda3/envs/funasr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ffmpeg', '-nostdin', '-threads', '0', '-i', 'tensor([-0.0001, -0.0002,  0.0007,  ...,  0.0000,  0.0000,  0.0000])', '-f', 's16le', '-ac', '1', '-acodec', 'pcm_s16le', '-ar', '16000', '-']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "qwen_demo.py", line 18, in <module>
    res = model.generate(input=audio_in, prompt=prompt, batch_size_s=0,)
  File "/root/miniconda3/envs/funasr/lib/python3.8/site-packages/funasr/auto/auto_model.py", line 248, in generate
    return self.inference_with_vad(input, input_len=input_len, **cfg)
  File "/root/miniconda3/envs/funasr/lib/python3.8/site-packages/funasr/auto/auto_model.py", line 394, in inference_with_vad
    results = self.inference(
  File "/root/miniconda3/envs/funasr/lib/python3.8/site-packages/funasr/auto/auto_model.py", line 285, in inference
    res = model.inference(**batch, **kwargs)
  File "/root/miniconda3/envs/funasr/lib/python3.8/site-packages/funasr/models/qwen_audio/model.py", line 66, in inference
    audio_info = self.tokenizer.process_audio(query)
  File "/root/.cache/huggingface/modules/transformers_modules/Qwen-Audio/tokenization_qwen.py", line 556, in process_audio
    audio = load_audio(audio_path)
  File "/root/.cache/huggingface/modules/transformers_modules/Qwen-Audio/audio.py", line 93, in load_audio
    raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from e
RuntimeError: Failed to load audio: ffmpeg version 4.2.7-0ubuntu0.1 Copyright (c) 2000-2022 the FFmpeg developers
  built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
  configuration: --prefix=/usr --extra-version=0ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
tensor([-0.0001, -0.0002,  0.0007,  ...,  0.0000,  0.0000,  0.0000]): No such file or directory
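Reading the two tracebacks together: inference_with_vad splits the input and hands each detected speech segment to the Qwen-Audio model as an in-memory torch.Tensor, but the model's tokenizer (process_audio → load_audio) expects a file path and interpolates the value straight into an ffmpeg command line, so ffmpeg tries to open a file literally named "tensor([...])". A minimal workaround sketch while this is open, assuming 16 kHz mono segments and the soundfile package (the helper name tensor_to_wav_path is our own illustration, not FunASR API): write the segment to a temporary WAV and pass that path downstream.

# Hypothetical workaround, not the official fix: persist the in-memory VAD
# segment so Qwen-Audio's ffmpeg-based load_audio() receives a real file path.
# Assumes a 1-D float waveform at 16 kHz and `pip install soundfile`.
import tempfile

import soundfile as sf
import torch


def tensor_to_wav_path(segment: torch.Tensor, sample_rate: int = 16000) -> str:
    # delete=False keeps the file around after close, so ffmpeg can
    # reopen it later by name.
    tmp = tempfile.NamedTemporaryFile(suffix=".wav", delete=False)
    sf.write(tmp.name, segment.cpu().numpy(), sample_rate)
    return tmp.name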
Code sample
qwen_demo.py
#!/usr/bin/env python3
# -*- encoding: utf-8 -*-
# Copyright FunASR (https://github.com/alibaba-damo-academy/FunASR). All Rights Reserved.
# MIT License (https://opensource.org/licenses/MIT)

# To install requirements: pip3 install -U "funasr[llm]"

from funasr import AutoModel

model = AutoModel(
    model="Qwen-Audio",
    vad_model="iic/speech_fsmn_vad_zh-cn-16k-common-pytorch",
    vad_kwargs={"max_single_segment_time": 30000},
)

audio_in = "asr_example_zh.wav"
prompt = "<|startoftranscription|><|zh|><|transcribe|><|zh|><|notimestamps|><|wo_itn|>"
res = model.generate(input=audio_in, prompt=prompt, batch_size_s=0,)
print(res)
Environment
- OS: Ubuntu 20.04 (per the ffmpeg build string)
- Python: 3.8 (conda env "funasr")
- PyTorch: 2.3.0
- modelscope: 1.14.0
- ffmpeg: 4.2.7-0ubuntu0.1
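As a sanity check (our suggestion, not part of the original report), dropping the VAD front end passes the WAV path unchanged to Qwen-Audio's tokenizer, which is the path-based input load_audio expects; if this variant runs, the failure is isolated to the VAD-to-tokenizer hand-off.

# Same demo without the VAD model; uses only calls that already appear
# in qwen_demo.py above.
from funasr import AutoModel

model = AutoModel(model="Qwen-Audio")
res = model.generate(
    input="asr_example_zh.wav",
    prompt="<|startoftranscription|><|zh|><|transcribe|><|zh|><|notimestamps|><|wo_itn|>",
)
print(res)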
Ongoing.