Environment:
OS: Linux
FunASR Version: 1.0.14
PyTorch version: 2.1.2+cu121
How you installed funasr: from funasr import AutoModel; it was installed automatically when this code was run in a ModelScope notebook
Python version: 3.10.13
GPU: NVIDIA A10
CUDA/cuDNN version: cuda_12.1.r12.1
Recording:
Uploading 2speakers_example.zip…
R1ckShi
Notice: To resolve issues more efficiently, please raise issues following the template and fill in the details.
🐛 Bug
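I ran offline speaker labeling (spk) on a single audio file, but the content of key and sentence_info does not match: sentence_info contains only about half of the content returned under key (the 'text' field).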
To Reproduce
Steps to reproduce the behavior (always include the command you ran):
from funasr import AutoModel

# paraformer-zh is a multi-functional asr model
# use vad, punc, spk or not as you need
model = AutoModel(
    model="paraformer-zh",
    vad_model="fsmn-vad",
    punc_model="ct-punc",
    spk_model="cam++",
)
res = model.generate(
    input="2speakers_example.wav",
    batch_size_s=1,
    hotword='魔搭',
)
print(res)
'key': 'rand_key_2yW4Acq9GFz6Y', 'text': '嗯,那么今天我们就简单的进行一下那个新生招聘的嗯讨论吧。因为现在不是马上就新生到校嘛,然后我们社团呢也需要招聘一些新的社员,然后就今天就大概就讨论一下嗯怎么招聘的内容吧。嗯,我们就首先想一下那个招聘的地点在哪里吧。嗯地点的话我们现在可以有三个选择。嗯,第一个的话我们可以选择在操场,因为那儿嗯学生流动量也挺大的。操场的话这这段时间太热了,我怕那个人流量有点少。嗯,那我们还可以有第二个选择呀。嗯,我们可以在图书馆楼下那里有一块可以遮阴的地方哦,图书馆我觉得应该还可以吧。嗯,就怕那些嗯新生我应该也会去吧。因为他如果刚刚到校,他应该就第一选择。如果是我的话,我也比较想去那个图书馆,还有什么地方呢?嗯,第三个的话,我们可以在演播厅底下,因为现在那里就已经有了很多社团在招新,然后我们过去的话也算。'
'sentence_info': [{'text': '嗯,', 'start': 5570, 'end': 5810, 'timestamp': [[5570, 5810]], 'spk': 0}, {'text': '那么今天我们就简单的进行一下那。', 'start': 5810, 'end': 8630, 'timestamp': [[5810, 5950], [5950, 6150], [6150, 6230], [6230, 6470], [6490, 6650], [6650, 6850], [6850, 7090], [7230, 7370], [7370, 7550], [7550, 7750], [7750, 7850], [7850, 8090], [8110, 8210], [8210, 8430], [8430, 8630]], 'spk': 0}, {'text': '个新生招聘的嗯讨,', 'start': 8630, 'end': 11630, 'timestamp': [[8630, 8870], [8910, 9150], [9170, 9410], [9510, 9750], [9770, 10010], [10010, 10250], [10690, 10930], [11390, 11630]], 'spk': 0}, {'text': '论吧因为现在不是马上,', 'start': 11630, 'end': 13890, 'timestamp': [[11630, 11870], [11870, 12070], [12070, 12210], [12210, 12370], [12370, 12510], [12510, 12690], [12690, 12810], [12810, 13050], [13550, 13770], [13770, 13890]], 'spk': 0}, {'text': '就新生到校嘛然后我们社团呢。', 'start': 13890, 'end': 16590, 'timestamp': [[13890, 14090], [14090, 14250], [14250, 14490], [14510, 14730], [14730, 14970], [15030, 15270], [15530, 15690], [15690, 15810], [15810, 15930], [15930, 16070], [16070, 16230], [16230, 16410], [16410, 16590]], 'spk': 0}, {'text': '也,', 'start': 16590, 'end': 16750, 'timestamp': [[16590, 16750]], 'spk': 0}, {'text': '需要招聘一些新的社员然。', 'start': 16750, 'end': 19330, 'timestamp': [[16750, 16890], [16890, 17090], [17090, 17230], [17230, 17450], [17450, 17590], [17590, 17730], [17730, 17950], [17950, 18150], [18150, 18370], [18370, 18610], [19130, 19330]], 'spk': 0}, {'text': '后就今天就大概就讨。', 'start': 19330, 'end': 21130, 'timestamp': [[19330, 19510], [19510, 19750], [19770, 19930], [19930, 20110], [20110, 20350], [20370, 20590], [20590, 20710], [20710, 20930], [20930, 21130]], 'spk': 0}, {'text': '论,', 'start': 21130, 'end': 21369, 'timestamp': [[21130, 21369]], 'spk': 0}, {'text': '一下嗯怎么招聘,', 'start': 21369, 'end': 23150, 'timestamp': [[21389, 21490], [21490, 21730], [22090, 22330], [22450, 22570], [22570, 22710], [22710, 22910], [22910, 23150]], 'spk': 0}, {'text': '的内容吧嗯我们就首。', 'start': 
23150, 'end': 25430, 'timestamp': [[23150, 23390], [23430, 23570], [23570, 23810], [23810, 24050], [24430, 24670], [24730, 24830], [24830, 24950], [24950, 25190], [25230, 25430]], 'spk': 0}, {'text': '先想一下那个,', 'start': 25430, 'end': 26570, 'timestamp': [[25430, 25670], [25750, 25930], [25930, 26030], [26030, 26170], [26170, 26330], [26330, 26570]], 'spk': 0}, {'text': '招聘的地点。', 'start': 26570, 'end': 27770, 'timestamp': [[26790, 27030], [27050, 27230], [27230, 27370], [27370, 27530], [27530, 27770]], 'spk': 0}, {'text': '在,', 'start': 27770, 'end': 27950, 'timestamp': [[27770, 27950]], 'spk': 0}, {'text': '哪里吧嗯地点的话。', 'start': 27950, 'end': 30400, 'timestamp': [[27950, 28130], [28130, 28210], [28210, 28695], [29540, 29760], [29760, 29920], [29920, 30120], [30120, 30220], [30220, 30400]], 'spk': 1}, {'text': '我,', 'start': 30400, 'end': 30480, 'timestamp': [[30400, 30480]], 'spk': 1}, {'text': '们现在可以有三个选择嗯第,', 'start': 30480, 'end': 33180, 'timestamp': [[30480, 30600], [30600, 30820], [30820, 31060], [31160, 31280], [31280, 31380], [31380, 31540], [31540, 31700], [31700, 31900], [31900, 32120], [32120, 32360], [32780, 33020], [33080, 33180]], 'spk': 1}, {'text': '一个的话我们可。', 'start': 33180, 'end': 34020, 'timestamp': [[33180, 33300], [33300, 33440], [33440, 33540], [33540, 33720], [33720, 33800], [33800, 33900], [33900, 34020]], 'spk': 1}, {'text': '以,', 'start': 34020, 'end': 34120, 'timestamp': [[34020, 34120]], 'spk': 1}, {'text': '选择在操场因为那。', 'start': 34120, 'end': 36760, 'timestamp': [[34120, 34300], [34300, 34540], [34620, 34860], [35480, 35720], [35740, 35980], [36140, 36280], [36280, 36520], [36520, 36760]], 'spk': 1}, {'text': '儿嗯学生,', 'start': 36760, 'end': 38610, 'timestamp': [[36760, 37115], [37770, 38010], [38190, 38410], [38410, 38610]], 'spk': 1}, {'text': '流动量也挺。', 'start': 38610, 'end': 39410, 'timestamp': [[38610, 38770], [38770, 38870], [38870, 39070], [39070, 39190], [39190, 39410]], 'spk': 1}, {'text': '大的操,', 'start': 39410, 'end': 40330, 
'timestamp': [[39410, 39650], [39650, 39890], [40090, 40330]], 'spk': 1}, {'text': '场的话这这段,', 'start': 40330, 'end': 42270, 'timestamp': [[40370, 40610], [40630, 40730], [40730, 40970], [41510, 41750], [41890, 42050], [42050, 42270]], 'spk': 0}, {'text': '时间太?', 'start': 42270, 'end': 42890, 'timestamp': [[42270, 42470], [42470, 42670], [42670, 42890]], 'spk': 0}, {'text': '热,', 'start': 42890, 'end': 43130, 'timestamp': [[42890, 43130]], 'spk': 0}, {'text': '了我,', 'start': 43130, 'end': 43550, 'timestamp': [[43190, 43410], [43410, 43550]], 'spk': 0}, {'text': '怕那个人,', 'start': 43550, 'end': 44750, 'timestamp': [[43550, 43790], [44290, 44450], [44450, 44590], [44590, 44750]], 'spk': 0}, {'text': '流量有点少嗯那我们还可,', 'start': 44750, 'end': 47290, 'timestamp': [[44750, 44930], [44930, 45170], [45210, 45350], [45350, 45530], [45530, 45770], [46270, 46510], [46550, 46750], [46750, 46830], [46830, 46970], [46970, 47190], [47190, 47290]], 'spk': 0}, {'text': '以有第二个选。', 'start': 47290, 'end': 47970, 'timestamp': [[47290, 47370], [47370, 47450], [47450, 47550], [47550, 47650], [47650, 47770], [47770, 47970]], 'spk': 1}]
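One way to quantify the mismatch reported above is to compare the punctuation-free 'text' with the concatenated sentence_info texts. The following is a minimal sketch, not part of FunASR: the helper names `strip_punct` and `sentence_info_coverage` are hypothetical, and it assumes each result is a dict with 'text' and 'sentence_info' keys as in the printed output, and that sentence_info should be a prefix of 'text' once punctuation is removed.

```python
import unicodedata


def strip_punct(s: str) -> str:
    """Drop whitespace and Unicode punctuation so only spoken characters remain."""
    return "".join(
        ch
        for ch in s
        if not ch.isspace() and not unicodedata.category(ch).startswith("P")
    )


def sentence_info_coverage(result: dict) -> float:
    """Fraction of result['text'] (punctuation removed) that is matched,
    as a common prefix, by the concatenated sentence_info texts."""
    full = strip_punct(result["text"])
    joined = strip_punct("".join(s["text"] for s in result["sentence_info"]))
    if not full:
        return 1.0
    matched = 0
    for a, b in zip(full, joined):
        if a != b:
            break
        matched += 1
    return matched / len(full)


# Toy input shaped like one element of `res` (hypothetical data, not the real output).
toy = {
    "text": "嗯,那么今天。我们讨论吧。",
    "sentence_info": [{"text": "嗯,"}, {"text": "那么今。"}],
}
print(f"coverage: {sentence_info_coverage(toy):.0%}")
```

With the real output, this could be called as `sentence_info_coverage(res[0])`, assuming `res` is a list of such dicts; a value well below 1.0 would confirm that sentence_info stops partway through the text.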
Code sample
Expected behavior
Environment
2024-05-09 23:53:51,045 - modelscope - WARNING - Model revision not specified, use revision: v2.0.9
ckpt: /mnt/workspace/.cache/modelscope/damo/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/model.pt
2024-05-09 23:53:53,480 - modelscope - WARNING - Model revision not specified, use revision: v2.0.4
ckpt: /mnt/workspace/.cache/modelscope/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/model.pt
2024-05-09 23:53:53,989 - modelscope - WARNING - Model revision not specified, use revision: v2.0.4
ckpt: /mnt/workspace/.cache/modelscope/damo/punc_ct-transformer_cn-en-common-vocab471067-large/model.pt
2024-05-09 23:53:57,279 - modelscope - WARNING - Model revision not specified, use revision: v2.0.2
ckpt: /mnt/workspace/.cache/modelscope/damo/speech_campplus_sv_zh-cn_16k-common/campplus_cn_common.bin
Additional context