Describe the bug
The problem appears in speechbrain.inference.classifiers: when classify_batch is run, the attribute lookup on self.mods fails with "AttributeError: 'ModuleDict' object has no attribute 'compute_features'".
Passing `gradient_checkpointing` to a config initialization is deprecated and will be removed in v5 Transformers. Using `model.gradient_checkpointing_enable()` instead, or if you are using the `Trainer` API, pass `gradient_checkpointing=True` in your `TrainingArguments`.
warnings.warn(
speechbrain.lobes.models.huggingface_transformers.huggingface - Wav2Vec2Model is frozen.
Traceback (most recent call last):
  File "C:\Users\JGanson\Desktop\DeepLearning\ASR\code\torch\wav2vec_speechbrain.py", line 18, in <module>
    prediction = model.classify_batch(signal)
  File "C:\Users\JGanson\anaconda3\envs\nlp\lib\site-packages\speechbrain\inference\classifiers.py", line 249, in classify_batch
    X_stft = self.mods.compute_stft(wavs)
  File "C:\Users\JGanson\anaconda3\envs\nlp\lib\site-packages\torch\nn\modules\module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'ModuleDict' object has no attribute 'compute_stft'
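For context, the error itself is plain torch.nn attribute lookup: self.mods is an nn.ModuleDict, and any name that was never registered as a submodule raises exactly this AttributeError. A minimal sketch (the key names here are illustrative):

```python
import torch.nn as nn

# self.mods in SpeechBrain inference classes is an nn.ModuleDict; looking
# up a name that was never registered as a submodule raises AttributeError.
mods = nn.ModuleDict({"wav2vec2": nn.Identity()})

try:
    mods.compute_stft  # the loaded hparams never defined this module
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```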
Expected behaviour
My code simply loads the pretrained model and runs it, so it should not hit this error.
To Reproduce
from speechbrain.inference.classifiers import AudioClassifier
import numpy as np
import torch

# Load the pre-trained model
model = AudioClassifier.from_hparams(source="speechbrain/emotion-recognition-wav2vec2-IEMOCAP", savedir="tmpdir")
data = np.load("data/IEMOCAP_audio.npy", allow_pickle=True).item()
model.hparams.label_encoder.ignore_len()
audio = data['audio']
input = audio[0]
signal = torch.tensor(input)
signal = signal.unsqueeze(0)  # Add a batch dimension
prediction = model.classify_batch(signal)
Hi, the reason for this error is that you are not using the correct inference class for speechbrain/emotion-recognition-wav2vec2-IEMOCAP. This model, as you can see from the code snippet here, uses a specific feature of SpeechBrain inference: custom interfaces. When a model requires a custom inference that cannot be generalised into a reusable class, we allow users or the maintainers to add a custom inference.py. You can find this interface if you navigate into the files of the HuggingFace repo.
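As a sketch of what that looks like in practice, the model can be loaded through foreign_class, which fetches the repo's own inference code instead of forcing the checkpoint into a generic class like AudioClassifier. The file name custom_interface.py and class name CustomEncoderWav2vec2Classifier below are taken from the model card; check the HuggingFace repo files if they have changed:

```python
def load_emotion_classifier(savedir="pretrained_models/emotion-recognition-wav2vec2-IEMOCAP"):
    # foreign_class downloads the repo's pymodule_file and instantiates
    # the inference class it defines, so the custom forward pass
    # (wav2vec2 features + pooling + classifier head) is used as intended.
    from speechbrain.inference.interfaces import foreign_class

    return foreign_class(
        source="speechbrain/emotion-recognition-wav2vec2-IEMOCAP",
        pymodule_file="custom_interface.py",
        classname="CustomEncoderWav2vec2Classifier",
        savedir=savedir,
    )
```

Once loaded, the returned object exposes the custom class's methods directly, e.g. out_prob, score, index, text_lab = classifier.classify_batch(signal).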
Note to developers:
This model was done by @aheba. I don't know if he is still around and would like to, instead, integrate this as a standard interface in our lobes. I think it should be there, to be honest: wav2vec2 or SSL-based classifiers should have their own inference class in lobes. @Adel-Moumen, adding this to your infinite to-do list?
Environment Details
torch 2.0.1+cu117 pypi_0 pypi
torchaudio 2.0.2+cu117 pypi_0 pypi
torchvision 0.15.2+cu117 pypi_0 pypi
speechbrain 1.0.0 pypi_0 pypi