Describe the bug
The problem appears in speechbrain.inference.classifiers: when classify_batch is run, the attribute lookup on self.mods fails with "AttributeError: 'ModuleDict' object has no attribute 'compute_features'".
Passing `gradient_checkpointing` to a config initialization is deprecated and will be removed in v5 Transformers. Using `model.gradient_checkpointing_enable()` instead, or if you are using the `Trainer` API, pass `gradient_checkpointing=True` in your `TrainingArguments`.
warnings.warn(
speechbrain.lobes.models.huggingface_transformers.huggingface - Wav2Vec2Model is frozen.
Traceback (most recent call last):
  File "C:\Users\JGanson\Desktop\DeepLearning\ASR\code\torch\wav2vec_speechbrain.py", line 18, in <module>
    prediction = model.classify_batch(signal)
  File "C:\Users\JGanson\anaconda3\envs\nlp\lib\site-packages\speechbrain\inference\classifiers.py", line 249, in classify_batch
    X_stft = self.mods.compute_stft(wavs)
  File "C:\Users\JGanson\anaconda3\envs\nlp\lib\site-packages\torch\nn\modules\module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'ModuleDict' object has no attribute 'compute_stft'
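For context, the error itself is plain torch.nn attribute lookup: self.mods is an nn.ModuleDict, and any name that was never registered as a submodule raises exactly this AttributeError. A minimal sketch (the key names here are illustrative):

```python
import torch.nn as nn

# self.mods in SpeechBrain inference classes is an nn.ModuleDict; looking
# up a name that was never registered as a submodule raises AttributeError.
mods = nn.ModuleDict({"wav2vec2": nn.Identity()})

try:
    mods.compute_stft  # the loaded hparams never defined this module
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```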
Expected behaviour
My code simply loads the pretrained model and runs it, so it should not hit this error.
To Reproduce
from speechbrain.inference.classifiers import AudioClassifier
import numpy as np
import torch

# Load the pre-trained model
model = AudioClassifier.from_hparams(source="speechbrain/emotion-recognition-wav2vec2-IEMOCAP", savedir="tmpdir")
data = np.load("data/IEMOCAP_audio.npy", allow_pickle=True).item()
model.hparams.label_encoder.ignore_len()
audio = data['audio']
input = audio[0]
signal = torch.tensor(input)
signal = signal.unsqueeze(0)  # Add a batch dimension
prediction = model.classify_batch(signal)
Hi, the reason for this error is that you are not using the correct inference class for speechbrain/emotion-recognition-wav2vec2-IEMOCAP. This model, as you can see from the code snippet here, uses a specific feature of SpeechBrain inference: custom interfaces. When a model requires a custom inference that cannot be generalised into a reusable class, we allow users or the maintainers to add a custom inference.py. You can find this interface if you navigate into the files of the HuggingFace repo.
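As a sketch of what that looks like in practice, the model can be loaded through foreign_class, which fetches the repo's own inference code instead of forcing the checkpoint into a generic class like AudioClassifier. The file name custom_interface.py and class name CustomEncoderWav2vec2Classifier below are taken from the model card; check the HuggingFace repo files if they have changed:

```python
def load_emotion_classifier(savedir="pretrained_models/emotion-recognition-wav2vec2-IEMOCAP"):
    # foreign_class downloads the repo's pymodule_file and instantiates
    # the inference class it defines, so the custom forward pass
    # (wav2vec2 features + pooling + classifier head) is used as intended.
    from speechbrain.inference.interfaces import foreign_class

    return foreign_class(
        source="speechbrain/emotion-recognition-wav2vec2-IEMOCAP",
        pymodule_file="custom_interface.py",
        classname="CustomEncoderWav2vec2Classifier",
        savedir=savedir,
    )
```

Once loaded, the returned object exposes the custom class's methods directly, e.g. out_prob, score, index, text_lab = classifier.classify_batch(signal).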
Note to developers:
This model was done by @aheba. I don't know if he is still around and would like to, instead, integrate this as a standard interface in our lobes. I think it should be there, to be honest: wav2vec2 or SSL-based classifiers should have their own inference class in lobes. @Adel-Moumen, adding this to your infinite to-do list?
Environment Details
torch 2.0.1+cu117 pypi_0 pypi
torchaudio 2.0.2+cu117 pypi_0 pypi
torchvision 0.15.2+cu117 pypi_0 pypi
speechbrain 1.0.0 pypi_0 pypi