Issues regarding discrete WavLM and discrete HuBERT #2453

Open
anupsingh15 opened this issue Mar 6, 2024 · 4 comments
Labels
bug Something isn't working

Comments


anupsingh15 commented Mar 6, 2024

Describe the bug

Hi,
I am trying to extract tokens using the speechbrain.lobes.models.huggingface_transformers.discrete_wavlm and speechbrain.lobes.models.huggingface_transformers.discrete_hubert modules, but neither works because the model checkpoints are missing from the SpeechBrain repo on HF. Could you please let me know of any workaround to get discrete tokens using WavLM/HuBERT?

Expected behaviour

Successful load of pre-trained models

To Reproduce

No response

Environment Details

No response

Relevant Log Output

No response

Additional Context

No response

anupsingh15 added the bug label Mar 6, 2024
Chaanks (Collaborator) commented Mar 10, 2024

Hello @anupsingh15, you're right, the K-means HF repository is currently inaccessible due to an ongoing refactoring of the interface. One workaround is to train your own K-means model (see SpeechBrain's LibriSpeech quantization recipe). Alternatively, I have just uploaded a set of pre-trained K-means models to my HF account (repository) that you can use until the new interface is merged.
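
For anyone landing here before the new interface is merged, a minimal sketch of this workaround might look like the following: extract HuBERT features with transformers, then quantize them with a pre-trained scikit-learn k-means checkpoint downloaded from the Hub. The repo id, filename, and layer index below are placeholders (take them from the repository linked above), and the checkpoint is assumed to be a pickled scikit-learn k-means:

```python
# Hedged sketch of the workaround: HuBERT features -> k-means tokens.
# The repo_id, filename, and layer index are placeholders, not confirmed values.
import joblib
import numpy as np
import torch
from huggingface_hub import hf_hub_download
from transformers import AutoFeatureExtractor, HubertModel

# Placeholder ids: substitute the k-means repo/file linked above.
ckpt_path = hf_hub_download(repo_id="<kmeans-repo>", filename="<kmeans-file>")
kmeans = joblib.load(ckpt_path)  # assumes a pickled sklearn KMeans/MiniBatchKMeans

extractor = AutoFeatureExtractor.from_pretrained("facebook/hubert-base-ls960")
model = HubertModel.from_pretrained("facebook/hubert-base-ls960").eval()

wav = np.random.randn(16000).astype(np.float32)  # stand-in for 1 s of 16 kHz audio
inputs = extractor(wav, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    # hidden_states[7] = output of transformer layer 7 (example choice;
    # use the layer the k-means checkpoint was trained on).
    feats = model(**inputs, output_hidden_states=True).hidden_states[7]

tokens = kmeans.predict(feats.squeeze(0).numpy())  # one discrete token per frame
print(tokens[:10])
```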

anupsingh15 (Author) commented

Thanks @Chaanks. Do you plan to upload K-means models with 1024 cluster centroids for HuBERT and WavLM? I am training those K-means models myself, as you suggested; however, I run out of GPU memory due to limited resources.
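
In case it helps with the memory limits: k-means itself does not have to live on the GPU. One option (a stand-in technique, not necessarily what the SpeechBrain recipe does) is to extract SSL features batch by batch on the GPU, move each batch to CPU, and fit scikit-learn's MiniBatchKMeans incrementally with partial_fit, so neither the full feature matrix nor the clustering ever occupies GPU memory:

```python
# Sketch: fit a 1024-centroid k-means incrementally on CPU, so GPU memory
# is only needed for one batch of SSL features at a time. MiniBatchKMeans
# is a stand-in, not necessarily what the SpeechBrain recipe uses.
import numpy as np
from sklearn.cluster import MiniBatchKMeans

kmeans = MiniBatchKMeans(n_clusters=1024, batch_size=10_000, random_state=0)

def feature_batches():
    # Hypothetical generator: in practice, yield (n_frames, feat_dim) arrays
    # of WavLM/HuBERT layer features, extracted on GPU in small batches and
    # moved to CPU numpy before clustering.
    for _ in range(100):
        yield np.random.randn(10_000, 768).astype(np.float32)

for feats in feature_batches():
    kmeans.partial_fit(feats)  # updates centroids without storing all features

print(kmeans.cluster_centers_.shape)  # (1024, 768)
```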

Adel-Moumen (Collaborator) commented

Any news on that, @Chaanks?

poonehmousavi (Collaborator) commented

We have uploaded models with 1000/2000 clusters for different layers to our own repo. We plan to move all the trained k-means models to the SpeechBrain repo once the refactoring is done. You can find the various trained K-means models here 👍🏻: https://huggingface.co/poonehmousavi/SSL_Quantization/
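
To see which layer/cluster combinations are currently available in that repo, one can list its files with a generic huggingface_hub call (the exact filenames are whatever the repo contains):

```python
# List the k-means checkpoints available in the linked repo.
from huggingface_hub import list_repo_files

for f in list_repo_files("poonehmousavi/SSL_Quantization"):
    print(f)
```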
