Sharif_WavLM

for Speaker Verification

In this repository, the WavLM model is applied to speaker verification on both high-quality and poor-quality data, and the PyCM library is used for evaluation.

General Info

Datasets: In this study, 30 speakers were selected from the Farsdat Dataset. Ten speakers are held out for testing as unknown speakers, and the remaining 20 serve as known speakers (target/non-target). Each speaker has 10 audio files, and the first audio file is used as the enrollment file. Audio files should be 6 seconds long (here, ffmpeg is used to cut them).
Evaluation: The PyCM library is used for evaluation. PyCM is a multi-class confusion matrix library written in Python that accepts either input data vectors or a direct matrix, and it supports most class-level and overall statistics for post-classification model evaluation. It is aimed mainly at data scientists who need a broad array of metrics for predictive models and an accurate evaluation of a wide variety of classifiers.
System Config: The model was fine-tuned on an NVIDIA GeForce RTX 3060 (12 GB).
Link to model: https://huggingface.co/SaraSadeghi/Sharif-WavLM
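
The data preparation described above can be sketched roughly as follows. This is only an illustration, not the code from the notebooks: the function names, the speaker and file naming, the choice of which ten speakers count as unknown, and the use of sorted order to pick the "first" file are all assumptions. The ffmpeg command is built but not executed, so the sketch runs without ffmpeg installed.

```python
CLIP_SECONDS = 6   # clip length stated in the README
N_UNKNOWN = 10     # number of held-out (unknown) speakers stated in the README


def split_speakers(speaker_ids, n_unknown=N_UNKNOWN):
    """Split speakers into (known, unknown).

    Assumption: the last `n_unknown` speakers in sorted order are held out.
    The README does not say how the unknown speakers were chosen.
    """
    ordered = sorted(speaker_ids)
    if not 0 <= n_unknown <= len(ordered):
        raise ValueError("n_unknown must be between 0 and the number of speakers")
    cut = len(ordered) - n_unknown
    return ordered[:cut], ordered[cut:]


def split_enrollment(files):
    """Return (enrollment_file, test_files) for one speaker.

    Assumption: "first audio file" means first in sorted filename order.
    """
    ordered = sorted(files)
    if not ordered:
        raise ValueError("speaker has no audio files")
    return ordered[0], ordered[1:]


def ffmpeg_trim_command(src, dst, seconds=CLIP_SECONDS):
    """Build an ffmpeg argument list that keeps the first `seconds` of `src`.

    -y overwrites `dst` if it exists; -t limits the output duration.
    """
    return ["ffmpeg", "-y", "-i", str(src), "-t", str(seconds), str(dst)]


# Hypothetical speaker IDs, zero-padded so that sorting matches numeric order.
speakers = [f"spk{i:02d}" for i in range(1, 31)]
known, unknown = split_speakers(speakers)

# Hypothetical file names for one speaker.
enroll, trials = split_enrollment([f"utt{i:02d}.wav" for i in range(1, 11)])

cmd = ffmpeg_trim_command(enroll, "enroll_6s.wav")
# To actually cut the file: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a shell string avoids quoting problems with file names that contain spaces.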

How to Use

For high-quality (microphone) data, use WavLM_base_AGP.
For poor-quality (telephony) data, use WavLM_base_telephony.
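
To make the verification step concrete, here is a minimal sketch of how one trial might be scored and how a set of trials might be tallied. It assumes that each audio file has already been turned into a fixed-length speaker embedding by the model and that trials are scored with cosine similarity against the enrollment embedding; the README does not confirm either detail. The threshold value and all function names are hypothetical. The tally is a dependency-free stand-in for the confusion matrix that PyCM would compute, not PyCM itself.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def verify(enroll_emb, test_emb, threshold):
    """Accept the trial if the similarity reaches the (assumed) threshold."""
    return cosine_similarity(enroll_emb, test_emb) >= threshold


def tally(scored_trials, threshold):
    """Count outcomes for (score, is_same_speaker) pairs.

    TP: same speaker accepted      FN: same speaker rejected
    FP: different speaker accepted TN: different speaker rejected
    """
    counts = {"TP": 0, "FN": 0, "FP": 0, "TN": 0}
    for score, is_same in scored_trials:
        accepted = score >= threshold
        if is_same:
            counts["TP" if accepted else "FN"] += 1
        else:
            counts["FP" if accepted else "TN"] += 1
    return counts


def error_rates(counts):
    """Return (false acceptance rate, false rejection rate).

    A rate is reported as 0.0 when there are no trials of that kind.
    """
    impostors = counts["FP"] + counts["TN"]
    targets = counts["TP"] + counts["FN"]
    far = counts["FP"] / impostors if impostors else 0.0
    frr = counts["FN"] / targets if targets else 0.0
    return far, frr


# Made-up scores for illustration only; these are not results from the repo.
THRESHOLD = 0.5  # hypothetical; tune on held-out data
example_trials = [(0.9, True), (0.4, True), (0.7, False), (0.1, False)]
counts = tally(example_trials, THRESHOLD)
far, frr = error_rates(counts)
```

Raising the threshold trades false acceptances for false rejections, so the operating point should be chosen on data that is not used for the final evaluation.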

Comparison

Loading ... ⏳

Useful Links

Base Model: https://huggingface.co/microsoft/wavlm-base-plus-sd
Base Paper: https://arxiv.org/abs/2110.13900
PyCM

Thanks to

Thanks to Sadra Sabouri for his collaboration 🤝🤝

and also thanks to PyCM🔥🔥

⭐ Give us a star if you found this repo useful.

🙋‍♀️ Open an issue if you have any comments.

🥰 Feel free to open a pull request adding your feature. We'll be more than happy to accept it.
