Releases: SYSTRAN/faster-whisper
faster-whisper 1.0.2
- Add support for distil-large-v3 (#755). The latest Distil-Whisper model, distil-large-v3, is intrinsically designed to work with the OpenAI sequential algorithm; a usage sketch follows this list.
- Benchmarks (#773). Adds functionality to benchmark faster-whisper for memory usage, Word Error Rate (WER), and speed.
- Support initializing more Whisper model arguments (#807)
- Small bug fixes
- New features from the original OpenAI Whisper project
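For illustration, a minimal sketch of loading distil-large-v3 through the standard `WhisperModel` API (the device, compute type, and audio path are assumptions, not part of the release notes):

```python
from faster_whisper import WhisperModel

# Load the distilled model; device and compute_type are the usual
# WhisperModel options, shown here with illustrative values.
model = WhisperModel("distil-large-v3", device="cuda", compute_type="float16")

# Distil-Whisper checkpoints are English-only, so the language is pinned.
segments, info = model.transcribe("audio.mp3", language="en")
for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
```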
faster-whisper 1.0.1
faster-whisper 1.0.0
- Support the distil-whisper model (#557). Robust knowledge distillation of the Whisper model via large-scale pseudo-labelling. For more detail: https://github.com/huggingface/distil-whisper
- Upgrade the CTranslate2 requirement to version 4.0 to support CUDA 12 (#694); a quick version check is sketched after this list
- Upgrade the PyAV requirement to version 11.* to support Python 3.12.x (#679)
- Small bug fixes
- New improvements from the original OpenAI Whisper project
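As a quick sanity check (a sketch, not part of the release notes), the installed CTranslate2 version and the CUDA devices it can see are easy to verify:

```python
import ctranslate2

# CTranslate2 4.0 is the line that adds CUDA 12 support.
print(ctranslate2.__version__)

# Number of CUDA devices visible to CTranslate2 (0 on CPU-only setups).
print(ctranslate2.get_cuda_device_count())
```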
faster-whisper 0.10.1
Fix the broken tag v0.10.0
faster-whisper 0.10.0
- Support "large-v3" model with
- The ability to load
feature_size/num_mels
and other frompreprocessor_config.json
- A new language token for Cantonese (
yue
)
- The ability to load
- Update
CTranslate2
requirement to include the latest version 3.22.0 - Update
tokenizers
requirement to include the latest version 0.15 - Change the hub to fetch models from Systran organization
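A brief sketch combining both large-v3 additions (the audio path is illustrative):

```python
from faster_whisper import WhisperModel

# "large-v3" is now fetched from the Systran hub organization, and its
# feature_size/num_mels come from the model's preprocessor_config.json.
model = WhisperModel("large-v3")

# "yue" is the new language token for Cantonese.
segments, info = model.transcribe("audio.wav", language="yue")
```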
faster-whisper 0.9.0
- Add function `faster_whisper.available_models()` to list the available model sizes (see the sketch after this list)
- Add model property `supported_languages` to list the languages accepted by the model
- Improve error message for invalid `task` and `language` parameters
- Update the `tokenizers` requirement to include the latest version 0.14
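A short sketch of the two new introspection helpers (the chosen model size is illustrative):

```python
import faster_whisper
from faster_whisper import WhisperModel

# Model sizes accepted by the WhisperModel constructor.
print(faster_whisper.available_models())

# English-only models such as "small.en" report a single supported language.
model = WhisperModel("small.en")
print(model.supported_languages)
```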
faster-whisper 0.8.0
Expose new transcription options
Some generation parameters that were available in the CTranslate2 API but not exposed in faster-whisper:

- `repetition_penalty` to penalize the score of previously generated tokens (set > 1 to penalize)
- `no_repeat_ngram_size` to prevent repetitions of ngrams with this size

Some values that were previously hardcoded in the transcription method:

- `prompt_reset_on_temperature` to configure after which temperature fallback step the prompt with the previous text should be reset (default value is 0.5)

A combined usage sketch follows.
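For illustration, the new options passed to `transcribe` (the values are illustrative, not recommendations):

```python
from faster_whisper import WhisperModel

model = WhisperModel("base")

segments, info = model.transcribe(
    "audio.mp3",
    repetition_penalty=1.2,           # > 1 penalizes previously generated tokens
    no_repeat_ngram_size=3,           # block repeated 3-grams
    prompt_reset_on_temperature=0.5,  # reset the previous-text prompt above this fallback temperature
)
```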
Other changes
- Fix a possible memory leak when decoding audio with PyAV by forcing the garbage collector to run
- Add property `duration_after_vad` in the returned `TranscriptionInfo` object
- Add a "large" alias for the "large-v2" model
- Log a warning when the model is English-only but the `language` parameter is set to something else
faster-whisper 0.7.1
- Fix a bug related to `no_speech_threshold`: when the threshold was met for a segment, the next 30-second window reused the same encoder output and was also considered as non-speech
- Improve selection of the final result when all temperature fallbacks failed by returning the result with the best log probability
faster-whisper 0.7.0
Improve word-level timestamp heuristics
Some recent improvements from openai-whisper are ported to faster-whisper:
- Squash long words at window and sentence boundaries (openai/whisper@255887f)
- Improve timestamp heuristics (openai/whisper@f572f21)
Support download of user-converted models from the Hugging Face Hub
The `WhisperModel` constructor now accepts any repository ID as argument, for example:
`model = WhisperModel("username/whisper-large-v2-ct2")`
The utility function `download_model` has been updated similarly; a sketch follows.
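For illustration, a minimal sketch of `download_model` with a converted repository (the repository ID and output directory are placeholders):

```python
from faster_whisper import download_model

# Downloads the CTranslate2 conversion from the Hub (or reuses the cached
# copy) and returns the local directory containing the model files.
model_dir = download_model("username/whisper-large-v2-ct2", output_dir="whisper-large-v2-ct2")
print(model_dir)
```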
Other changes
- Accept an iterable of token IDs for the argument `initial_prompt` (useful to include timestamp tokens in the prompt; see the sketch after this list)
- Avoid computing higher temperatures when `no_speech_threshold` is met (same as openai/whisper@e334ff1)
- Fix truncated output when using a prefix without disabling timestamps
- Update the minimum required CTranslate2 version to 3.17.0 to include the latest fixes
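A sketch of the two accepted `initial_prompt` forms (the prompt string and token IDs below are placeholders, not real vocabulary entries):

```python
from faster_whisper import WhisperModel

model = WhisperModel("base")

# initial_prompt as a plain string, as before.
segments, _ = model.transcribe("audio.mp3", initial_prompt="Glossary: CTranslate2, PyAV.")

# initial_prompt as an iterable of token IDs, which makes it possible to
# include timestamp tokens in the prompt.
segments, _ = model.transcribe("audio.mp3", initial_prompt=[50364, 314, 50464])
```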
faster-whisper 0.6.0
Extend `TranscriptionInfo` with additional properties:

- `all_language_probs`: the probability of each language (only set when `language=None`)
- `vad_options`: the VAD options that were used for this transcription

A sketch reading these properties follows.
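For illustration (the audio path is illustrative; `all_language_probs` is only populated when language detection runs, i.e. `language=None`):

```python
from faster_whisper import WhisperModel

model = WhisperModel("base")

# Leaving language unset triggers language detection, which populates
# all_language_probs with a probability for every language.
segments, info = model.transcribe("audio.mp3")
print(info.language, info.language_probability)
print(info.all_language_probs)
print(info.vad_options)  # None here, since VAD filtering was not enabled
```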
Improve robustness on temporary connection issues to the Hugging Face Hub
When the model is loaded from its name like `WhisperModel("large-v2")`, a request is made to the Hugging Face Hub to check whether some files should be downloaded.
This request can raise an exception: the Hugging Face Hub is down, the internet connection is temporarily lost, etc. These exceptions are now caught, and the library falls back to loading the model directly from the local cache if it exists.
Other changes
- Enable the `onnxruntime` dependency for Python 3.11, as the latest version now provides binary wheels for Python 3.11
- Fix an occasional `IndexError` on empty segments when using `word_timestamps=True` (see the sketch after this list)
- Export `__version__` at the module level
- Include missing requirement files in the released source distribution
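A brief sketch of word-level timestamps, the code path covered by the `IndexError` fix (the audio path is illustrative):

```python
from faster_whisper import WhisperModel

model = WhisperModel("base")

# word_timestamps=True attaches per-word timing to every segment.
segments, _ = model.transcribe("audio.mp3", word_timestamps=True)
for segment in segments:
    for word in segment.words:
        print("[%.2fs -> %.2fs] %s" % (word.start, word.end, word.word))
```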