
Releases: wenet-e2e/wenet

v3.1.0

23 May 03:25
2d8bb97

❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤

What's Changed

New modules and methods (from LLM community) by @Mddct & @fclearner 🤩🤩🤩

  • [transformer] support multi-query and grouped-query attention by @Mddct in #2403
    • (see the grouped-query attention sketch after this list)
  • [transformer] add rope for transformer/conformer by @Mddct in #2458
  • LoRA support by @fclearner in #2049
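
A minimal sketch of the grouped-query attention mentioned above (an illustration of the idea only, not WeNet's actual attention module): several query heads share each key/value head, and the key/value tensors are repeated to match the query head count (n_kv = 1 reduces to multi-query attention).

```python
import torch
import torch.nn.functional as F

# Grouped-query attention sketch: n_q query heads share n_kv key/value heads.
# Shapes and sizes below are illustrative only.
batch, seq, head_dim = 2, 16, 64
n_q, n_kv = 8, 2                           # 4 query heads per key/value head
q = torch.randn(batch, n_q, seq, head_dim)
k = torch.randn(batch, n_kv, seq, head_dim)
v = torch.randn(batch, n_kv, seq, head_dim)

# Repeat each key/value head so it covers its group of query heads.
k = k.repeat_interleave(n_q // n_kv, dim=1)
v = v.repeat_interleave(n_q // n_kv, dim=1)

out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([2, 8, 16, 64])
```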

New Contributors

Full Changelog: v3.0.1...v3.1.0

❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤

WeNet 3.0.1

09 Mar 06:50
a93af33

❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤

What's Changed

  • Fix loss returned by CTC model in RNNT by @kobenaxie in #2327
  • [dataset] new io for code reuse for many speech tasks by @Mddct in #2316
    • (!! breaking changes, please update to torch2.x torchaudio2.x !!) 🚀🚀🚀
  • Fix eot by @Qiaochu-Song in #2330
  • [decode] support length penalty by @xingchensong in #2331
  • [bin] limit step when averaging model by @xingchensong in #2332
  • fix 'th_accuracy' not in transducer by @DaobinZhu in #2337
  • [dataset] support bucket by seq length by @Mddct in #2333
  • [examples] remove useless yaml by @xingchensong in #2343
  • [whisper] support arbitrary language and task by @xingchensong in #2342
    • (!! breaking changes, happy whisper happy life !!) 💯💯💯
  • Minor fix decode_wav by @kobenaxie in #2340
  • fix comment by @Mddct in #2344
  • [w2vbert] support w2vbert fbank by @Mddct in #2346
  • [dataset] fix typo by @Mddct in #2347
  • [wenet] fix args.enc by @Mddct in #2354
  • [examples] Initial whisper results on wenetspeech by @xingchensong in #2356
  • [examples] fix --penalty by @xingchensong in #2358
  • [paraformer] add decoding args by @xingchensong in #2359
  • [transformer] support flash attention via torch scaled_dot_product_attention by @Mddct in #2351 (see the SDPA sketch after this list)
    • (!! breaking changes, please update to torch2.x torchaudio2.x !!) 🚀🚀🚀
  • [conformer] support flash attention via torch sdpa by @Mddct in #2360
    • (!! breaking changes, please update to torch2.x torchaudio2.x !!) 🚀🚀🚀
  • [conformer] sdpa default to false by @Mddct in #2362
  • [transformer] fix bidecoder sdpa by @Mddct in #2368
  • [runtime] Configurable blank token idx by @zhr1201 in #2366
  • [wenet] make runtime/core/decoder faster by @Sang-Hoon-Pakr in #2367
    • (!! Significant improvement on warmup when using libtorch !!) 🚀🚀🚀
  • [lint] fix lint by @cdliang11 in #2373
  • [examples] better results on wenetspeech using revised transcripts by @xingchensong in #2371
    • (!! Significant improvement on results of whisper !!) 💯💯💯
  • [dataset] support pad or trim for whisper decoding by @Mddct in #2378
  • [bin/recognize.py] support num_workers and compute dtype by @Mddct in #2379
    • (!! Significant improvement on inference speed when using fp16 !!) 🚀🚀🚀
  • [whisper] fix decoding maxlen by @Mddct in #2380
  • fix whisper ckpt modify error by @fclearner in #2381
  • Update recognize.py by @Mddct in #2383
  • [transformer] add cross attention by @Mddct in #2388
    • (!! Significant improvement on inference speed of attention_beam_search !!) 🚀🚀🚀
  • [paraformer] fix some bugs by @Mddct in #2389
  • new modules and methods by @Mddct 🤩🤩🤩
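
As referenced in the SDPA items above, here is a minimal sketch of routing attention through torch.nn.functional.scaled_dot_product_attention (available in torch 2.x), which dispatches to flash or memory-efficient kernels when possible. It illustrates the PyTorch API only, not WeNet's actual attention classes.

```python
import torch
import torch.nn.functional as F

batch, heads, seq_len, head_dim = 2, 8, 100, 64
q = torch.randn(batch, heads, seq_len, head_dim)
k = torch.randn(batch, heads, seq_len, head_dim)
v = torch.randn(batch, heads, seq_len, head_dim)

# Boolean mask, True = "may attend"; e.g. a padding or chunk mask.
attn_mask = torch.ones(batch, 1, seq_len, seq_len, dtype=torch.bool)

# Fused scaled dot-product attention; falls back to a math kernel
# when flash / memory-efficient backends are unavailable.
out = F.scaled_dot_product_attention(q, k, v, attn_mask=attn_mask)
print(out.shape)  # torch.Size([2, 8, 100, 64])
```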

New Contributors

❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤

Full Changelog: v3.0.0...v3.0.1

WeNet 3.0.0

25 Jan 11:59
baaa27a

❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤

New Features


  • Upgrade libtorch CPU runtime with IPEX version #1893
  • Refine ctc alignment #1966
  • Use torchrun for distributed training #2020, #2021
  • Refine training code #2055, #2103, #2123, #2248, #2252, #2253, #2270, #2286, #2288, #2312 (!! big changes !!) 🚀
  • move all ctc functions to ctc_utils.py #2057 (!! big changes !!) 🚀
  • move search methods to search.py #2056 (!! big changes !!) 🚀
  • move all k2 related functions to k2 #2058
  • refactor and simplify decoding methods #2061, #2062
  • unify decode results of all decoding methods #2063
  • refactor(dataset): return dict instead of tuple #2106, #2111
  • init_model API changed #2116, #2216 (!! big changes !!) 🚀
  • move yaml saving to save_model() #2156
  • refine tokenizer #2165, #2186 (!! big changes !!) 🚀
  • deprecate wenetruntime #2194 (!! big changes !!) 🚀
  • use pre-commit to auto check and lint #2195
  • refactor(yaml): Config ctc/cmvn/tokenizer in train.yaml #2205, #2229, #2230, #2227, #2232 (!! big changes !!) 🚀
  • train with dict input #2242, #2243 (!! big changes !!) 🚀
  • [dataset] keep pcm for other task #2268
  • Upgrade torch to 2.x #2301 (!! big changes !!) 🚀
  • log everything to tensorboard #2307 (see the logging sketch after this list)
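
A minimal sketch of TensorBoard logging with torch.utils.tensorboard, as referenced above; the log directory, tag names, and values are illustrative placeholders, not WeNet's exact ones.

```python
from torch.utils.tensorboard import SummaryWriter

# Write scalar curves that TensorBoard can plot per training step.
writer = SummaryWriter(log_dir="exp/tensorboard")   # hypothetical directory
for step in range(100):
    loss = 1.0 / (step + 1)                         # placeholder value
    writer.add_scalar("train/loss", loss, step)
    writer.add_scalar("train/lr", 1e-3, step)
writer.close()
```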

New Bug Fixes

  • Fix NST recipe #1863
  • Fix Librispeech fst dict #1929
  • Fix bug when make shard.list for *.flac #1933
  • Fix bug of transducer #1940
  • Avoid problems during model averaging when there is parameter tying #2113
  • [loss] set zero_infinity=True to ignore NaN or inf ctc_loss #2299 (see the sketch after this list)
  • fix android #2303
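
A minimal sketch of the zero_infinity behaviour referenced in #2299, using the standard torch.nn.CTCLoss API rather than WeNet's wrapper: CTC loss is infinite when a target is longer than the input frames allow, and zero_infinity=True zeroes that loss (and its gradient) instead of letting NaN/inf poison training.

```python
import torch

ctc = torch.nn.CTCLoss(blank=0, reduction="mean", zero_infinity=True)

T, N, C = 20, 4, 32                               # frames, batch, classes
log_probs = torch.randn(T, N, C).log_softmax(-1)  # (T, N, C)
targets = torch.randint(1, C, (N, 30))            # (N, S)
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.tensor([10, 30, 15, 5])    # 30 > T: impossible alignment

loss = ctc(log_probs, targets, input_lengths, target_lengths)
print(loss)  # finite; the impossible sample contributes zero instead of inf
```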

❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤
Many thanks to all the contributors !!!!! I love u all.

WeNet 2.2.1

26 May 05:08
ac9a261

What's Changed

WeNet 2.2.0

15 Jan 09:55
4870a53

What's Changed

WeNet 2.1.0

25 Nov 13:06
73ac0e2

What's Changed

WeNet Python Binding Models

21 Jun 10:17
bda6c86
Pre-release

This release is for hosting the wenet python binding models.

WeNet 2.0.0

14 Jun 01:38
d2c41cb

The following features are stable.

  • U2++ framework for better accuracy
  • n-gram + WFST language model solution
  • Context biasing(hotword) solution
  • Very big data training support with UIO
  • More dataset support, including WenetSpeech, GigaSpeech, HKUST and so on.

WeNet 1.0.0

21 Jun 07:27
7f00996

Model

  • Propose and support U2++, which, as the figure below shows, uses both forward and backward information in training and decoding.

[figure: U2++ model architecture]

  • Support dynamic left chunk training and decoding, so the number of history chunks used at decoding time can be limited to save memory and computation (see the chunk-mask sketch after this list).
  • Support distributed training.
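
An illustrative sketch of a chunk-based attention mask with a limited number of left (history) chunks, as mentioned above; this is a generic illustration of the idea, not WeNet's exact mask utility.

```python
import torch

def chunk_attention_mask(size: int, chunk_size: int, num_left_chunks: int) -> torch.Tensor:
    """Frame i may attend to frames in its own chunk and in at most
    `num_left_chunks` preceding chunks (True = attention allowed)."""
    mask = torch.zeros(size, size, dtype=torch.bool)
    for i in range(size):
        cur_chunk = i // chunk_size
        start = max(0, (cur_chunk - num_left_chunks) * chunk_size)
        end = min(size, (cur_chunk + 1) * chunk_size)
        mask[i, start:end] = True
    return mask

# 8 frames, chunks of 2, at most 1 history chunk visible.
print(chunk_attention_mask(size=8, chunk_size=2, num_left_chunks=1))
```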

Dataset

We now support the following five standard speech datasets, with SOTA or near-SOTA results.

Dataset      Language  Data (h)  Test set    CER/WER  SOTA
aishell-1    Chinese   200       test        4.36     4.36 (WeNet)
aishell-2    Chinese   1000      test_ios    5.39     5.39 (WeNet)
multi-cn     Chinese   2385      /           /        /
librispeech  English   1000      test_clean  2.66     2.10 (ESPnet)
gigaspeech   English   10000     test        11.0     10.80 (ESPnet)

Productivity

Here are some features related to productivity.

  • LM support. Here is the system design for LM support; WeNet can work with or without an LM depending on your application or scenario.

[figure: LM system design]

  • timestamp support.
  • n-best support.
  • endpoint support.
  • gRPC support.
  • Further refined the x86 server and on-device Android recipes.

WeNet 0.1.0

04 Feb 08:15
a332081

Major Features

  • Joint CTC/AED model structure (see the joint loss sketch after this list)
  • U2, dynamic chunk training support
  • Torchaudio support
  • Runtime x86 and Android support
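
A minimal sketch of the joint CTC/attention (AED) training objective commonly used by such models, as mentioned above: the final loss interpolates the CTC loss and the attention-decoder loss. The ctc_weight name here is illustrative, not necessarily WeNet's exact config key.

```python
import torch

def joint_loss(loss_ctc: torch.Tensor, loss_att: torch.Tensor,
               ctc_weight: float = 0.3) -> torch.Tensor:
    """loss = ctc_weight * loss_ctc + (1 - ctc_weight) * loss_att"""
    return ctc_weight * loss_ctc + (1.0 - ctc_weight) * loss_att

print(joint_loss(torch.tensor(2.0), torch.tensor(1.0)))  # tensor(1.3000)
```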