Describe the bug
I am trying to use an external LLM to rescore the beam search results from a Conformer-CTC model.
When I try to get the beam search results with eval_beamsearch_ngram_ctc.py without passing an N-gram LM, I get the following error:
Traceback (most recent call last):
File "/content/NeMo/scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_ctc.py", line 415, in main
candidate_wer, candidate_cer = beam_search_eval(
File "/content/NeMo/scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_ctc.py", line 196, in beam_search_eval
_, beams_batch = decoding.ctc_decoder_predictions_tensor(
File "/usr/local/lib/python3.10/dist-packages/nemo/collections/asr/parts/submodules/ctc_decoding.py", line 319, in ctc_decoder_predictions_tensor
hypotheses_list = self.decoding(
File "/usr/local/lib/python3.10/dist-packages/nemo/collections/asr/parts/submodules/ctc_beam_decoding.py", line 166, in __call__
return self.forward(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/nemo/core/classes/common.py", line 1098, in __call__
outputs = wrapped(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/nemo/collections/asr/parts/submodules/ctc_beam_decoding.py", line 280, in forward
hypotheses = self.search_algorithm(prediction_tensor, out_len)
File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/nemo/collections/asr/parts/submodules/ctc_beam_decoding.py", line 314, in default_beam_search
raise FileNotFoundError(
FileNotFoundError: KenLM binary file not found at : None. Please set a valid path in the decoding config.
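For context, the rescoring step that the LM-free beam search is meant to feed could look roughly like the sketch below. This is only an illustration and not part of NeMo: `rescore_nbest`, `lm_score`, and `lm_weight` are hypothetical names, and the external LLM is stood in for by any callable that returns a log-probability for a candidate transcript.

```python
from typing import Callable, List, Tuple


def rescore_nbest(
    beams: List[Tuple[str, float]],
    lm_score: Callable[[str], float],
    lm_weight: float = 0.5,
) -> List[Tuple[str, float]]:
    """Hypothetical helper: re-rank (text, beam_score) candidates with an external LM.

    Each candidate's new score is its beam search score plus a weighted
    LM log-probability. Candidates are returned best-first.
    """
    rescored = [(text, score + lm_weight * lm_score(text)) for text, score in beams]
    return sorted(rescored, key=lambda item: item[1], reverse=True)


# Toy usage with a stand-in scorer (a real LLM would go here).
candidates = [("a b", -1.0), ("a c", -1.2)]
toy_lm = lambda text: 0.0 if text.endswith("c") else -1.0
best_text, best_score = rescore_nbest(candidates, toy_lm)[0]
```

This only works if the beam search itself can produce the n-best list without a KenLM model, which is the step that fails above.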
Steps/Code to reproduce bug
Install decoders.
NEMO_PATH=<insert absolute path to NeMo directory>
cd $NEMO_PATH && bash scripts/asr_language_modeling/ngram_lm/install_beamsearch_decoders.sh $NEMO_PATH
Run the beam search with the following config:
python3 $NEMO_PATH/scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_ctc.py \
nemo_model_file="<nemo CTC ASR model, e.g. stt_en_conformer_ctc_medium.nemo>" \
input_manifest="<manifest json file>" \
preds_output_folder="<output directory>" \
decoding_mode=beamsearch \
decoding_strategy="beam"
Expected behavior
I would expect the error to not be thrown as BeamSearchDecoderWithLM actually handles the case when the path to N-gram LM is not passed:
# from nemo/collections/asr/modules/beam_search_decoder.py
if lm_path is not None:
    self.scorer = Scorer(alpha, beta, model_path=lm_path, vocabulary=vocab)
else:
    self.scorer = None
When I removed the check for the KenLM file path from nemo/collections/asr/parts/submodules/ctc_beam_decoding.py, it worked:
# Check for filepath
if self.kenlm_path is None or not os.path.exists(self.kenlm_path):
    raise FileNotFoundError(
        f"KenLM binary file not found at : {self.kenlm_path}. "
        f"Please set a valid path in the decoding config."
    )
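Instead of removing the check entirely, a narrower option would be to raise only when a path was given but does not exist, and to treat a missing path as "no LM". The following is a minimal sketch of that idea, not the actual NeMo fix: `validate_kenlm_path` is a hypothetical standalone helper, and it assumes the downstream decoder accepts `None` as shown in the `BeamSearchDecoderWithLM` snippet above.

```python
import os
from typing import Optional


def validate_kenlm_path(kenlm_path: Optional[str]) -> Optional[str]:
    """Hypothetical helper: return the LM path to use, or None for LM-free beam search."""
    if kenlm_path is None:
        # No LM requested: let the decoder run without a scorer.
        return None
    if not os.path.exists(kenlm_path):
        # A path was given but is wrong: this is still a real configuration error.
        raise FileNotFoundError(
            f"KenLM binary file not found at : {kenlm_path}. "
            f"Please set a valid path in the decoding config."
        )
    return kenlm_path
```

With this behavior, a mistyped LM path would still fail loudly, while omitting the path would fall through to plain beam search.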
Environment overview
Environment location: Google Colab
Method of NeMo install: python -m pip install git+https://github.com/NVIDIA/NeMo.git@v1.23.0#egg=nemo_toolkit[all]
Environment details
OS version: Ubuntu 22.04.4 LTS
PyTorch version: 2.2.1+cu121
Python version: 3.10
Additional context
GPU: T4
Update: We found that this script needs a couple of code changes because of recent updates made during the model and transcription refactoring. @karpov-nick is working on a fix.