dynamically set data name for auxiliary asr tasks #5697
What?
Auxiliary ASR data tags caused an error because they all received the data name "text", which is already used for the regular ASR output. After this change, the data name is taken from the argument.
Before this change, running the fleurs recipe failed with a duplicated data name error. After the change, I can successfully run it without making any adaptations to the recipe.
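To illustrate the idea, here is a minimal sketch of deriving the data name from each auxiliary tag instead of hardcoding "text". It is not the actual diff: the function name `build_aux_opts`, the directory `dump/raw/train`, and the tag `lid` are hypothetical and chosen only for illustration. The only thing taken from ESPnet2 is the `--train_data_path_and_name_and_type` option with its `path,name,type` triple.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, NOT the real contents of asr.sh.
# build_aux_opts DATA_DIR TAG... prints one path,name,type option per tag.
build_aux_opts() {
    local data_dir="$1"
    shift
    local opts=""
    local tag
    for tag in "$@"; do
        # Use the tag itself as the data name (second field) rather than
        # the fixed string "text", so every auxiliary input has a unique name.
        # The third field stays "text" because it is the data *type*.
        opts+="--train_data_path_and_name_and_type ${data_dir}/${tag},${tag},text "
    done
    # Strip the single trailing space before printing.
    printf '%s\n' "${opts% }"
}

# Example call with an assumed tag name:
build_aux_opts dump/raw/train lid
```

With the name derived from the tag, the auxiliary input no longer collides with the "text" entry used for the regular ASR transcript.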
Why?
The `asr.sh` script of the `asr1` task accepts an `--auxiliary_data_tags` argument that defines auxiliary text data inputs. Specifically, the fleurs example uses it for an auxiliary language identification task. Currently this argument is broken because the data name is hardcoded to "text" instead of the intended data name. The "text" data name is already used for the ASR output text, so the script complains about the duplicated data name.
See also
The broken feature was introduced over a year ago in #4756. It is not clear to me whether the issue was caused by untested code or by a later internal ESPnet change.