Integration of Turn-Taking Models into the NeMo Framework for Enhanced Realistic Conversations #9150

Open
rodrigoGA opened this issue May 9, 2024 · 0 comments

Since NeMo is a language-focused framework, I was wondering whether support for turn-taking models is on the roadmap, or whether there is any way to work with them.

There is a lot of literature on this, and judging by the current state of the art, I believe these models are indispensable for realistic conversations. They predict whose turn it is to speak at each point in the conversation, which allows for more natural interactions. Specifically:

  • They let the bot decide when to respond to the user. The wait is dynamic: longer while the user is pausing to think, and close to immediate once they have finished a sentence.
  • They can detect when the user intends to interrupt what the bot is saying.
  • They also make backchannels possible, i.e. short phrases such as "yeah" or "uh-huh" spoken while the user is talking, which have been reported to lead to longer conversations with users.
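
To make the three behaviours above concrete, here is a minimal sketch of the decision logic that could sit on top of a turn-taking model. It is only an illustration under stated assumptions: the `Frame` fields, the `decide` function, and the threshold values are all hypothetical, and they do not reflect the actual output format of VoiceActivityProjection or any NeMo/Riva API.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    WAIT = "wait"                # keep listening, or keep talking
    TAKE_TURN = "take_turn"      # start the bot's response
    YIELD = "yield"              # stop talking because the user is interrupting
    BACKCHANNEL = "backchannel"  # say "yeah" / "uh-huh" without taking the turn


@dataclass
class Frame:
    # Hypothetical per-frame outputs of a turn-taking model (assumed schema).
    p_bot_next: float     # probability that the bot should hold the floor next
    p_backchannel: float  # probability that a short backchannel fits here
    user_speaking: bool   # voice activity detected on the user channel
    bot_speaking: bool    # the bot is currently producing audio


def decide(frame: Frame,
           take: float = 0.7,
           yield_below: float = 0.3,
           backchannel: float = 0.6) -> Action:
    """Map one frame of model output to an action. Thresholds are illustrative."""
    if frame.bot_speaking:
        # The user talks over the bot and the model expects the user to
        # keep the floor: treat this as an interruption.
        if frame.user_speaking and frame.p_bot_next < yield_below:
            return Action.YIELD
        return Action.WAIT
    if frame.user_speaking:
        # The user holds the floor; at most acknowledge with a backchannel.
        if frame.p_backchannel > backchannel:
            return Action.BACKCHANNEL
        return Action.WAIT
    # Silence: respond only once the model is confident the turn has shifted,
    # so a thinking pause (low p_bot_next) does not trigger a reply.
    if frame.p_bot_next > take:
        return Action.TAKE_TURN
    return Action.WAIT
```

In a real system the probabilities would come from the model on every audio frame, and the thresholds would need tuning. The point is only that the response delay is driven by the prediction rather than by a fixed silence timeout.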

My questions are as follows:

  • Is this on the roadmap?
  • There are several open-source models that implement this, for example https://github.com/ErikEkstedt/VoiceActivityProjection, which is a simple PyTorch model.
    • Can this model be converted to a NeMo model?
    • We also have a commercial NVIDIA Riva license. If we convert the model to NeMo, could we then convert it to Riva? Could we deploy it on the Riva server, and would it be consumable from Riva clients?

Thank you and regards.
