Investigate a better speech-to-text module #23

Open
gcampax opened this issue Apr 18, 2019 · 1 comment

@gcampax
Contributor

gcampax commented Apr 18, 2019

Bing Speech offers good but not exceptional quality, and its streaming API is being shut down. That leaves only the REST API, which gives no feedback while the user speaks and is also quite slow.

We should investigate alternative software, such as Mozilla's DeepSpeech. If necessary, we can run our own servers, which is probably a good idea for privacy anyway.

@12people

OpenAI just released its high-quality speech recognition model under the MIT license: https://github.com/openai/whisper

It comes in several model sizes with different system requirements, all of which can potentially run offline. The ideal solution might be for Almond to ship with a smaller model (or offer a setting that lets users download an offline model) while also running a server with a larger model.
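
To make the idea concrete, here is a rough sketch of how the choice between a local model and a server could look. Everything Almond-specific is an assumption for illustration: the `SpeechSettings` fields, the `choose_backend` helper, and the server URL are hypothetical and do not exist in Almond. The only real API used is `whisper.load_model(...)` and `model.transcribe(...)` from the openai/whisper package, and the model names are the ones listed in its README.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Model names from the openai/whisper README, smallest to largest.
WHISPER_SIZES = ("tiny", "base", "small", "medium", "large")


@dataclass
class SpeechSettings:
    """Hypothetical user-facing settings; not part of Almond today."""
    offline_model: Optional[str] = None  # set if the user downloaded a model
    server_url: Optional[str] = None     # hypothetical self-hosted server
    online: bool = True


def choose_backend(settings: SpeechSettings) -> Tuple[str, str]:
    """Prefer the server (larger model) when reachable, else a local model."""
    if settings.online and settings.server_url:
        return ("server", settings.server_url)
    if settings.offline_model in WHISPER_SIZES:
        return ("local", settings.offline_model)
    raise RuntimeError("no speech-to-text backend available")


def transcribe_locally(audio_path: str, model_name: str) -> str:
    """Run Whisper on this machine; requires `pip install openai-whisper`."""
    # Imported lazily so the selection logic works without the package.
    import whisper

    model = whisper.load_model(model_name)
    return model.transcribe(audio_path)["text"]
```

With this shape, a device that is offline but has the `base` model downloaded would fall back to `transcribe_locally(path, "base")`, while a connected device would send audio to the server instead. How the audio reaches the server is left open here.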
