Speech-To-Text Program with Whisper API and PyAudio

Notice

This document describes the contents of /old/. The project is primarily for personal use, so changes will be made at will to suit my own preferences. Nonetheless, it works, and you can use it as you wish, provided you give credit if it ends up elsewhere.

Overview

This is a Python program that uses modules such as pyautogui, speech_recognition, winsound, openai, and clipboard to capture speech from the microphone, transcribe it to text with OpenAI's Whisper ASR API, replace the clipboard content with the transcribed text, and paste it wherever needed.
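The flow described above (capture, transcribe, copy, paste) can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function name dictate and its four callable parameters are hypothetical, and each step is passed in so the sketch runs without a microphone or network access.

```python
from typing import Callable


def dictate(
    record: Callable[[], bytes],
    transcribe: Callable[[bytes], str],
    copy: Callable[[str], None],
    paste: Callable[[], None],
) -> str:
    """Run one capture -> transcribe -> clipboard -> paste cycle.

    All four callables are placeholders (assumptions) for the real steps.
    """
    audio = record()          # e.g. audio bytes captured from the microphone
    text = transcribe(audio)  # e.g. a request to the Whisper API
    copy(text)                # replace the clipboard content with the text
    paste()                   # e.g. send Ctrl+V to the focused window
    return text


def default_paste() -> None:
    """Paste with pyautogui, imported lazily so the sketch loads without it."""
    import pyautogui  # third-party package named in the overview

    pyautogui.hotkey("ctrl", "v")
```

In a real run, copy could be clipboard.copy and paste could be default_paste; the recording and transcription callables depend on how the script wires up speech_recognition and openai, which is not shown here.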

Pre-requisites

Make sure the Python packages listed in requirements.txt are installed. If they are not, install them with pip:

pip install -r requirements.txt

Optionally, you can use AutoHotkey (https://www.autohotkey.com/) with the provided AutoHotkey script.

Special Instructions

The script uses the function key F22 as a hotkey to mute Discord, so you'll need to bind the Discord mute to F22.
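One way to picture the F22 binding is a small wrapper that presses the key before and after a recording. This is a sketch under assumptions: the names press_f22 and discord_muted are hypothetical, and pressing the key both before and after recording is a guess at the behaviour, not something this document confirms.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


def press_f22() -> None:
    """Send F22 with pyautogui, imported lazily so the sketch loads without it."""
    import pyautogui  # third-party package named in the overview

    pyautogui.press("f22")


@contextmanager
def discord_muted(press: Callable[[], None] = press_f22) -> Iterator[None]:
    """Toggle the Discord mute hotkey around a block of work (assumed behaviour)."""
    press()      # assumed: mute Discord before recording starts
    try:
        yield
    finally:
        press()  # assumed: unmute again, even if recording raised an error
```

Usage would look like `with discord_muted(): ...` around the recording step. The try/finally ensures the second key press happens even when recording fails, so Discord is not left muted.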

Usage

Place your Whisper API key in a file named openaikey.txt in the same directory as the script. Then either trigger it through the provided AutoHotkey script with Win + H or numpad divide, or run the script directly:

pythonw speech.py
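For reference, reading the key from openaikey.txt could look like the sketch below. The helper name load_api_key is hypothetical, and stripping whitespace and rejecting an empty file are assumptions rather than documented behaviour of the script.

```python
from pathlib import Path


def load_api_key(directory: Path, filename: str = "openaikey.txt") -> str:
    """Read the API key from a text file in the given directory."""
    # Strip the trailing newline that most editors add to the file.
    key = (directory / filename).read_text(encoding="utf-8").strip()
    if not key:
        raise ValueError(f"{filename} is empty")
    return key
```

From inside the script, `load_api_key(Path(__file__).resolve().parent)` would look for the file next to the script regardless of the current working directory.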

Note

Any captured text is post-processed to correct punctuation and to replace typed-out slang with text-based alternatives, for a more natural reading experience.
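A post-processing step of this kind might be sketched as below. The replacement table is purely illustrative (the script's real table is not shown in this document), and the reading of "typed-out slang" as spelled-out phrases mapped to abbreviations is an assumption.

```python
import re

# Hypothetical mapping, for illustration only.
REPLACEMENTS = {
    "laugh out loud": "lol",
    "be right back": "brb",
}


def clean_transcript(text: str) -> str:
    """Tidy whitespace and punctuation, then apply the replacement table."""
    # Collapse runs of whitespace into single spaces.
    text = " ".join(text.split())
    # Replace each phrase on word boundaries, ignoring case.
    for phrase, short in REPLACEMENTS.items():
        text = re.sub(rf"\b{re.escape(phrase)}\b", short, text, flags=re.IGNORECASE)
    # Remove stray spaces before punctuation marks.
    text = re.sub(r"\s+([,.!?])", r"\1", text)
    return text
```

For example, `clean_transcript("laugh out loud  that was great .")` returns `"lol that was great."`.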

Error Handling

If an exception occurs while recognizing speech, the script types "error".
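The fallback described above can be sketched as a small wrapper. The function name recognize_or_report and its two callable parameters are hypothetical; the real script may structure this differently.

```python
from typing import Callable


def recognize_or_report(
    recognize: Callable[[], str],
    type_text: Callable[[str], None],
) -> None:
    """Type the recognized text, or the word "error" if recognition fails."""
    try:
        text = recognize()
    except Exception:
        # Any failure during recognition is reduced to the literal "error".
        text = "error"
    type_text(text)
```

With pyautogui installed, type_text could be pyautogui.write, which types a string into the focused window.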

