EchoSight is a cross-platform tool that helps visually impaired individuals by audibly describing images captured with a Raspberry Pi Camera or supplied as a file path or URL.
Updated Jan 3, 2024 - Python
Generate cursed videos with AI
Speakeasy GPT is a Jupyter notebook that utilizes several natural language processing utilities to provide a seamless and low-latency speech interface to ChatGPT and other large language models.
Open Translator: a speech-to-speech and speech-to-text translator with voice cloning and other features
Synthesize speech using state-of-the-art open and closed-source tools
Voice cloning using Coqui TTS
LLM server using Outlines for JSON/regex/CFG-formatted generation
Rust bindings to the https://github.com/coqui-ai TTS library
A framework for AI WhatsApp calls using Whisper, Coqui TTS, GPT-3.5 Turbo, Virtual Audio Cable, and the WhatsApp Desktop App.
Genie in the Box: Distill Whisper STT => Mistral-7B => Phind/Phind-CodeLlama-34B-v2 => GPT 3.5 => Coqui's TTS/OpenAI TTS
Text to Speech using Coqui TTS + RVC
Persian/Farsi text-to-speech (TTS) training using Coqui TTS
A User Interface for XTTS-2 Text-Based Voice Cloning using only 10 seconds of speech