GitHub - KoljaB/Linguflex: Command Your World with Voice

Bringing the sci-fi dream of a Jarvis-style AI companion into reality.

Linguflex 2.0

Born out of my passion for science fiction, this project aims to simulate engaging, authentic, human-like interaction with AI personalities.

It offers voice-based conversation with custom characters, alongside an array of practical features: controlling smart home devices, playing music, searching the internet, fetching emails, displaying current weather information and news, assisting in scheduling, and searching or generating images.

I invite you to explore the framework, whether you're a user seeking an innovative AI experience or a fellow developer interested in the project. All insights, suggestions, and contributions are appreciated. I want to bring this personal passion project to its full potential, hopefully with the community's help, so that together we can contribute to the evolution of AI.

📓 Linguflex 2.0 installation
🎥 Installation video guide
🎥 See in action (short clip)

Key Features

Ultra-Low Latency: Every part of Linguflex is tuned to minimize response times, both in language model communication and in text-to-speech (TTS) generation.
Local Operation: Full functionality is maintained locally, encompassing speech-to-text, TTS, and language model inference, ensuring privacy and reliability.
High-Quality Audio: By applying advanced voice cloning technology as real-time post-processing, Linguflex offers near-ElevenLabs quality in local TTS synthesis.
Enhanced Functionality: Streamlined function selection allows Linguflex to quickly adapt and respond to a wide range of text-based commands and queries.
Developer-Friendly: Building new modules is more intuitive and efficient, thanks to the minimalistic and clear coding framework.

Modules

Core Modules

Listen (Audio Input Module): Serving as Linguflex's auditory system, this module captures spoken instructions via the microphone with precision.
Brain (Cognitive Processing Module): The heart of Linguflex. It processes user input with either a local language model or the OpenAI GPT API.
Speech (Audio Output Module): Offers real-time TTS with various provider options and advanced voice tuning capabilities, including Realtime Voice Cloning (RVC).
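
To illustrate how the three core modules relate, here is a minimal sketch of a single conversation turn. It is not Linguflex's actual code: `run_turn` and the three callables are hypothetical stand-ins for the Listen, Brain, and Speech modules, which in the real project wrap speech-to-text, a language model, and a TTS engine.

```python
from typing import Callable


def run_turn(
    listen: Callable[[], str],
    think: Callable[[str], str],
    speak: Callable[[str], None],
) -> str:
    """Run one turn: capture user text, generate a reply, speak it."""
    user_text = listen()      # Listen: speech-to-text (stand-in)
    reply = think(user_text)  # Brain: language model (stand-in)
    speak(reply)              # Speech: text-to-speech (stand-in)
    return reply


# Hypothetical stand-ins so the sketch runs without audio hardware.
spoken = []
reply = run_turn(
    listen=lambda: "turn on the lights",
    think=lambda text: f"Okay: {text}",
    speak=spoken.append,
)
```

Passing the stages in as plain callables keeps the sketch independent of any particular speech or model provider.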

Current Expansion Modules

Mimic: This creative tool allows users to design custom AI characters, assign unique voices created with the Speech module, and switch between them.
Music: A voice-command module for playing selected songs or albums, enhancing the user experience with musical integration.
Mail: Retrieves emails via IMAP, integrating with your digital correspondence.
Weather: Provides current weather data and forecasts, adapting to your location.
House: Smart Home control for Tuya-compatible devices, enhancing your living experience.
Calendar: Manages personal calendars and appointments, including Google Calendar integration.
Search: Performs text and image searches using the Google Search API.
Server: Web server functionality for connecting external devices such as smartphones.
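
As an illustration of the kind of IMAP retrieval the Mail module performs, the following sketch lists the sender and subject of unread messages using Python's standard `imaplib` and `email` modules. It is an assumption about the general approach, not the module's actual implementation; the function names `summarize_message` and `fetch_unseen` are hypothetical.

```python
import email
from email import policy


def summarize_message(raw: bytes) -> dict:
    """Parse a raw RFC 822 message and keep only sender and subject."""
    msg = email.message_from_bytes(raw, policy=policy.default)
    return {"from": str(msg["From"]), "subject": str(msg["Subject"])}


def fetch_unseen(conn, mailbox: str = "INBOX") -> list:
    """Return summaries of unread messages from an imaplib-style connection."""
    # Open read-only so that fetching does not mark messages as seen.
    conn.select(mailbox, readonly=True)
    status, data = conn.search(None, "UNSEEN")
    if status != "OK" or not data or not data[0]:
        return []
    summaries = []
    for num in data[0].split():
        status, parts = conn.fetch(num, "(RFC822)")
        # A successful fetch yields (envelope, raw_bytes) as the first item.
        if status == "OK" and parts and isinstance(parts[0], tuple):
            summaries.append(summarize_message(parts[0][1]))
    return summaries
```

With a real server, `conn` would be an `imaplib.IMAP4_SSL` instance on which `login` has already been called; host and credentials are left out here on purpose.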

Modules Coming Soon

See: Gives the assistant visual capabilities using the GPT Vision API, enabling it to process webcam pictures and desktop screenshots.
Memory: Stores and retrieves JSON-translatable data.
News: Delivers compact summaries of current news.
Finance: Offers financial management by integrating various financial APIs for real-time investment tracking.
Create: Image generation using DALL-E API, turning text prompts into vivid images.
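
Since the Memory module is not released yet, the following is only a sketch of what "stores and retrieves JSON-translatable data" could look like: a small key-value store persisted to a JSON file. The class name `JsonMemory` and its methods are hypothetical and do not come from the Linguflex codebase.

```python
import json
from pathlib import Path


class JsonMemory:
    """Key-value store that persists JSON-serializable values to a file."""

    def __init__(self, path):
        self.path = Path(path)
        # Load earlier entries if the file already exists.
        if self.path.exists():
            self.data = json.loads(self.path.read_text(encoding="utf-8"))
        else:
            self.data = {}

    def store(self, key: str, value) -> None:
        # Fail early with TypeError if the value cannot be expressed as JSON.
        json.dumps(value)
        self.data[key] = value
        self.path.write_text(json.dumps(self.data, indent=2), encoding="utf-8")

    def retrieve(self, key: str, default=None):
        return self.data.get(key, default)
```

Writing the whole file on every `store` call keeps the sketch simple; a real module would likely need something more robust for larger amounts of data.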

Getting Started

Follow the Modules Guide for step-by-step instructions on how to set up and configure the Linguflex modules.

License

The codebase is under the MIT License. The TTS model weights are covered by the individual TTS engine licenses listed below:

CoquiEngine

License: Open-source only for noncommercial projects.
Commercial Use: Requires a paid plan.
Details: CoquiEngine License

ElevenlabsEngine

License: Open-source only for noncommercial projects.
Commercial Use: Available with every paid plan.
Details: ElevenlabsEngine License

AzureEngine

License: Open-source only for noncommercial projects.
Commercial Use: Available from the standard tier upwards.
Details: AzureEngine License

SystemEngine

License: Mozilla Public License 2.0 and GNU Lesser General Public License (LGPL) version 3.0.
Commercial Use: Allowed under this license.
Details: SystemEngine License

OpenAIEngine

License: Please read the OpenAI Terms of Use.
