KoljaB/Linguflex

Bringing the sci-fi dream of a Jarvis-style AI companion into reality.

Linguflex 2.0

Born out of my passion for science fiction, this project aims to simulate engaging, authentic, human-like interaction with AI personalities.

It offers voice-based conversation with custom characters, alongside an array of practical features: controlling smart home devices, playing music, searching the internet, fetching emails, displaying current weather information and news, assisting in scheduling, and searching or generating images.

I invite you to explore the framework, whether you're a user seeking an innovative AI experience or a fellow developer interested in the project. All insights, suggestions, and contributions are appreciated. With the community's help, I hope to bring this personal passion project to its full potential and, together, contribute to the evolution of AI.


📓 Linguflex 2.0 installation
🎥 Installation video guide
🎥 See in action (short clip)


Key Features

  • Ultra-Low Latency: Every aspect of Linguflex is tuned to minimize response times in both language model communication and text-to-speech (TTS) generation.
  • Local Operation: Full functionality is maintained locally, encompassing speech-to-text, TTS, and language model inference, ensuring privacy and reliability.
  • High-Quality Audio: Using voice cloning for real-time post-processing, Linguflex offers near-ElevenLabs quality in local TTS synthesis.
  • Enhanced Functionality: Streamlined function selection allows Linguflex to quickly adapt and respond to a wide range of text-based commands and queries.
  • Developer-Friendly: Building new modules is more intuitive and efficient, thanks to the minimalistic and clear coding framework.

Core Modules

  • Listen (Audio Input Module): Serving as Linguflex's auditory system, this module captures spoken instructions via the microphone with precision.
  • Brain (Cognitive Processing Module): The heart of Linguflex, this module processes user input with either a local language model or the OpenAI GPT API.
  • Speech (Audio Output Module): Offers real-time TTS with various provider options and advanced voice tuning capabilities, including Retrieval-based Voice Conversion (RVC).
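
To illustrate how the three core modules above hand data to each other, here is a minimal sketch of one conversation turn. It is not Linguflex's actual code: `listen_stub`, `brain_stub`, `speech_stub`, and `run_turn` are hypothetical names, and the stubs return fixed values where the real modules would use a microphone, a language model, and a TTS engine.

```python
from typing import Callable


def listen_stub() -> str:
    # Hypothetical stand-in for Listen: a real module would transcribe microphone audio.
    return "what is the weather today"


def brain_stub(user_text: str) -> str:
    # Hypothetical stand-in for Brain: a real module would query a local model or the OpenAI API.
    return f"You said: {user_text}"


def speech_stub(reply: str) -> bytes:
    # Hypothetical stand-in for Speech: a real module would synthesize audio with a TTS engine.
    return reply.encode("utf-8")


def run_turn(
    listen: Callable[[], str],
    think: Callable[[str], str],
    speak: Callable[[str], bytes],
) -> bytes:
    """Run one turn: capture text, produce a reply, render it as output."""
    user_text = listen()
    reply = think(user_text)
    return speak(reply)


if __name__ == "__main__":
    print(run_turn(listen_stub, brain_stub, speech_stub))
```

Passing the three stages in as callables keeps each one swappable, which mirrors the choice between local and cloud providers described above.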

Current Expansion Modules

  • Mimic: This creative tool allows users to design custom AI characters, assign unique voices created with the Speech module, and switch between them.
  • Music: A voice-command module for playing selected songs or albums, enhancing the user experience with musical integration.
  • Mail: Retrieves emails via IMAP, integrating with your digital correspondence.
  • Weather: Provides current weather data and forecasts, adapting to your location.
  • House: Smart Home control for Tuya-compatible devices, enhancing your living experience.
  • Calendar: Manages personal calendars and appointments, including Google Calendar integration.
  • Search: Performs text and image searches using the Google Search API.
  • Server: Web server functionality for connecting external devices such as smartphones.
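
As an illustration of what retrieving emails via IMAP (the Mail module above) can look like, the following sketch uses only Python's standard `imaplib` and `email` packages. It is an assumption about the general approach, not the Mail module's implementation; `summarize_message` and `fetch_unseen_summaries` are hypothetical helpers, and the host and credentials are placeholders you would supply.

```python
import email
import imaplib
from email.header import decode_header, make_header


def summarize_message(raw_bytes: bytes) -> dict:
    """Extract sender and subject from a raw RFC 822 message."""
    msg = email.message_from_bytes(raw_bytes)
    # decode_header/make_header turn encoded headers into readable text.
    subject = str(make_header(decode_header(msg.get("Subject", ""))))
    sender = str(make_header(decode_header(msg.get("From", ""))))
    return {"from": sender, "subject": subject}


def fetch_unseen_summaries(host: str, user: str, password: str, limit: int = 5) -> list:
    """Return summaries of up to `limit` unread messages (hypothetical helper)."""
    with imaplib.IMAP4_SSL(host) as conn:
        conn.login(user, password)
        # readonly=True avoids marking messages as read while fetching them.
        conn.select("INBOX", readonly=True)
        _, data = conn.search(None, "UNSEEN")
        message_ids = data[0].split()[-limit:]
        summaries = []
        for message_id in message_ids:
            _, parts = conn.fetch(message_id, "(RFC822)")
            summaries.append(summarize_message(parts[0][1]))
        return summaries
```

Only `summarize_message` can be tried without a mail server; `fetch_unseen_summaries` needs a reachable IMAP host and valid credentials.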

Modules Coming Soon

  • See: Gives the assistant visual capabilities using the GPT Vision API, enabling it to process webcam pictures and desktop screenshots.
  • Memory: Stores and retrieves JSON-translatable data.
  • News: Delivers compact summaries of current news.
  • Finance: Offers financial management by integrating various financial APIs for real-time tracking of investments.
  • Create: Image generation using DALL-E API, turning text prompts into vivid images.
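
The Memory module listed above is not released yet, so the following is only a guess at what "stores and retrieves JSON-translatable data" could mean in practice: a small key-value store persisted to a JSON file. The class name `JsonMemory`, its methods, and the file layout are all hypothetical.

```python
import json
from pathlib import Path
from typing import Any


class JsonMemory:
    """Hypothetical key-value memory persisted as a single JSON file."""

    def __init__(self, path: str) -> None:
        self.path = Path(path)
        if self.path.exists():
            self._data = json.loads(self.path.read_text(encoding="utf-8"))
        else:
            self._data = {}

    def store(self, key: str, value: Any) -> None:
        # Serialize first so a non-JSON value raises TypeError before anything is changed.
        json.dumps(value)
        self._data[key] = value
        self.path.write_text(json.dumps(self._data, indent=2), encoding="utf-8")

    def retrieve(self, key: str, default: Any = None) -> Any:
        return self._data.get(key, default)
```

Writing the whole file on every `store` call is simple but only suitable for small amounts of data.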

Getting Started

Follow the Modules Guide for step-by-step instructions on how to set up and configure the Linguflex modules.

License

The codebase is under the MIT License; the TTS model weights are under the individual TTS engine licenses listed below:

CoquiEngine

  • License: Open-source only for noncommercial projects.
  • Commercial Use: Requires a paid plan.
  • Details: CoquiEngine License

ElevenlabsEngine

  • License: Open-source only for noncommercial projects.
  • Commercial Use: Available with every paid plan.
  • Details: ElevenlabsEngine License

AzureEngine

  • License: Open-source only for noncommercial projects.
  • Commercial Use: Available from the standard tier upwards.
  • Details: AzureEngine License

SystemEngine

  • License: Mozilla Public License 2.0 and GNU Lesser General Public License (LGPL) version 3.0.
  • Commercial Use: Allowed under this license.
  • Details: SystemEngine License

OpenAIEngine