Performant Vox 2 Vox Agents

YOU KNOW THE DREAM

Talk to the computer, and it tells you, or does, something useful.

YOU PROBABLY KNOW THE PROBLEM

Currently, AI Agents & Chat Bots are slow and expensive. They make silly mistakes. They're forgetful. And they work too hard reinventing the wheel.

WHAT MOST PEOPLE PROBABLY DON'T REALIZE

Even the simplest vox in & vox out UX -- especially when coupled with agentic behaviors -- is hard. It's asynchronous, and usually frustratingly slow. It's a new way of interacting with computers, one that requires a global re-thinking of how the different UI control and display modalities interact.

DEEPILY IS WORKING ON A SOLUTION

I'm working on helping Agents remember what problems they've already solved, or if they've solved something semantically synonymous or computationally analogous before.
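
Below is a minimal sketch of that idea, assuming an embedding-plus-similarity lookup over previously solved queries. The SolutionMemory class, the bag-of-words embed() stand-in, and the 0.8 similarity threshold are all illustrative, not the actual Genie-in-the-Box implementation.

```python
# Illustrative sketch only: a tiny long-term memory of solved problems,
# keyed by an embedding of the query, so a semantically similar question
# can reuse a previously generated solution instead of calling an LLM again.
from dataclasses import dataclass, field
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Stand-in embedding: a bag of lowercased words. A real system would
    # use a sentence-embedding model here.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

@dataclass
class SolutionMemory:
    threshold: float = 0.8                       # illustrative cutoff
    entries: list = field(default_factory=list)  # (embedding, query, solution)

    def remember(self, query: str, solution: str) -> None:
        self.entries.append((embed(query), query, solution))

    def recall(self, query: str):
        """Return a stored solution if a semantically similar query exists."""
        q = embed(query)
        best = max(self.entries, key=lambda e: cosine(q, e[0]), default=None)
        if best and cosine(q, best[0]) >= self.threshold:
            return best[2]
        return None  # cache miss: fall through to the slow LLM path

memory = SolutionMemory()
memory.remember("what time is it in Tokyo", "solution: print_current_time('Asia/Tokyo')")  # toy stored solution
print(memory.recall("what time is it in Tokyo right now"))  # hits the fast path
```

On a cache hit the agent can replay or adapt the stored solution instead of paying for a fresh LLM round trip, which is the fast path described below.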

THE RESULT

Fast, real-time responses, asynchronous callbacks for big jobs, and more natural, human-like interaction. You will want to talk to your computer!

THE VIEW FROM 30,000 FT

There are two ways to answer a question when using agentic vox 2 vox: the fast way, or the agonizingly slow way.

[Flow chart: the view from 30,000 ft]

The green dotted lines and boxes mark the quickest path through the flow chart (Deepily.ai Agents); the red dotted lines and boxes take anywhere from 100 to 200 times longer to execute (ChatGPT & LangChain).

CURRENT FOCUS

I'm currently working on:

  1. Agentic learning (code refactoring) based on previously solved problems stored in long-term memory
  2. Using query-to-function mapping similar to what ChatGPT is doing (sketched after this list), and
  3. Providing human-in-the-loop feedback when agents go awry
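
Here is a minimal sketch of query-to-function mapping, using naive keyword matching to stand in for the model-driven routing described above. The function registry, the trigger keywords, and the route() helper are hypothetical, not the project's actual API.

```python
# Illustrative sketch only: map a transcribed voice query to a registered
# function, in the spirit of ChatGPT-style function calling. The routing
# here is naive keyword matching; a production agent would let an LLM or
# an embedding model pick the function and fill in its arguments.
from datetime import datetime

def get_time(city: str = "local") -> str:
    return f"The time in {city} is {datetime.now():%H:%M}"

def add_todo(item: str) -> str:
    return f"Added '{item}' to your TODO list"

# Registry of callable tools, each tagged with trigger keywords.
FUNCTIONS = {
    "get_time": (get_time, {"time", "clock"}),
    "add_todo": (add_todo, {"todo", "remind", "task"}),
}

def route(query: str):
    words = set(query.lower().split())
    for name, (fn, keywords) in FUNCTIONS.items():
        if words & keywords:
            return name, fn
    return None, None  # no match: fall back to the general-purpose LLM

name, fn = route("add walk the dog to my todo list")
if fn:
    print(name, "->", fn("walk the dog"))
```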

THE PRESENT REALITY

  1. I can perform basic browsing tasks with Firefox using my voice
  2. I can edit, spellcheck and proofread documents using my voice
  3. I can also interact with PyCharm using my voice

THE (NEAR) FUTURE PLAN: EOY 2023

  1. Interact seamlessly, asynchronously, and in real time with calendaring and TODO list apps using my voice
  2. Do the same with a web research assistant to replace what I'm doing manually with ChatGPT
  3. Have my agents speak to me with any of my favorite character voices in multiple languages
  4. Host my own internal LLM server for privacy and security

THE (FAR) FUTURE DREAM: 2024

  1. Interact with my agents, servers & computers using my voice, and have them do what I want done, when & how I want it done. I'm not asking for much, am I?
  2. Safely and securely, of course
  3. World peace, non X, and all that too

DISCLAIMER

This Genie-in-the-box project is currently an extremely large set of working sketches which I am actively organizing & tidying up so that I can collaborate with others.

So, I'm not there yet, obviously. But I'm working on it and getting closer every day.

Interested?

Begin!

About

Genie in the Box: Distill Whisper STT => Mistral-7B => Phind/Phind-CodeLlama-34B-v2 => GPT 3.5 => Coqui's TTS/OpenAI TTS
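
Read as a pipeline, that description chains speech-to-text, one or more LLM stages, and text-to-speech. Here is a minimal sketch of how such a chain could be wired; every stage below is a stub standing in for the named model, not the project's actual code.

```python
# Illustrative pipeline sketch only: each stage is a stub standing in for
# the real model (Distill Whisper for STT; Mistral-7B, Phind-CodeLlama-34B-v2,
# or GPT-3.5 for reasoning and code; Coqui or OpenAI TTS for speech out).
from typing import Any, Callable, List

def stt(audio: bytes) -> str:
    return "what time is it in Tokyo"   # stand-in for the STT model

def llm(prompt: str) -> str:
    return "It is 09:00 in Tokyo."      # stand-in for the LLM stages

def tts(text: str) -> bytes:
    return text.encode()                # stand-in for the TTS model

def run_pipeline(audio: bytes, stages: List[Callable]) -> Any:
    """Pass the payload through each stage in order: voice in, voice out."""
    payload: Any = audio
    for stage in stages:
        payload = stage(payload)
    return payload

speech_out = run_pipeline(b"<microphone audio>", [stt, llm, tts])
print(speech_out)
```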
