🚀 MemGPT Q2 2024 Developer Roadmap #1200

cpacker · 2024-03-31T22:33:07Z

Q2 2024 Roadmap

🚀 Link to GitHub project board tracking all of the roadmap items
👋 Looking for smaller things to work on? Check the community contributions or bug tracker project boards
✍️ Leave comments or message us on Discord to suggest changes to the roadmap

More MemGPT LLM backends / API supported [early April]

Groq
- feat: added groq support via local option w/ auth #1203
- TODO via beta function calling API
  - feat: groq support via official tool-calling API #1257
Claude
- feat: Anthropic Claude API support #1239
Cohere
- feat: add Cohere API support (Command-R+) #1246
Gemini
- feat: add Google AI Gemini Pro support #1209
Together
Mistral
Add unit tests for all the officially supported API providers (with reference configs contained in main) to avoid regression
- OpenAI, Anthropic, Groq, Together, Mistral, Gemini
Consolidate inference backend options into a single config style:
- i.e.: openai-chat-completions (= OpenAI, Azure, APIs with function calling), openai-completions (= vLLM, lmstudio, ollama), ...

MemGPT developer examples

Native multi-agent interaction (via multiple MemGPT agents running on a MemGPT server process)
- Example "agent orchestrator" meta-agent where one agent controls send_message calls to other agents
  - in this example, all the agents share a groupchat state similar to AutoGen
- Example free-chat where each agent can freely broadcast send_message calls to other agents
  - need to handle "race conditions" where agent is busy (the calling agent should receive an informative message reply)
  - in this example, the only shared state between agents is via communication on send_message calls
Discord / Slack chatbot examples (connecting send_message to external APIs)
- Example Discord + Slack + Twilio send_message tool + message listen hook (needs a dedicated section in dev portal)
GitHub / Discord support chatbot examples (e.g. Dosubot)
- Example read_issue, list_issues + comment-posted-to-API-call hooks

MemGPT server

One-click deployment container [early April]
Instructions / tutorial for deploying MemGPT server to Azure/GCP/AWS/... [mid April]

MemGPT API

Token streaming support [mid-April]
Production-ready stable API [mid-April]

OpenAI Assistants API

Continued improvements to OpenAPI Assistants API support: OpenAI Assistant API Compatibility Tracker #892

MemGPT Client / SDK

Javascript/Typescript client

Developer portal / Chat UI

Alpha release [early April]
Addition of missing features for beta release in MemGPT v0.4 [early April]
- Preset creation / editing
- Custom function / tool creation (via the UI)
- Message (user+system+assistant) editing + rerunning / regenerating messages
- Custom prompt formatting
Cron job scheduling inside of dev portal
- Make it easy to schedule automated jobs that hit the MemGPT server
- Cron-style custom functions: MemGPT can schedule one off or recurring messages to itself, ie ‘Scheduled Inner Monologue’

Hosted service

Hosted service [end of April]
- Release hosted MemGPT server so that developers can directly interact with the API
Allows developers to use the MemGPT API without requiring any self-hosting (just API keys)
- Release hosted chat UI app (with guest + login modes) to allow easy use / experimentation with MemGPT via chat UI only
- Accounts are shared with the hosted API server (allows interacting with the same agents via hosted API + hosted chat UI)

⚡ Streaming (token-level) support

Add streaming support for CLI interface with OpenAI-compatible endpoints
- feat: add streaming support for OpenAI-compatible endpoints #1262
Allow streaming back of POST requests (to MemGPT API / server)
- feat: add token streaming to the MemGPT API #1280
- In MemGPT function calling setup, this likely means:
  - Stream back inner thoughts first
  - Then stream back function call
  - Then attempt to parse function call (validate if final full streamed response was OK)

Miscellaneous features (Q2+)

👥 Split thread agent

Support alternate “split-thread” MemGPT agent architecture
SplitThreadAgent-v0.1 runs two “prompt” / “context” threads
- DialogueThread that generates conversations and calls utility functions (e.g. run_google_search(...) or call_smart_home(...))
- MemoryThread that is a passive reader of the ongoing conversation, and is responsible for memory edits, insertion, and search
  - core_memory_replace , core_memory_append
  - archival_memory_search, archival_memory_insert
  - conversation_search, conversation_search_date
  - Question: should these be usable by the DialogueThread too?

🦙 Specialized MemGPT models

Release (on HuggingFace) and serve (via the free endpoint) models that have been fine-tuned specifically for MemGPT
- “MemGPT-LM-8x7B-v0.1” (e.g. Mixtral 8x7B fine-tuned on MemGPT data w/ DPO)
- Goal is to bridge the gap between open models and GPT-4 for MemGPT performance

👁️ Multi-modal support

Start with gpt-4-vision support first to work out the necessary refactors required
- Will require modifications to the current data_types stored in the database
Work backwards to LLaVA

👾 Make MemGPT a better coding assistant

Coding requires some coding-specific optimizations
- Better support for generating coding blocks with parsing errors
- Add specific grammars / model wrappers for coding
Add support for block-level code execution
- CodeInterpreter style

📄 Make MemGPT a better document research assistant

Add more complex out-of-the-box archival_memory_search replacements
- e.g. using LlamaIndex RAG pipelines

🔧 Better default functions

E.g. better out-of-the-box internet search

⏱️ Asynchronous tool use support

Support non-blocking tool use (Feature Request: sending message to agent without user role #1062)
- E.g. image generation that takes ~10s+ should not block the main conversation thread
- Implement by returning the "TBD" tool response immediately, then inject full response later

🧠 Better memory systems

core_memory_v2, archival_memory_v2, etc.
- e.g. core_memory_v2
  - add more structure (insertions are key, value style only)
- e.g. archival_memory_v2
  - add more metadata tagging at insertion time
    - type: [memory, feeling, observation, reflection, …]
  - add an asynchronous “memory consolidation” loop
    - every N minutes (or once per day), a task runner starts that tries to consolidate all the archival memories
  - add more structure in the storage
    - not just a vector database
    - knowledge graph?
    - hierarchical storage?

The text was updated successfully, but these errors were encountered:

atljoseph · 2024-05-11T23:17:28Z

Hi, just wanted to say this is an awesome project, and that the ability to have a fully featured ui to get started is really important to me. Nice to see it on the roadmap. That along with non-trivial examples makes it an easy choice to go with memGPT.

cpacker mentioned this issue Mar 31, 2024

🚀 MemGPT Q1 2024 Developer Roadmap #1044

Closed

13 tasks

cpacker added the roadmap Planned features label Mar 31, 2024

cpacker pinned this issue Mar 31, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

🚀 MemGPT Q2 2024 Developer Roadmap #1200

🚀 MemGPT Q2 2024 Developer Roadmap #1200

cpacker commented Mar 31, 2024 •

edited

atljoseph commented May 11, 2024

🚀 MemGPT Q2 2024 Developer Roadmap #1200

🚀 MemGPT Q2 2024 Developer Roadmap #1200

Comments

cpacker commented Mar 31, 2024 • edited

Q2 2024 Roadmap

More MemGPT LLM backends / API supported [early April]

MemGPT developer examples

MemGPT server

MemGPT API

OpenAI Assistants API

MemGPT Client / SDK

Developer portal / Chat UI

Hosted service

⚡ Streaming (token-level) support

Miscellaneous features (Q2+)

👥 Split thread agent

🦙 Specialized MemGPT models

👁️ Multi-modal support

👾 Make MemGPT a better coding assistant

📄 Make MemGPT a better document research assistant

🔧 Better default functions

⏱️ Asynchronous tool use support

🧠 Better memory systems

atljoseph commented May 11, 2024

cpacker commented Mar 31, 2024 •

edited