non-working input standardization #694

Draft · wants to merge 1 commit into base: main

Conversation

@zsimjee (Collaborator) commented Apr 3, 2024

This does not currently work.

This PR attempts to use input/output mapping to deal with LLMs in a more generic way. What I've discovered is that there's simply too much going on, both in our existing implementation and in the individual LLM clients. I think that an overhaul at this level is too big a change for a minor version update, given the risk and testing surface area involved.

This PR is missing a few key parts:

  1. mapping inputs for the injection of schema and reask instructions (see the sketch after this list)
  2. dealing with function calling
  3. testing around streaming and async
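
One way the schema/reask injection in item 1 could look. This is an illustrative sketch only; inject_instructions is a hypothetical helper, not code from this PR:

```python
from typing import List, Optional


def inject_instructions(
    messages: List[dict],
    schema: str,
    reask_instructions: Optional[str] = None,
) -> List[dict]:
    """Append output-schema guidance (and reask instructions, if any)
    to the message list before the input mapping runs."""
    injected = list(messages)
    injected.append({
        "role": "system",
        "content": f"Return output matching this schema:\n{schema}",
    })
    if reask_instructions:
        injected.append({"role": "system", "content": reask_instructions})
    return injected
```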

Fundamentally, I do think this approach can work, and that it is a better mapping solution than the one we currently have, since in some ways it's "looser": it does not involve recreating clients or trying to type-match into them, which is a huge benefit from a maintainability perspective.
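
A minimal sketch of that looser mapping; all names here (map_inputs, map_output, call_llm) are illustrative assumptions, not this PR's actual code:

```python
from typing import Any, Callable, Dict, List


def map_inputs(messages: List[dict], **kwargs) -> Dict[str, Any]:
    """Translate guard-style inputs into plain kwargs for the LLM client."""
    return {"messages": messages, **kwargs}


def map_output(response: Any) -> str:
    """Normalize common response shapes into plain text."""
    if hasattr(response, "choices"):  # OpenAI/LiteLLM chat-completion shape
        return response.choices[0].message.content
    if isinstance(response, str):  # custom callables that return text
        return response
    raise ValueError(f"Unrecognized LLM response type: {type(response)}")


def call_llm(llm_api: Callable, messages: List[dict], **kwargs) -> str:
    """Pass inputs through as-is; no client recreation or type matching."""
    return map_output(llm_api(**map_inputs(messages, **kwargs)))
```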

All that being said, I would propose a different set of deliverables for 0.4.3 that I think are still achievable, help with debug scenarios, and keep backwards compatibility without being too large a change. However, I only think these changes are appropriate if we do the necessary pruning in 0.5.0.

0.4.3

  1. make the __call__ llm_api callable optional. When it is not provided, we pass the inputs through to a LiteLLM client we create internally
  2. expose a naked_llm_call function on guard. This function would have inputs styled the same way we style __call__ inputs, but purely uses our internal mapping to make a call to the LLM without guarding. This would help users debug and translate from their call pattern to the guard-style call pattern. (Both items are sketched after this list.)
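
A hedged sketch of both 0.4.3 items. The fallback logic and the naked_llm_call body are assumptions about shape, not the final implementation; only litellm.completion is a real API here:

```python
from typing import Callable, List, Optional

import litellm


def call(llm_api: Optional[Callable] = None, *, messages: List[dict], **kwargs):
    """When no llm_api is supplied, route the call through an internal
    LiteLLM client instead of requiring the user to pass one."""
    if llm_api is None:
        return litellm.completion(messages=messages, **kwargs)
    return llm_api(messages=messages, **kwargs)


def naked_llm_call(messages: List[dict], **kwargs) -> str:
    """Debug helper: accepts inputs styled like __call__, but invokes the
    LLM directly with no guarding, so users can verify their call pattern."""
    response = litellm.completion(messages=messages, **kwargs)
    return response.choices[0].message.content


# e.g. naked_llm_call([{"role": "user", "content": "hi"}], model="gpt-4o")
```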

0.5.0

  1. delete all custom mappings in llm_providers other than OpenAICallable, LiteLLMCallable, and Sync/AsyncBaseCallable
  2. make clear in docs that those callables are the only supported ones. If you have a callable other than those and don't want to wrap it in a BaseCallable, then orchestrate your own llm api call and pass its output in directly
  3. remove msg_history, prompt, and instructions. Expect messages for BaseCallable. For everything else, pass through args as-is
  4. in OpenAI, Cohere, and Anthropic, never recreate a client
  5. get rid of old openai versions across the codebase
  6. create one doc notebook with examples for each llm type present in llms.py
  7. support streaming and async for openai and litellm callables natively, but through an output mapping strategy similar to the one in this PR (sketched after this list)
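
A rough sketch of what that streaming output mapping could look like, assuming the OpenAI/LiteLLM chunk shape (chunk.choices[0].delta.content); the names are illustrative, not guardrails internals:

```python
from typing import AsyncIterator, Iterator


def map_stream(chunks: Iterator) -> Iterator[str]:
    """Map provider-specific streamed chunks to plain text deltas."""
    for chunk in chunks:
        content = getattr(chunk.choices[0].delta, "content", None)
        if content:
            yield content


async def map_stream_async(chunks: AsyncIterator) -> AsyncIterator[str]:
    """Async variant of the same mapping, for async clients."""
    async for chunk in chunks:
        content = getattr(chunk.choices[0].delta, "content", None)
        if content:
            yield content
```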
