[Feature request] Add more automatic configuration for LangChain LLMs for Python processors #707

Open
devinbost opened this issue Nov 9, 2023 · 2 comments

devinbost commented Nov 9, 2023

As of 0.4.3, information specified in the configuration YAML is automatically available on the context object in Python processors.
This feature is extremely helpful.
However, it doesn't cover every case.
For example, switching between OpenAI and Azure OpenAI as the LLM in a LangChain implementation requires providing a different wrapper:

import os

# Azure OpenAI requires extra env vars that the plain OpenAI wrapper does not.
os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["OPENAI_API_VERSION"] = "2023-05-15"
os.environ["OPENAI_API_BASE"] = "..."  # endpoint URL placeholder
os.environ["OPENAI_API_KEY"] = "..."   # key placeholder

from langchain.llms import AzureOpenAI

# A different wrapper class than the plain OpenAI one must be instantiated.
llm = AzureOpenAI(
    deployment_name="td2",
    model_name="text-davinci-002",
)

(https://python.langchain.com/docs/integrations/llms/azure_openai#deployments)

We run into a similar issue with WatsonX. (In https://ibm.github.io/watson-machine-learning-sdk/fm_extensions.html you can see that WatsonxLLM(model=model) must be declared.)
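
For reference, the WatsonX setup from those docs looks roughly like this (a sketch; the credentials, project_id, and model choice are placeholders):

from ibm_watson_machine_learning.foundation_models import Model
from ibm_watson_machine_learning.foundation_models.extensions.langchain import WatsonxLLM

# Placeholder credentials -- in practice these come from env vars or config.
model = Model(
    model_id="google/flan-ul2",
    credentials={"apikey": "...", "url": "https://us-south.ml.cloud.ibm.com"},
    project_id="...",
)
llm = WatsonxLLM(model=model)  # yet another provider-specific wrapper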

In these cases, to configure the env variables, we still need to either:

  • map them in the pipeline YAML file (which is a potential source of mapping errors), or
  • set them via os.environ in Python (which is worse, since it's subject to the same human error but adds more code to maintain and can't be validated as easily at build time).

Then, we also need to ensure that the correct LLM wrapper is instantiated in the Python code.
It would be quite useful if LangStream could simplify this by providing something like:
llm = context.buildLLM()
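
A minimal sketch of what such a factory could do (everything here is hypothetical: build_llm, the config keys, and the provider names are invented for illustration; context.buildLLM does not exist today):

import os
from langchain.llms import AzureOpenAI, OpenAI
from langchain.schema.language_model import BaseLanguageModel

def build_llm(config: dict) -> BaseLanguageModel:
    # Hypothetical factory: choose and wire up the right LangChain
    # wrapper from a single declarative config block.
    provider = config["provider"]
    if provider == "azure-openai":
        # Azure needs env vars that the plain OpenAI wrapper does not.
        os.environ["OPENAI_API_TYPE"] = "azure"
        os.environ["OPENAI_API_VERSION"] = config["api-version"]
        os.environ["OPENAI_API_BASE"] = config["url"]
        os.environ["OPENAI_API_KEY"] = config["access-key"]
        return AzureOpenAI(
            deployment_name=config["deployment"],
            model_name=config["model"],
        )
    if provider == "openai":
        return OpenAI(
            model_name=config["model"],
            openai_api_key=config["access-key"],
        )
    raise ValueError(f"Unsupported LLM provider: {provider}")

A processor would then get a ready-to-use llm without touching env vars or wrapper classes, and switching providers would be a one-line YAML change.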

In LangChain, there is now a small tree of LLM classes, but most of them appear to implement either BaseLanguageModel or BaseLLM.

For example:

  • AzureChatOpenAI < ChatOpenAI < BaseChatModel < BaseLanguageModel[BaseMessage]
  • AzureOpenAI < BaseOpenAI < BaseLLM < BaseLanguageModel[str]

Different implementations require different env vars to be set, so manually keeping track of them all is quite annoying, especially when switching between providers.
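
A first step could be a per-provider manifest of required env vars that gets checked once up front (the provider names and var lists below are only illustrative):

import os

# Illustrative manifest: which env vars each provider needs.
REQUIRED_ENV_VARS = {
    "openai": ["OPENAI_API_KEY"],
    "azure-openai": [
        "OPENAI_API_TYPE", "OPENAI_API_VERSION",
        "OPENAI_API_BASE", "OPENAI_API_KEY",
    ],
}

def validate_env(provider: str) -> None:
    # Fail fast with a clear message instead of a runtime auth error.
    missing = [v for v in REQUIRED_ENV_VARS[provider] if not os.environ.get(v)]
    if missing:
        raise RuntimeError(f"{provider}: missing env vars {missing}")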

Perhaps we can leverage this to add more no-code configuration and provide more value.

devinbost (Collaborator, Author) commented:

One idea is that we could provide a Mixin or Decorator that automatically wires up certain properties based on the configuration specified. That way, if someone wants a different behavior, there's a path to customize it without introducing breaking changes.
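
A minimal sketch of the decorator flavor of that idea (with_llm, MyProcessor, and the processor shape are all hypothetical, and it reuses the build_llm factory sketched earlier):

import functools

def with_llm(process_fn):
    # Hypothetical decorator: lazily builds the LLM from the pipeline
    # config unless the processor has already set its own.
    @functools.wraps(process_fn)
    def wrapper(self, record):
        if not hasattr(self, "llm"):
            self.llm = build_llm(self.config["llm"])  # factory sketched above
        return process_fn(self, record)
    return wrapper

class MyProcessor:
    def __init__(self, config):
        self.config = config
        # Assigning self.llm here would override the default wiring.

    @with_llm
    def process(self, record):
        return self.llm.predict(record)  # assumes record is a string

Overriding would then just mean assigning self.llm before process runs, so there's no breaking change for anyone who wants a custom wrapper.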

devinbost (Collaborator, Author) commented:

I noticed that the various chat models are all grouped in this folder: https://github.com/devinbost/langchain/tree/d266b3ea4a5dc85fe895b93090d89aa311f8c48e/libs/langchain/langchain/chat_models

Chat models are extremely useful for conversational memory and other such features, and I'd expect the list of them to grow quickly. The base LLMs are more primitive: once someone gets past the initial case of API integration with a base model, they usually want things like conversational memory, and that's where the chat models really shine.
