[BUG] Exception During Model Logging with Custom Ollama Class #11962
Comments
@sreekarreddydfci A quick fix for this is to explicitly specify pip requirements so that the requirement inference is skipped: `mlflow.langchain.log_model(..., pip_requirements=[...])`
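For reference, a minimal sketch of what that call could look like; the `chain` placeholder, the artifact path, and the requirement pins below are assumptions, not values from this thread:

```python
import mlflow

# Placeholder for the custom Ollama-backed LangChain object being logged.
chain = ...

with mlflow.start_run():
    mlflow.langchain.log_model(
        chain,
        artifact_path="model",
        # Explicit pins make MLflow skip automatic pip requirement inference.
        pip_requirements=["mlflow", "langchain", "langchain-community"],
    )
```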
@sreekarreddydfci What are the extra fields that pydantic is complaining about?
@harupy I used the quick fix, and it resolved the pip requirements inference issue. But when I tried loading the logged model, I got the following error:

Here's the

There's no mention of the extra fields that pydantic is complaining about. Thanks.
@sreekarreddydfci Have you checked the langchain source code to see which fields are extra?
I've made some significant advancements in deploying Ollama models as Databricks model serving endpoints, similar to how existing LLMs are handled. Below, I outline the progress made and the current challenges.

Progress Made:

Current Issues and Code Snippets:

Issue 1: Customizing the Querying Script

I need guidance on customizing the default querying script to match the existing querying format.

Default querying script after endpoint setup:

```python
import os
import requests
import numpy as np
import pandas as pd
import json


def create_tf_serving_json(data):
    return {'inputs': {name: data[name].tolist() for name in data.keys()} if isinstance(data, dict) else data.tolist()}


def score_model(dataset):
    url = 'https://<url>/serving-endpoints/LangchainTest1/invocations'
    headers = {'Authorization': f'Bearer {os.environ.get("DATABRICKS_TOKEN")}', 'Content-Type': 'application/json'}
    ds_dict = {'dataframe_split': dataset.to_dict(orient='split')} if isinstance(dataset, pd.DataFrame) else create_tf_serving_json(dataset)
    data_json = json.dumps(ds_dict, allow_nan=True)
    response = requests.request(method='POST', headers=headers, url=url, data=data_json)
    if response.status_code != 200:
        raise Exception(f'Request failed with status {response.status_code}, {response.text}')
    return response.json()
```

For comparison, this is the querying format used for databricks-dbrx-instruct:

```python
from openai import OpenAI
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = OpenAI(
    api_key=DATABRICKS_TOKEN,
    base_url="https://<url>/serving-endpoints"
)

chat_completion = client.chat.completions.create(
    messages=[
        {"role": "system", "content": "You are an AI assistant"},
        {"role": "user", "content": "Tell me about Large Language Models"}
    ],
    model="databricks-dbrx-instruct",
    max_tokens=256
)

print(chat_completion.choices[0].message.content)
```

Issue 2: Connection Error

I encountered a connection issue when querying the endpoint with the script below:
Here's the error log:
Are there any other ways to serve these LLMs as endpoints? Thank you.
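As a usage note on the default querying script in Issue 1 above: `score_model` expects either a pandas DataFrame or a dict/array, and the DataFrame columns must match the logged model's input signature. A hedged sketch, assuming a single "prompt" input column (that column name is an assumption, not taken from this thread):

```python
import pandas as pd

# Assumed input column; the real name must match the logged model's signature.
queries = pd.DataFrame({"prompt": ["Tell me about Large Language Models"]})

# Requires DATABRICKS_TOKEN in the environment, as in the script above.
result = score_model(queries)
print(result)
```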
@mlflow/mlflow-team Please assign a maintainer and start triaging this issue.
Issues Policy acknowledgement
Where did you encounter this bug?
Databricks
Willingness to contribute
Yes. I would be willing to contribute a fix for this bug with guidance from the MLflow community.
MLflow version
System information
Describe the problem
While attempting to log and later load a customized Ollama model using MLflow and the langchain library, several critical issues were encountered related to the model's compatibility and the pip requirement inference process.
Initially, the Ollama model's internal `_llm_type` was set to `ollama-llm`, which was not recognized by MLflow as a supported type. To address this, I subclassed `Ollama` from `langchain_community.llms.ollama` to override `_llm_type` to just `ollama`. This change was intended to ensure compatibility with MLflow's model logging system.
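A minimal sketch of the kind of subclass described (the class name `CustomOllama` is an assumption; the property override is what the report describes):

```python
from langchain_community.llms.ollama import Ollama


class CustomOllama(Ollama):
    """Ollama subclass that reports an MLflow-recognized _llm_type."""

    @property
    def _llm_type(self) -> str:
        # The base class reports "ollama-llm"; return "ollama" instead so that
        # MLflow's langchain flavor recognizes the model type.
        return "ollama"
```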
Despite this adjustment, further complications arose during the model logging process. MLflow encountered an unexpected error while trying to infer pip requirements. The detailed error traceback indicated that the `Ollama` class contained extra fields that were not permitted, and that problems occurred during the module capture process used for dependency resolution.
Tracking information
Code to reproduce issue
Stack trace
What component(s) does this bug affect?

- area/artifacts: Artifact stores and artifact logging
- area/build: Build and test infrastructure for MLflow
- area/deployments: MLflow Deployments client APIs, server, and third-party Deployments integrations
- area/docs: MLflow documentation pages
- area/examples: Example code
- area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
- area/models: MLmodel format, model serialization/deserialization, flavors
- area/recipes: Recipes, Recipe APIs, Recipe configs, Recipe Templates
- area/projects: MLproject format, project running backends
- area/scoring: MLflow Model server, model deployment tools, Spark UDFs
- area/server-infra: MLflow Tracking server backend
- area/tracking: Tracking Service, tracking client APIs, autologging

What interface(s) does this bug affect?

- area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
- area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
- area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
- area/windows: Windows support

What language(s) does this bug affect?

- language/r: R APIs and clients
- language/java: Java APIs and clients
- language/new: Proposals for new client languages

What integration(s) does this bug affect?

- integrations/azure: Azure and Azure ML integrations
- integrations/sagemaker: SageMaker integrations
- integrations/databricks: Databricks integrations