[BUG] Exception During Model Logging with Custom Ollama Class #11962
Comments
@sreekarreddydfci A quick fix for this is to explicitly specify pip requirements so that the requirement inference is skipped: `mlflow.langchain.log_model(..., pip_requirements=[...])`
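For reference, a minimal sketch of what that call could look like; the `chain` placeholder, the artifact path, and the requirement pins below are assumptions, not values from this thread:

```python
import mlflow

# Placeholder for the custom Ollama-backed LangChain object being logged.
chain = ...

with mlflow.start_run():
    mlflow.langchain.log_model(
        chain,
        artifact_path="model",
        # Explicit pins make MLflow skip automatic pip requirement inference.
        pip_requirements=["mlflow", "langchain", "langchain-community"],
    )
```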
@sreekarreddydfci What are the extra fields that pydantic is complaining about?
@harupy I used the quick fix, and it resolved the pip requirements inference issue. But when I tried loading the logged model, I got the following error:

Here's the

There's no mention of the extra fields that pydantic is complaining about. Thanks.
@sreekarreddydfci Have you checked the langchain source code to see which fields are extra?
I've made some significant advancements in deploying Ollama models as Databricks model serving endpoints, similar to how existing LLMs are handled. Below, I outline the progress made and the current challenges.

Progress Made:

Current Issues and Code Snippets:

Issue 1: Customizing the Querying Script

I need guidance on customizing the default querying script to match the existing querying format.

Default querying script after endpoint setup:

```python
import os
import requests
import numpy as np
import pandas as pd
import json


def create_tf_serving_json(data):
    return {'inputs': {name: data[name].tolist() for name in data.keys()} if isinstance(data, dict) else data.tolist()}


def score_model(dataset):
    url = 'https://<url>/serving-endpoints/LangchainTest1/invocations'
    headers = {'Authorization': f'Bearer {os.environ.get("DATABRICKS_TOKEN")}', 'Content-Type': 'application/json'}
    ds_dict = {'dataframe_split': dataset.to_dict(orient='split')} if isinstance(dataset, pd.DataFrame) else create_tf_serving_json(dataset)
    data_json = json.dumps(ds_dict, allow_nan=True)
    response = requests.request(method='POST', headers=headers, url=url, data=data_json)
    if response.status_code != 200:
        raise Exception(f'Request failed with status {response.status_code}, {response.text}')
    return response.json()
```

For comparison, this is the querying format used for databricks-dbrx-instruct:

```python
from openai import OpenAI
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = OpenAI(
    api_key=DATABRICKS_TOKEN,
    base_url="https://<url>/serving-endpoints"
)

chat_completion = client.chat.completions.create(
    messages=[
        {"role": "system", "content": "You are an AI assistant"},
        {"role": "user", "content": "Tell me about Large Language Models"}
    ],
    model="databricks-dbrx-instruct",
    max_tokens=256
)

print(chat_completion.choices[0].message.content)
```

Issue 2: Connection Error

I encountered a connection issue when querying the endpoint with the script below:
Here's the error log:
Are there any other ways to serve these LLMs as endpoints? Thank you.
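As a usage note on the default querying script in Issue 1 above: `score_model` expects either a pandas DataFrame or a dict/array, and the DataFrame columns must match the logged model's input signature. A hedged sketch, assuming a single "prompt" input column (that column name is an assumption, not taken from this thread):

```python
import pandas as pd

# Assumed input column; the real name must match the logged model's signature.
queries = pd.DataFrame({"prompt": ["Tell me about Large Language Models"]})

# Requires DATABRICKS_TOKEN in the environment, as in the script above.
result = score_model(queries)
print(result)
```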
@mlflow/mlflow-team Please assign a maintainer and start triaging this issue.
Issues Policy acknowledgement
Where did you encounter this bug?
Databricks
Willingness to contribute
Yes. I would be willing to contribute a fix for this bug with guidance from the MLflow community.
MLflow version
System information
Describe the problem
While attempting to log and later load a customized Ollama model using MLflow and the langchain library, several critical issues were encountered related to the model's compatibility and the pip requirement inference process.
Initially, the Ollama model's internal `_llm_type` was set to `ollama-llm`, which was not recognized by MLflow as a supported type. To address this, I subclassed `Ollama` from `langchain_community.llms.ollama` to override `_llm_type` to just `ollama`. This change was intended to ensure compatibility with MLflow's model logging system.
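A minimal sketch of the kind of subclass described (the class name `CustomOllama` is an assumption; the property override is what the report describes):

```python
from langchain_community.llms.ollama import Ollama


class CustomOllama(Ollama):
    """Ollama subclass that reports an MLflow-recognized _llm_type."""

    @property
    def _llm_type(self) -> str:
        # The base class reports "ollama-llm"; return "ollama" instead so that
        # MLflow's langchain flavor recognizes the model type.
        return "ollama"
```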
Despite this adjustment, further complications arose during the model logging process. MLflow encountered an unexpected error while trying to infer pip requirements. The detailed error traceback indicated that the `Ollama` class contained extra fields that were not permitted, and that problems occurred during the module capture process used for dependency resolution.
Tracking information
Code to reproduce issue
Stack trace
What component(s) does this bug affect?

- area/artifacts: Artifact stores and artifact logging
- area/build: Build and test infrastructure for MLflow
- area/deployments: MLflow Deployments client APIs, server, and third-party Deployments integrations
- area/docs: MLflow documentation pages
- area/examples: Example code
- area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
- area/models: MLmodel format, model serialization/deserialization, flavors
- area/recipes: Recipes, Recipe APIs, Recipe configs, Recipe Templates
- area/projects: MLproject format, project running backends
- area/scoring: MLflow Model server, model deployment tools, Spark UDFs
- area/server-infra: MLflow Tracking server backend
- area/tracking: Tracking Service, tracking client APIs, autologging

What interface(s) does this bug affect?

- area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
- area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
- area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
- area/windows: Windows support

What language(s) does this bug affect?

- language/r: R APIs and clients
- language/java: Java APIs and clients
- language/new: Proposals for new client languages

What integration(s) does this bug affect?

- integrations/azure: Azure and Azure ML integrations
- integrations/sagemaker: SageMaker integrations
- integrations/databricks: Databricks integrations