
Failed to get max tokens for LLM with name XXX (Ollama provider) #1342

Open
CosmicMac opened this issue Apr 17, 2024 · 4 comments


@CosmicMac

Hi,
I'm facing the following issue when trying to chat with Ollama:

04/17/2024 01:13:07 PM             utils.py 273 : Failed to get max tokens for LLM with name gemma. Defaulting to 4096.
Traceback (most recent call last):
  File "/app/danswer/llm/utils.py", line 263, in get_llm_max_tokens
    model_obj = model_map[f"{model_provider}/{model_name}"]
                ~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'ollama_chat/gemma'

Then Danswer answers :)

Error in input stream

My .env:

GEN_AI_MODEL_PROVIDER=ollama_chat
GEN_AI_MODEL_VERSION=gemma
GEN_AI_API_ENDPOINT=http://host.docker.internal:11434

QA_TIMEOUT=120
DISABLE_LLM_CHOOSE_SEARCH=True
DISABLE_LLM_CHUNK_FILTER=True
DISABLE_LLM_QUERY_REPHRASE=True
DISABLE_LLM_FILTER_EXTRACTION=True
# QA_PROMPT_OVERRIDE=weak

Ollama is up and running; I tested it from inside the danswer-stack-api_server container with curl http://host.docker.internal:11434/api/tags (obviously I had to install curl first).

BTW, Danswer seems to retry the request once when an error occurs.

@gargmukku07

Hi,

I am getting the same error with the latest build. I am using llama2.

@gargmukku07

This is fixed by setting the value of GEN_AI_MAX_TOKENS.
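
For illustration, a minimal .env sketch of this override (the 4096 here is only an assumed example for a 4K-context model such as llama2, not a value confirmed in this thread; use whatever context window your model actually has):

GEN_AI_MAX_TOKENS=4096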

@al-lac

al-lac commented Apr 26, 2024

@gargmukku07 which value did you set for llama2?

@LittleBennos

Just wanted to add my findings.

I was getting this error:

05/28/2024 11:34:41 PM             utils.py 328 : Failed to get max tokens for LLM with name azuregpt35turbo. Defaulting to 4096.
Traceback (most recent call last):
  File "/app/danswer/llm/utils.py", line 318, in get_llm_max_tokens
    model_obj = model_map[model_name]
                ~~~~~~~~~^^^^^^^^^^^^
KeyError: 'azuregpt35turbo'
05/28/2024 11:34:46 PM             timing.py 74 : stream_chat_message took 7.445417404174805 seconds

Turns out you need to set a value for GEN_AI_MAX_TOKENS.

This is due to this section of code in backend/danswer/llm/utils.py:

) -> int:
    """Best effort attempt to get the max tokens for the LLM"""
    if GEN_AI_MAX_TOKENS:
        # This is an override, so always return this
        return GEN_AI_MAX_TOKENS

    try:
        model_obj = model_map.get(f"{model_provider}/{model_name}")
        if not model_obj:
            model_obj = model_map[model_name]

        if "max_input_tokens" in model_obj:
            return model_obj["max_input_tokens"]

        if "max_tokens" in model_obj:
            return model_obj["max_tokens"]

        raise RuntimeError("No max tokens found for LLM")
    except Exception:
        logger.exception(
            f"Failed to get max tokens for LLM with name {model_name}. Defaulting to 4096."
        )
        return 4096
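
To see why a custom deployment name like azuregpt35turbo (or ollama_chat/gemma) falls through to the 4096 default unless GEN_AI_MAX_TOKENS is set, here is a small standalone sketch of the same fallback logic; the map contents and names below are illustrative only, not Danswer's actual model_map:

# Standalone sketch of the fallback above; model_map contents are illustrative only.
GEN_AI_MAX_TOKENS = None  # unset -> lookup path; e.g. 4096 -> override path

model_map = {"gpt-3.5-turbo": {"max_input_tokens": 16385}}  # custom names are absent

def get_max_tokens(model_name: str, model_provider: str) -> int:
    if GEN_AI_MAX_TOKENS:
        return GEN_AI_MAX_TOKENS
    try:
        model_obj = model_map.get(f"{model_provider}/{model_name}") or model_map[model_name]
        if "max_input_tokens" in model_obj:
            return model_obj["max_input_tokens"]
        return model_obj["max_tokens"]
    except Exception:
        # same "Failed to get max tokens ... Defaulting to 4096" path as in the log above
        return 4096

print(get_max_tokens("azuregpt35turbo", "azure"))  # -> 4096 via the KeyError fallback
print(get_max_tokens("gpt-3.5-turbo", "openai"))   # -> 16385 via the model_name lookup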
