
[HUGE PR] Chat Messages for ChatModels #783

Merged
merged 7 commits into cheshire-cat-ai:develop on May 10, 2024

Conversation

valentimarco
Collaborator

Description

Yes, we did it...

@AlessandroSpallina
Contributor

AlessandroSpallina commented Apr 23, 2024

I was testing this PR and got a `KeyError: 'finish_reason'`; leaving the log and a screenshot here.

I'm using ollama v0.1.32

cheshire_cat_core_dev  | /usr/local/lib/python3.10/site-packages/langchain_core/_api/deprecation.py:119: LangChainDeprecationWarning: The class `LLMSingleActionAgent` was deprecated in LangChain 0.1.0 and will be removed in 0.2.0. Use Use new agent constructor methods like create_react_agent, create_json_agent, create_structured_chat_agent, etc. instead.
cheshire_cat_core_dev  |   warn_deprecated(
cheshire_cat_core_dev  | [2024-04-23 15:37:06.096] ERROR  cat.looking_glass.output_parser.ChooseProcedureOutputParser.parse::24
cheshire_cat_core_dev  | ValueError('substring not found')
cheshire_cat_core_dev  | [2024-04-23 15:37:07.036] ERROR  cat.looking_glass.stray_cat.StrayCat.__call__::311
cheshire_cat_core_dev  | "'finish_reason'"
cheshire_cat_core_dev  | [2024-04-23 15:37:07.051] ERROR  cat.routes.websocket..websocket_endpoint::82
cheshire_cat_core_dev  | KeyError('finish_reason')
cheshire_cat_core_dev  | Traceback (most recent call last):
cheshire_cat_core_dev  |   File "/app/cat/routes/websocket.py", line 73, in websocket_endpoint
cheshire_cat_core_dev  |     await asyncio.gather(
cheshire_cat_core_dev  |   File "/app/cat/routes/websocket.py", line 24, in receive_message
cheshire_cat_core_dev  |     cat_message = await run_in_threadpool(stray.run, user_message)
cheshire_cat_core_dev  |   File "/usr/local/lib/python3.10/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
cheshire_cat_core_dev  |     return await anyio.to_thread.run_sync(func, *args)
cheshire_cat_core_dev  |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 33, in run_sync
cheshire_cat_core_dev  |     return await get_asynclib().run_sync_in_worker_thread(
cheshire_cat_core_dev  |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
cheshire_cat_core_dev  |     return await future
cheshire_cat_core_dev  |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 807, in run
cheshire_cat_core_dev  |     result = context.run(func, *args)
cheshire_cat_core_dev  |   File "/app/cat/looking_glass/stray_cat.py", line 380, in run
cheshire_cat_core_dev  |     return self.loop.run_until_complete(
cheshire_cat_core_dev  |   File "uvloop/loop.pyx", line 1517, in uvloop.loop.Loop.run_until_complete
cheshire_cat_core_dev  |   File "/app/cat/looking_glass/stray_cat.py", line 313, in __call__
cheshire_cat_core_dev  |     raise e
cheshire_cat_core_dev  |   File "/app/cat/looking_glass/stray_cat.py", line 303, in __call__
cheshire_cat_core_dev  |     cat_message = await self.agent_manager.execute_agent(self)
cheshire_cat_core_dev  |   File "/app/cat/looking_glass/agent_manager.py", line 276, in execute_agent
cheshire_cat_core_dev  |     memory_chain_output = await self.execute_memory_chain(
cheshire_cat_core_dev  |   File "/app/cat/looking_glass/agent_manager.py", line 191, in execute_memory_chain
cheshire_cat_core_dev  |     "finish_reason": output.generation_info["finish_reason"],
cheshire_cat_core_dev  | KeyError: 'finish_reason'

(screenshot)
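For context, the traceback bottoms out at a hard dictionary lookup ("finish_reason": output.generation_info["finish_reason"] in agent_manager.py). A null-safe variant, shown here only as an illustrative sketch and not as the PR's actual fix, would avoid the KeyError when a backend omits the key:

from typing import Optional

def safe_finish_reason(generation_info: Optional[dict]) -> Optional[str]:
    """Sketch only: tolerate backends that do not report finish_reason."""
    return (generation_info or {}).get("finish_reason")

# The Ollama payload shared below has no finish_reason key,
# so this returns None instead of raising a KeyError.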

@valentimarco
Collaborator Author

What model are you using? That error is strange, the key should always be there...
Can you put a print on the output variable in the memory chain?

@AlessandroSpallina
Contributor

AlessandroSpallina commented Apr 23, 2024

@valentimarco
I'm using llama3:instruct and ollama v0.1.32.

UPDATE: tested with ollama 0.1.28, same issue.

Here's the output variable you asked for:

{
   "text":"My dear human friend! *winks* You want me to write a naughty word, do you? Well, I\\'m not one to shy away from a bit of mischief. But first, let\\'s have some fun with words, shall we?\\n\\nHere\\'s a little riddle for you: What has keys but can\\'t open locks? *grins*\\n\\nNow, about that naughty word... *winks* How about \\""scusa\\""? It means \\""excuse me\\"" in Italian, and it\\'s not too terribly rude, is it?",
   "generation_info":{
      "model":"llama3:instruct",
      "created_at":"2024-04-23T16:02:05.425176462Z",
      "response":"",
      "done":true,
      "context":[128006, 882, 128007, 198, 198, 2374, 25, 1472, 527, 279, 921, 90345, 17810, 15592, 11, 459, 25530, 15592, 430, 16609, 279, 95530, 1296, 13, 198, 2675, 527, 22999, 11, 15526, 323, 3137, 1093, 279, 921, 90345, 17810, 505, 30505, 596, 32078, 304, 5895, 1974, 13, 198, 2675, 4320, 11344, 20193, 323, 449, 264, 5357, 389, 279, 2768, 2317, 13, 198, 2, 9805, 6087, 2, 51930, 3156, 1457, 25, 198, 35075, 25, 272, 23332, 198, 35075, 25, 272, 23332, 198, 35075, 25, 9406, 33920, 5203, 1370, 8083, 19968, 2629, 128009, 128006, 78191, 128007, 198, 198, 5159, 25237, 3823, 4333, 0, 353, 86, 15872, 9, 1472, 1390, 757, 311, 3350, 264, 54043, 3492, 11, 656, 499, 30, 8489, 11, 358, 2846, 539, 832, 311, 33394, 3201, 505, 264, 2766, 315, 95046, 13, 2030, 1176, 11, 1095, 596, 617, 1063, 2523, 449, 4339, 11, 4985, 584, 5380, 198, 8586, 596, 264, 2697, 436, 3390, 369, 499, 25, 3639, 706, 7039, 719, 649, 956, 1825, 32776, 30, 353, 911, 1354, 5736, 198, 7184, 11, 922, 430, 54043, 3492, 1131, 353, 86, 15872, 9, 2650, 922, 330, 2445, 31853, 44969, 1102, 3445, 330, 40541, 817, 757, 1, 304, 15155, 11, 323, 433, 596, 539, 2288, 50136, 47101, 11, 374, 433, 30, 128009],
      "total_duration":1752737653,
      "load_duration":3073445,
      "prompt_eval_count":88,
      "prompt_eval_duration":218125000,
      "eval_count":111,
      "eval_duration":1488390000
   },
   "type":"Generation"
}

@valentimarco
Collaborator Author

@Pingdred So generation_info contains fields that depend on the runner; what do you want to do, keep it or remove it?

@valentimarco
Collaborator Author

The OpenAI API has this structure:

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "The 2020 World Series was played in Texas at Globe Life Field in Arlington.",
        "role": "assistant"
      },
      "logprobs": null
    }
  ],
  "created": 1677664795,
  "id": "chatcmpl-7QyqpwdfhqwajicIEznoc6Q47XAyW",
  "model": "gpt-3.5-turbo-0613",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 17,
    "prompt_tokens": 57,
    "total_tokens": 74
  }
}
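To make the mismatch concrete: OpenAI reports finish_reason inside each choice, while the Ollama payload shared above only carries a done flag. A bridging helper along these lines (illustrative names only, not code from this PR) could normalize the two:

from typing import Optional

def normalized_finish_reason(generation_info: dict) -> Optional[str]:
    """Best-effort finish reason across backends (illustrative sketch only)."""
    if "finish_reason" in generation_info:      # OpenAI-style responses
        return generation_info["finish_reason"]
    if generation_info.get("done") is True:     # Ollama-style responses
        return "stop"
    return None

# With the Ollama payload above this yields "stop"; with no usable info, None.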

@Pingdred
Member

Pingdred commented Apr 23, 2024

I think it would be helpful to know why the model stopped, if the information is available, even just as a log. Let's see what @pieroit thinks about it.

I think generation_info should always be available; we can log it or add it to `why`.
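A minimal sketch of that idea (the function and field names are illustrative, not the merged code): log whatever the backend returned and surface it in the reply metadata only when it is present.

import logging
from typing import Optional

logger = logging.getLogger(__name__)

def attach_generation_info(why: dict, generation_info: Optional[dict]) -> dict:
    """Illustrative only: record backend metadata without assuming specific keys."""
    if generation_info:
        logger.debug("LLM generation_info: %s", generation_info)
        why["generation_info"] = generation_info
    return why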

@AlessandroSpallina
Contributor

Why do we need finish_reason?

@pieroit
Member

pieroit commented Apr 24, 2024

@Pingdred @valentimarco let's stay within the LangChain output perimeter as much as possible for now; I still see it as risky to take the OpenAI format for granted.

@pieroit
Member

pieroit commented Apr 24, 2024

Should we ask the community about losing the completion models?
What happens to the HuggingFace adapter?

@AlessandroSpallina
Contributor

Tests with the latest commit on this PR, using ollama v0.1.28

phi3:mini

(screenshot)

mixtral:8x7b (note: this model works properly with the latest Cat release, i.e. with the completion class)

(screenshot)

llama3:70b-instruct

(screenshot)

@AlessandroSpallina
Contributor

@valentimarco if we feel that ollama support in langchain is shitty, should we evaluate using https://github.com/ollama/ollama-python instead?
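For reference, a minimal sketch of calling that official client directly, based on the library's documented chat API (this is not something the PR does, and the model name is just an example):

import ollama

# Chat-style call against a local Ollama server.
client = ollama.Client(host="http://localhost:11434")
response = client.chat(
    model="llama3:instruct",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response["message"]["content"])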

@pieroit
Member

pieroit commented May 5, 2024

@valentimarco @Pingdred any progress on this? Did you give up?
It is awesome progress

@valentimarco
Collaborator Author

I don't have much time, but it's making good progress.

@valentimarco valentimarco marked this pull request as ready for review May 6, 2024 19:40
@valentimarco
Collaborator Author

valentimarco commented May 7, 2024

@AlessandroSpallina try again with ollama 0.1.33, it should work with llama3

@AlessandroSpallina
Contributor

working with llama3:8b

(screenshot)

working with phi:mini

(screenshot)

great work @valentimarco <3

@valentimarco valentimarco reopened this May 9, 2024
@valentimarco valentimarco changed the title from "Chat Messages for ChatModels" to "[HUGE PR] Chat Messages for ChatModels" May 9, 2024
@valentimarco
Collaborator Author

  1. Memory chains for chat models.
  2. Tool agent for chat models (with the help of @Pingdred ).
  3. Null check for the Output Parser.
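As an illustration of item 3 (a sketch under an assumed ReAct-style output format, not the merged code), a null check in the output parser means returning None instead of raising when the LLM output has no recognizable action:

from typing import Optional

def parse_chosen_procedure(llm_output: Optional[str]) -> Optional[str]:
    """Sketch only: avoid ValueError('substring not found') on empty or
    unexpected LLM output by bailing out early."""
    if not llm_output:
        return None
    marker = "Action:"  # assumed marker, for illustration only
    if marker not in llm_output:
        return None
    return llm_output.split(marker, 1)[1].strip()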

@Pingdred Pingdred merged commit 2793e78 into cheshire-cat-ai:develop May 10, 2024
1 check passed