
[HUGE PR] Chat Messages for ChatModels #783

Merged
merged 7 commits into cheshire-cat-ai:develop on May 10, 2024

Conversation

valentimarco
Collaborator

Description

Yes, we did it...

@AlessandroSpallina
Contributor

AlessandroSpallina commented Apr 23, 2024

I was testing this PR and got a `KeyError: 'finish_reason'`; leaving the log and a screenshot here.

I'm using ollama v0.1.32

cheshire_cat_core_dev  | /usr/local/lib/python3.10/site-packages/langchain_core/_api/deprecation.py:119: LangChainDeprecationWarning: The class `LLMSingleActionAgent` was deprecated in LangChain 0.1.0 and will be removed in 0.2.0. Use Use new agent constructor methods like create_react_agent, create_json_agent, create_structured_chat_agent, etc. instead.
cheshire_cat_core_dev  |   warn_deprecated(
cheshire_cat_core_dev  | [2024-04-23 15:37:06.096] ERROR  cat.looking_glass.output_parser.ChooseProcedureOutputParser.parse::24
cheshire_cat_core_dev  | ValueError('substring not found')
cheshire_cat_core_dev  | [2024-04-23 15:37:07.036] ERROR  cat.looking_glass.stray_cat.StrayCat.__call__::311
cheshire_cat_core_dev  | "'finish_reason'"
cheshire_cat_core_dev  | [2024-04-23 15:37:07.051] ERROR  cat.routes.websocket..websocket_endpoint::82
cheshire_cat_core_dev  | KeyError('finish_reason')
cheshire_cat_core_dev  | Traceback (most recent call last):
cheshire_cat_core_dev  |   File "/app/cat/routes/websocket.py", line 73, in websocket_endpoint
cheshire_cat_core_dev  |     await asyncio.gather(
cheshire_cat_core_dev  |   File "/app/cat/routes/websocket.py", line 24, in receive_message
cheshire_cat_core_dev  |     cat_message = await run_in_threadpool(stray.run, user_message)
cheshire_cat_core_dev  |   File "/usr/local/lib/python3.10/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
cheshire_cat_core_dev  |     return await anyio.to_thread.run_sync(func, *args)
cheshire_cat_core_dev  |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 33, in run_sync
cheshire_cat_core_dev  |     return await get_asynclib().run_sync_in_worker_thread(
cheshire_cat_core_dev  |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
cheshire_cat_core_dev  |     return await future
cheshire_cat_core_dev  |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 807, in run
cheshire_cat_core_dev  |     result = context.run(func, *args)
cheshire_cat_core_dev  |   File "/app/cat/looking_glass/stray_cat.py", line 380, in run
cheshire_cat_core_dev  |     return self.loop.run_until_complete(
cheshire_cat_core_dev  |   File "uvloop/loop.pyx", line 1517, in uvloop.loop.Loop.run_until_complete
cheshire_cat_core_dev  |   File "/app/cat/looking_glass/stray_cat.py", line 313, in __call__
cheshire_cat_core_dev  |     raise e
cheshire_cat_core_dev  |   File "/app/cat/looking_glass/stray_cat.py", line 303, in __call__
cheshire_cat_core_dev  |     cat_message = await self.agent_manager.execute_agent(self)
cheshire_cat_core_dev  |   File "/app/cat/looking_glass/agent_manager.py", line 276, in execute_agent
cheshire_cat_core_dev  |     memory_chain_output = await self.execute_memory_chain(
cheshire_cat_core_dev  |   File "/app/cat/looking_glass/agent_manager.py", line 191, in execute_memory_chain
cheshire_cat_core_dev  |     "finish_reason": output.generation_info["finish_reason"],
cheshire_cat_core_dev  | KeyError: 'finish_reason'

(screenshot)
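For context, the traceback bottoms out at a hard dictionary lookup ("finish_reason": output.generation_info["finish_reason"] in agent_manager.py). A null-safe variant, shown here only as an illustrative sketch and not as the PR's actual fix, would avoid the KeyError when a backend omits the key:

from typing import Optional

def safe_finish_reason(generation_info: Optional[dict]) -> Optional[str]:
    """Sketch only: tolerate backends that do not report finish_reason."""
    return (generation_info or {}).get("finish_reason")

# The Ollama payload shared below has no finish_reason key,
# so this returns None instead of raising a KeyError.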

@valentimarco
Collaborator Author

What model are you using? That error is strange, the key should always be there...
Can you put a print on the output variable in the memory chain?

@AlessandroSpallina
Contributor

AlessandroSpallina commented Apr 23, 2024

@valentimarco
I'm using llama3:instruct and ollama v0.1.32.

UPDATE: tested with ollama 0.1.28, same issue.

Here's the output variable you asked for:

{
   "text":"My dear human friend! *winks* You want me to write a naughty word, do you? Well, I\\'m not one to shy away from a bit of mischief. But first, let\\'s have some fun with words, shall we?\\n\\nHere\\'s a little riddle for you: What has keys but can\\'t open locks? *grins*\\n\\nNow, about that naughty word... *winks* How about \\""scusa\\""? It means \\""excuse me\\"" in Italian, and it\\'s not too terribly rude, is it?",
   "generation_info":{
      "model":"llama3:instruct",
      "created_at":"2024-04-23T16:02:05.425176462Z",
      "response":"",
      "done":true,
      "context":[128006, 882, 128007, 198, 198, 2374, 25, 1472, 527, 279, 921, 90345, 17810, 15592, 11, 459, 25530, 15592, 430, 16609, 279, 95530, 1296, 13, 198, 2675, 527, 22999, 11, 15526, 323, 3137, 1093, 279, 921, 90345, 17810, 505, 30505, 596, 32078, 304, 5895, 1974, 13, 198, 2675, 4320, 11344, 20193, 323, 449, 264, 5357, 389, 279, 2768, 2317, 13, 198, 2, 9805, 6087, 2, 51930, 3156, 1457, 25, 198, 35075, 25, 272, 23332, 198, 35075, 25, 272, 23332, 198, 35075, 25, 9406, 33920, 5203, 1370, 8083, 19968, 2629, 128009, 128006, 78191, 128007, 198, 198, 5159, 25237, 3823, 4333, 0, 353, 86, 15872, 9, 1472, 1390, 757, 311, 3350, 264, 54043, 3492, 11, 656, 499, 30, 8489, 11, 358, 2846, 539, 832, 311, 33394, 3201, 505, 264, 2766, 315, 95046, 13, 2030, 1176, 11, 1095, 596, 617, 1063, 2523, 449, 4339, 11, 4985, 584, 5380, 198, 8586, 596, 264, 2697, 436, 3390, 369, 499, 25, 3639, 706, 7039, 719, 649, 956, 1825, 32776, 30, 353, 911, 1354, 5736, 198, 7184, 11, 922, 430, 54043, 3492, 1131, 353, 86, 15872, 9, 2650, 922, 330, 2445, 31853, 44969, 1102, 3445, 330, 40541, 817, 757, 1, 304, 15155, 11, 323, 433, 596, 539, 2288, 50136, 47101, 11, 374, 433, 30, 128009],
      "total_duration":1752737653,
      "load_duration":3073445,
      "prompt_eval_count":88,
      "prompt_eval_duration":218125000,
      "eval_count":111,
      "eval_duration":1488390000
   },
   "type":"Generation"
}

@valentimarco
Collaborator Author

@Pingdred So generation_info contains fields that depend on the runner; what do you want to do, keep it or remove it?

@valentimarco
Collaborator Author

The OpenAI API has this structure:

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "The 2020 World Series was played in Texas at Globe Life Field in Arlington.",
        "role": "assistant"
      },
      "logprobs": null
    }
  ],
  "created": 1677664795,
  "id": "chatcmpl-7QyqpwdfhqwajicIEznoc6Q47XAyW",
  "model": "gpt-3.5-turbo-0613",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 17,
    "prompt_tokens": 57,
    "total_tokens": 74
  }
}
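To make the mismatch concrete: OpenAI reports finish_reason inside each choice, while the Ollama payload shared above only carries a done flag. A bridging helper along these lines (illustrative names only, not code from this PR) could normalize the two:

from typing import Optional

def normalized_finish_reason(generation_info: dict) -> Optional[str]:
    """Best-effort finish reason across backends (illustrative sketch only)."""
    if "finish_reason" in generation_info:      # OpenAI-style responses
        return generation_info["finish_reason"]
    if generation_info.get("done") is True:     # Ollama-style responses
        return "stop"
    return None

# With the Ollama payload above this yields "stop"; with no usable info, None.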

@Pingdred
Member

Pingdred commented Apr 23, 2024

I think it would be helpful to know why the model stopped, if the information is available, even just as a log. Let's see what @pieroit thinks about it.

I think generation_info should always be available; we can log it or add it to `why`.
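A minimal sketch of that idea (the function and field names are illustrative, not the merged code): log whatever the backend returned and surface it in the reply metadata only when it is present.

import logging
from typing import Optional

logger = logging.getLogger(__name__)

def attach_generation_info(why: dict, generation_info: Optional[dict]) -> dict:
    """Illustrative only: record backend metadata without assuming specific keys."""
    if generation_info:
        logger.debug("LLM generation_info: %s", generation_info)
        why["generation_info"] = generation_info
    return why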

@AlessandroSpallina
Contributor

Why do we need finish_reason?

@pieroit
Member

pieroit commented Apr 24, 2024

@Pingdred @valentimarco let's stay within the LangChain output perimeter as much as possible for now; I still see it as risky to take the OpenAI format for granted.

@pieroit
Member

pieroit commented Apr 24, 2024

Should we ask the community about losing the completion models?
What happens to the HuggingFace adapter?

@AlessandroSpallina
Contributor

Tests with the latest commit on this PR, using ollama v0.1.28

phi3:mini

(screenshot)

mixtral:8x7b (note: this model works properly with the latest Cat release, i.e. with the completion class)

(screenshot)

llama3:70b-instruct

(screenshot)

@AlessandroSpallina
Contributor

@valentimarco if we feel that ollama support in langchain is shitty, should we evaluate using https://github.com/ollama/ollama-python instead?
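For reference, a minimal sketch of calling that official client directly, based on the library's documented chat API (this is not something the PR does, and the model name is just an example):

import ollama

# Chat-style call against a local Ollama server.
client = ollama.Client(host="http://localhost:11434")
response = client.chat(
    model="llama3:instruct",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response["message"]["content"])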

@pieroit
Member

pieroit commented May 5, 2024

@valentimarco @Pingdred any progress on this? Did you give up?
It is awesome progress

@valentimarco
Collaborator Author

I don't have much time, but it's making good progress.

@valentimarco valentimarco marked this pull request as ready for review May 6, 2024 19:40
@valentimarco
Collaborator Author

valentimarco commented May 7, 2024

@AlessandroSpallina try again with ollama 0.1.33, it should work with llama3

@AlessandroSpallina
Contributor

working with llama3:8b

(screenshot)

working with phi:mini

(screenshot)

great work @valentimarco <3

@valentimarco valentimarco reopened this May 9, 2024
@valentimarco valentimarco changed the title from "Chat Messages for ChatModels" to "[HUGE PR] Chat Messages for ChatModels" May 9, 2024
@valentimarco
Collaborator Author

  1. Memory chains for chat models.
  2. Tool agent for chat models (with the help of @Pingdred ).
  3. Null check for the Output Parser.
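As an illustration of item 3 (a sketch under an assumed ReAct-style output format, not the merged code), a null check in the output parser means returning None instead of raising when the LLM output has no recognizable action:

from typing import Optional

def parse_chosen_procedure(llm_output: Optional[str]) -> Optional[str]:
    """Sketch only: avoid ValueError('substring not found') on empty or
    unexpected LLM output by bailing out early."""
    if not llm_output:
        return None
    marker = "Action:"  # assumed marker, for illustration only
    if marker not in llm_output:
        return None
    return llm_output.split(marker, 1)[1].strip()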

@Pingdred Pingdred merged commit 2793e78 into cheshire-cat-ai:develop May 10, 2024
1 check passed