Occasional final response outputting entities #35

Open
kevinl424 opened this issue May 11, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@kevinl424
Collaborator

  • The final response occasionally outputs the knowledge entities instead of a correct final answer to the question.
  • Tweaking and modifying the system persona does not seem to help.
  • https://github.com/seyeong-han/KnowledgeGraphRAG did not appear to have this issue; the final response in this older agent output a valid answer each time.
  • The issue seemed to arise after the top-K entity fix, where only the top 20 entities were fed into the LLM.
@kevinl424 kevinl424 added the bug Something isn't working label May 11, 2024
@kingjulio8238
Owner

Could this bug arise from the order in which we pass the top-K entities into the model? Not placing them first or last in the prompt could potentially help with this.
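
A minimal sketch of that idea (hypothetical function and argument names; assumes the prompt is assembled from a persona, the top-K entities, and the question), placing the entity block in the middle of the prompt rather than at either end:

```python
# Hypothetical prompt assembly: put the top-K entities between the persona and
# the question, so they sit in the middle of the prompt rather than first or last.
def build_prompt(persona: str, question: str, top_k_entities: list[str]) -> str:
    entity_block = "\n".join(f"- {e}" for e in top_k_entities)
    return (
        f"{persona}\n\n"
        f"Relevant knowledge entities:\n{entity_block}\n\n"
        f"Question: {question}\n"
        f"Answer the question directly; do not repeat the entities."
    )
```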

@kevinl424
Collaborator Author

kevinl424 commented May 13, 2024

Update
Using gpt-3.5-preview or any more advanced model resolves the issue, and the final response works quite well. Running Llama 3 8B produces the initial response fine, but when it is fed all of the persona and knowledge entity store information, it tends to get confused and output the wrong thing.

This seems to be largely due to model ability. A potential solution could be a method to detect which Ollama models the user has pulled and let them select from there (see the sketch below). That would let users run much larger models (Llama 3 70B, etc.) that output valid responses more consistently while still running locally. We should also include a note that Llama 3 8B does not always produce a correct final response.
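
A minimal sketch of the model-detection idea, assuming Ollama is running on its default port and using its `/api/tags` route, which lists locally pulled models:

```python
import requests

# List the models the user has pulled locally via Ollama's /api/tags endpoint,
# then let them pick one to use for the final response.
def list_local_models(base_url: str = "http://localhost:11434") -> list[str]:
    resp = requests.get(f"{base_url}/api/tags", timeout=10)
    resp.raise_for_status()
    return [m["name"] for m in resp.json().get("models", [])]

if __name__ == "__main__":
    models = list_local_models()
    for i, name in enumerate(models):
        print(f"[{i}] {name}")
    choice = int(input("Select a model: "))
    print(f"Using {models[choice]}")
```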

@kevinl424
Collaborator Author

Further testing using the requests library to get the response (which makes it possible to isolate the number of tokens used in the prompt; see the sketch below) suggests that the context is not overflowing. I would like to try 70B, but as noted in the Ollama repo, you need "32 GB to run the 33B models" and even more for 70B.
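
For reference, a rough sketch of that check, assuming a local Ollama server and its `/api/generate` route; the non-streaming response includes a `prompt_eval_count` field with the number of prompt tokens evaluated:

```python
import requests

# Send the full prompt (persona + entities + question) to Ollama directly and
# inspect prompt_eval_count to see how many prompt tokens were actually used.
def count_prompt_tokens(model: str, prompt: str,
                        base_url: str = "http://localhost:11434") -> int:
    resp = requests.post(
        f"{base_url}/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    data = resp.json()
    print(data["response"])           # the model's final answer
    return data["prompt_eval_count"]  # tokens consumed by the prompt
```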
