Reformat and improve RAG module and agents #184
base: main
Persist function added.
persist function added.
…dd a guide agent.
Enhance RAG example
copilot dialog agents update
update as comments suggest
update as comments suggest (for docs)
Please see the inline comments.
Please see the inline comments.
The current version of rag is not compatible with distributed mode. We can add support in future PRs
update as comments
…s used in previous versions, but no longer needed.
# Conflicts: # examples/conversation_with_RAG_agents/README.md
please see inline comments
* `KnowledgeBank.add_data_as_knowledge`: creates a Knowledge module. The simplest way only requires providing `knowledge_id`, `emb_model_name`, and `data_dirs_and_types`.
```python
knowledge_bank.add_data_as_knowledge(
    knowledge_id="agentscope_tutorial_rag",
```
It's more like a knowledge name rather than ID here
ID (identity) could be either a name or number, as long as it is unique in the set-up.
```json
[
    {
        "knowledge_id": "{your_knowledge_id}",
```
If we use a config to set up the RAG module, should we consider adding a config file that explains the usage of each parameter? Just like this file in FederatedScope:
https://github.com/alibaba/FederatedScope/blob/master/federatedscope/core/configs/config.py#L258
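One way to act on this suggestion is to keep each parameter's description next to its declaration, so the explanation cannot drift from the code. The sketch below is only an illustration under assumptions: `KnowledgeConfigDoc` and `describe` are hypothetical names, and the help strings are paraphrased from this PR's description rather than taken from the actual implementation. Only the field names `knowledge_id`, `emb_model_name`, and `data_dirs_and_types` come from the PR.

```python
from dataclasses import dataclass, field, fields


@dataclass
class KnowledgeConfigDoc:
    """Hypothetical, self-documenting view of one knowledge config entry."""

    knowledge_id: str = field(
        metadata={"help": "Unique identifier (a name or a number) of the knowledge object."},
    )
    emb_model_name: str = field(
        metadata={"help": "Name of the embedding model config."},
    )
    data_dirs_and_types: dict = field(
        default_factory=dict,
        metadata={"help": "Maps each data directory to the wanted file extensions."},
    )


def describe(config_cls) -> str:
    """Render one 'name (type): help' line per declared field."""
    lines = []
    for f in fields(config_cls):
        type_name = getattr(f.type, "__name__", str(f.type))
        lines.append(f"{f.name} ({type_name}): {f.metadata['help']}")
    return "\n".join(lines)


print(describe(KnowledgeConfigDoc))
```

A generated listing like this could be printed by a CLI flag or pasted into the tutorial, which would serve the same purpose as a separate explanatory config file.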
"description": "Code-Search-Assistant is an agent that can provide answer based on AgentScope code base. It can answer questions about specific modules in AgentScope.",
"sys_prompt": "You're a coding assistant of AgentScope. The answer starts with appreciation for the question, then provide details regarding the functionality and features of the modules mentioned in the question. The language should be in a professional and simple style. The answer is limited to be less than 100 words.",
"model_config_name": "qwen_config",
"rag_config": {
Can we do this pre-processing outside the agent? For example (taking `get` as an example):
knowledge_bank = KnowledgeBank(...)
knowledges = knowledge_bank.get(knowledge_ids=["kb1", "kb2"], similarity_top_k=5, log_retrieval=5, recent_n_mem=1)
AgentClass(name="assistant", knowledges=knowledges, ...)
Alternatively, users can set up their own knowledge within the agent object's constructor.
There are two advantages:
- No need to know what parameters should be written in a rag config. All parameters are in the declaration of this `get()` function, which can be accessed easily.
- The agent is not required to have a rag config attribute.
Supported after the update. Now, knowledge can be obtained with the `get_knowledge` function, and a list of knowledge objects can be assigned to an agent at initialization.
In this update, agents are changed to use the `knowledge.retrieve` function directly (the retriever is removed). The retriever is built inside the `knowledge.retrieve` function every time it is called, with the parameters provided.
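The design described in this reply can be illustrated with a small, self-contained sketch. Everything here is a hypothetical stand-in: `ToyKnowledge`, `ToyAgent`, `gather_context`, and the word-overlap scoring are assumptions made for illustration and are not the AgentScope classes or their retrieval logic. The only points carried over from the discussion are that the agent receives a list of knowledge objects at initialization and that `retrieve` builds its retriever on each call from the parameters passed in.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ToyKnowledge:
    """Hypothetical stand-in for a knowledge object (not the AgentScope class)."""

    knowledge_id: str
    documents: List[str]

    def _build_retriever(self, similarity_top_k: int) -> Callable[[str], List[str]]:
        # A fresh retriever is built on every call, so per-call parameters
        # such as similarity_top_k never need to be stored on the agent.
        def retriever(query: str) -> List[str]:
            words = set(query.lower().split())
            # Toy relevance score: number of words shared with the query.
            ranked = sorted(
                self.documents,
                key=lambda doc: len(words & set(doc.lower().split())),
                reverse=True,
            )
            return ranked[:similarity_top_k]

        return retriever

    def retrieve(self, query: str, similarity_top_k: int = 1) -> List[str]:
        return self._build_retriever(similarity_top_k)(query)


@dataclass
class ToyAgent:
    """Hypothetical agent that receives its knowledge list at initialization."""

    name: str
    knowledge_list: List[ToyKnowledge] = field(default_factory=list)

    def gather_context(self, query: str, similarity_top_k: int = 1) -> List[str]:
        context: List[str] = []
        for knowledge in self.knowledge_list:
            context.extend(knowledge.retrieve(query, similarity_top_k))
        return context


tutorial = ToyKnowledge("kb1", ["agents send messages", "pipelines chain agents"])
code = ToyKnowledge("kb2", ["the model wrapper calls the API", "memory stores messages"])
assistant = ToyAgent(name="assistant", knowledge_list=[tutorial, code])
print(assistant.gather_context("how do agents send messages"))
# ['agents send messages', 'memory stores messages']
```

Because the retriever is rebuilt per call, the agent needs no rag config attribute, which matches the second advantage raised in the review comment.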
while True:
    # The workflow is the following:
    # 1. user input a message,
    # 2. if it mentions one of the agents, then the agent will be called
Should we tell the user that the word `mention` refers to the `@` operation here?
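To make the `@` convention concrete, here is a minimal sketch of how a message could be scanned for mentions before deciding which agent to call. It is not the example's actual code: `MENTION_PATTERN`, `mentioned_agents`, and the agent name `Tutorial-Assistant` are hypothetical, and the rule that a name consists of letters, digits, `_`, or `-` is an assumption. `Code-Search-Assistant` is the agent name that appears in this PR's config.

```python
import re
from typing import List

# Matches "@" followed by an agent name made of letters, digits, "_" or "-".
MENTION_PATTERN = re.compile(r"@([\w-]+)")


def mentioned_agents(message: str, agent_names: List[str]) -> List[str]:
    """Return the known agents mentioned in the message, in order of appearance."""
    known = set(agent_names)
    found: List[str] = []
    for name in MENTION_PATTERN.findall(message):
        # Ignore unknown names and repeated mentions of the same agent.
        if name in known and name not in found:
            found.append(name)
    return found


agents = ["Tutorial-Assistant", "Code-Search-Assistant"]
print(mentioned_agents("@Code-Search-Assistant where is the model wrapper?", agents))
# ['Code-Search-Assistant']
```

A message without any `@` yields an empty list, in which case the workflow would fall through to whatever default routing the example uses.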
Set the transformations as needed, or just use the default setting.

Args:
    config (dict): a dictionary containing configurations
Not an issue, since the function won't be exposed to users, but it would be better if the `config` arg were specified in more detail (e.g., is the `store_and_index` field required?).
plz see inline comments
```python
knowledge_bank.add_data_as_knowledge(
    knowledge_id="agentscope_tutorial_rag",
    emb_model_name="qwen_emb_config",
```
config_name or model_name here?
It is emb_model_name
src/agentscope/agents/rag_agents.py
Outdated
name: str,
sys_prompt: str,
model_config_name: str,
memory_config: Optional[dict] = None,
Consider removing `memory_config` since we never use it.
address comments
Description
Updates
Changes on code structure
- … `llama-index` as `rag_requires` in `setup.py`
Changes on the RAG agent module
- … `KnowledgeBank` feature … `KnowledgeBank` members
Changes on the RAG/knowledge module
Improving utility of knowledge module
… `KnowledgeBank` … `KnowledgeBank`:
- `KnowledgeBank` provides an easier way to initialize a knowledge object: just call `add_data_as_knowledge` with `knowledge_id` (a string as the identifier for this knowledge object), `emb_model_name` (the name of the embedding model config), and `data_dirs_and_types` (a dictionary of data directories and the wanted file extensions), as shown in `rag_example.py`.
- `KnowledgeBank` can be shared and duplicated by multiple agents, which avoids embedding duplicated documents.
- … `"knowledge_id"` in `knowledge_config.json`) with associated retrievers to perform multi-source information retrieval. Just pass the agent into the `KnowledgeBank.equip` function.
Tutorial
Both English and Chinese tutorials are added as `209-rag.md`.
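The `KnowledgeBank` workflow from the description above (register a knowledge object once under a unique `knowledge_id`, then share it between agents) can be sketched as follows. This is a toy model under stated assumptions, not the PR's implementation: `ToyKnowledgeBank`, `SimpleAgent`, the `knowledge_ids` argument of `equip`, the dictionary used to store a knowledge object, and the `docs/tutorial` path are all hypothetical. Only the method names `add_data_as_knowledge`, `get_knowledge`, and `equip` and the three parameter names come from this PR.

```python
from typing import Dict, List


class SimpleAgent:
    """Hypothetical agent that only holds a list of knowledge objects."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.knowledge_list: List[dict] = []


class ToyKnowledgeBank:
    """Hypothetical stand-in that mimics the workflow, not the real KnowledgeBank."""

    def __init__(self) -> None:
        self.stored_knowledge: Dict[str, dict] = {}

    def add_data_as_knowledge(
        self,
        knowledge_id: str,
        emb_model_name: str,
        data_dirs_and_types: Dict[str, List[str]],
    ) -> None:
        # IDs must be unique, so a second registration is rejected instead of
        # silently embedding the same documents again.
        if knowledge_id in self.stored_knowledge:
            raise ValueError(f"knowledge_id {knowledge_id!r} already exists")
        self.stored_knowledge[knowledge_id] = {
            "emb_model_name": emb_model_name,
            "data_dirs_and_types": data_dirs_and_types,
        }

    def get_knowledge(self, knowledge_id: str) -> dict:
        return self.stored_knowledge[knowledge_id]

    def equip(self, agent: SimpleAgent, knowledge_ids: List[str]) -> None:
        # Each agent receives a reference to the same stored object.
        agent.knowledge_list = [self.get_knowledge(k) for k in knowledge_ids]


bank = ToyKnowledgeBank()
bank.add_data_as_knowledge(
    knowledge_id="agentscope_tutorial_rag",
    emb_model_name="qwen_emb_config",
    data_dirs_and_types={"docs/tutorial": [".md"]},  # hypothetical path
)
tutor, coder = SimpleAgent("tutor"), SimpleAgent("coder")
bank.equip(tutor, ["agentscope_tutorial_rag"])
bank.equip(coder, ["agentscope_tutorial_rag"])
print(tutor.knowledge_list[0] is coder.knowledge_list[0])  # True
```

Sharing by reference is what lets several agents use one knowledge object without embedding the same documents more than once; whether the real `equip` shares or copies is controlled by the implementation and is not shown here.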
Checklist
Please check the following items before code is ready to be reviewed.