Any clue how to make the agent access the KV Dataset on LangSmith? #70

Open
venturaEffect opened this issue Jan 17, 2024 · 0 comments

Hi,

I'm trying to give my agent access to a KV Dataset on LangSmith, but it just has no clue. I've tried everything, and from what I've found so far there is no good explanation of how to accomplish this. It is also very strange that the Discord server offers almost no support.

I'm currently lost and have no idea how to continue. Everything seems OK, but the agent just never looks at the KV Dataset on LangSmith.
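
For what it's worth, a minimal check like the one below does list the dataset (assuming the same LANGCHAIN_API_KEY and LANGCHAIN_ENDPOINT from my .env), so the client and credentials at least seem fine:

        import os
        from dotenv import load_dotenv
        from langsmith import Client

        load_dotenv()
        client = Client(api_key=os.getenv("LANGCHAIN_API_KEY"), api_url=os.getenv("LANGCHAIN_ENDPOINT"))

        # Print every dataset visible to this API key; "ai_films_business_db" shows up here
        for ds in client.list_datasets():
            print(ds.name, ds.id)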

Here is the full code:

        import os
        from dotenv import load_dotenv
        from langchain_community.llms import Ollama
        from langchain.chains import LLMChain
        from langchain.prompts import ChatPromptTemplate, SystemMessagePromptTemplate, HumanMessagePromptTemplate
        from langsmith import Client
        from langchain.memory import ConversationBufferMemory
        from langchain.tools import Tool
        from langchain_community.utilities import GoogleSearchAPIWrapper

        # Load environment variables
        load_dotenv()

        # LangChain and Google Search setup (LangChain reads LANGCHAIN_TRACING_V2 and
        # LANGCHAIN_PROJECT directly from the environment; only the key and endpoint
        # are needed as locals, for the LangSmith client below)
        langchain_api_key = os.getenv("LANGCHAIN_API_KEY")
        langchain_endpoint = os.getenv("LANGCHAIN_ENDPOINT")
        os.environ["GOOGLE_CSE_ID"] = os.getenv("GOOGLE_CSE_ID")
        os.environ["GOOGLE_API_KEY"] = os.getenv("GOOGLE_API_KEY")

        # Initialize LangSmith Client
        client = Client(api_key=langchain_api_key, api_url=langchain_endpoint)

        # Define the dataset name
        dataset_name = "ai_films_business_db"

        # LangSmith Dataset functions
        def get_dataset_id(dataset_name):
            # Retrieve the dataset ID based on the given dataset name
            datasets = client.list_datasets()
            for dataset in datasets:
                if dataset.name == dataset_name:
                    return dataset.id
            raise ValueError(f"Dataset '{dataset_name}' not found.")

        def add_conversation_to_dataset(conversation_history, dataset_id):
            for user_input, ai_response in conversation_history:
                client.create_example(
                    inputs={"instruction": "Input to AI_FILMS_BUSINESS_GUY " + user_input},
                    outputs={"output": "Output from AI_FILMS_BUSINESS_GUY " + ai_response},
                    dataset_id=dataset_id
                )

        # Initialize the LLM (Ollama)
        llm = Ollama(model="dolphin-mistral", temperature=0.2)

        # Initialize memory for the conversation chain
        # (note: the prompt below has no {history} placeholder, so LLMChain drops
        # the memory variables before they ever reach the model)
        memory = ConversationBufferMemory()

        # Define a prompt template
        prompt_template = ChatPromptTemplate(
            messages=[
                SystemMessagePromptTemplate.from_template("You are a professional business expert in streaming, film production, internet business models, intellectual property, copyright, entertainment, and artificial intelligence."),
                HumanMessagePromptTemplate.from_template("{input}")
            ]
        )

        # Define the LangSmithKVTool class using composition
        class LangSmithKVTool:
            def __init__(self, client, dataset_name):
                self.tool = Tool(name='LangSmithKVTool', func=self.retrieve_from_dataset, description='A tool for RAG using LangSmith KV dataset')
                self.client = client
                self.dataset_name = dataset_name

            def retrieve_from_dataset(self, prompt):
                # Retrieve the dataset ID
                dataset_id = get_dataset_id(self.dataset_name)
                # Read the examples within the dataset using the dataset_id
                dataset_examples = self.client.list_examples(dataset_id=dataset_id)
                # Process the dataset examples to find relevant information
                relevant_info = self.process_query_results(dataset_examples, prompt)
                return relevant_info

            def process_query_results(self, dataset_examples, prompt):
                compiled_info = ""
                for example in dataset_examples:
                    # inputs/outputs may be None on an example, so guard before .get()
                    instruction = (example.inputs or {}).get('instruction') or ""
                    output = (example.outputs or {}).get('output') or ""
                    # Exact, case-sensitive substring match against the stored instruction
                    if prompt in instruction:
                        compiled_info += output + " "
                return compiled_info.strip()


        # Build the LLMChain (the KV tool is invoked manually before each call
        # in run_conversation; LLMChain itself has no tool-calling support)
        chain = LLMChain(
            llm=llm,
            prompt=prompt_template,
            memory=memory
        )

        # Initialize the LangSmithKVTool instance
        langsmith_kv_tool = LangSmithKVTool(client, dataset_name)

        # Conversation function
        def run_conversation():
            dataset_id = get_dataset_id(dataset_name)
            conversation_history = []
            conversation_active = True
            while conversation_active:
                user_input = input("You: ")
                if user_input.lower() == "end_":
                    conversation_active = False
                    continue
                # Retrieve contextually relevant information using LangSmithKVTool
                context_info = langsmith_kv_tool.retrieve_from_dataset(user_input)
                # Combine user input with context information
                combined_input = f"{context_info} {user_input}" if context_info else user_input
                # Pass the combined input to the LLMChain using the invoke method
                response = chain.invoke({'input': combined_input})
                # Extract the response text
                response_text = response.get('text', "Sorry, I can't provide an answer right now.")
                conversation_history.append((user_input, response_text))
                print("AI:", response_text)
                if "save_" in user_input:
                    add_conversation_to_dataset(conversation_history[-1:], dataset_id)
            return conversation_history

        # Main execution
        if __name__ == "__main__":
            run_conversation()
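
One thing I suspect: retrieve_from_dataset only returns anything when the new prompt appears verbatim, case-sensitively, inside one of the stored instructions, so unless I repeat an earlier question word for word it finds nothing. A looser keyword-overlap lookup like this sketch is what I'd try next (the min_overlap threshold of 2 is just a guess):

        def retrieve_loosely(client, dataset_id, prompt, min_overlap=2):
            # Match on shared lowercase keywords instead of an exact substring
            prompt_words = set(prompt.lower().split())
            compiled_info = ""
            for example in client.list_examples(dataset_id=dataset_id):
                instruction = (example.inputs or {}).get('instruction') or ""
                output = (example.outputs or {}).get('output') or ""
                if len(prompt_words & set(instruction.lower().split())) >= min_overlap:
                    compiled_info += output + " "
            return compiled_info.strip()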

Any help would be much appreciated.
