tool: Implement E2B's code interpreter #44

Open. Wants to merge 13 commits into base: main
9 changes: 5 additions & 4 deletions README.md
@@ -35,7 +35,7 @@ Follow the instructions to configure the model - either AWS Sagemaker, Azure, or

<details>
<summary>Environment variables</summary>

### Cohere Platform

- `COHERE_API_KEY`: If your application will interface with Cohere's API, you will need to supply an API key. Not required if using AWS Sagemaker or Azure.
@@ -57,6 +57,7 @@ Then you will need to set up authorization, [see more details here](https://aws.

- `PYTHON_INTERPRETER_URL`: URL to the python interpreter container. Defaults to http://localhost:8080.
- `TAVILY_API_KEY`: If you want to enable internet search, you will need to supply a Tavily API Key. Not required.
- `E2B_API_KEY`: To enable the E2B code interpreter, which is backed by a Jupyter server with terminal, filesystem, and network access for installing packages, supply an E2B API key. Not required.

</details>

@@ -297,14 +298,14 @@ To create your own tools or add custom data sources, see our guide: [tools and r

### Langchain Multihop

Chatting with multihop tool usage through Langchain is enabled by setting the experimental feature flag to `True` in `.env`.

```bash
USE_EXPERIMENTAL_LANGCHAIN=True
```

When this flag is set to true, only tools that have a Langchain implementation can be used.
These exist under `LANGCHAIN_TOOLS` and require a `to_langchain_tool()` function on the tool implementation which returns a Langchain-compatible tool.
Python interpreter and Tavily Internet search are provided in the toolkit by default once the environment is set up.

Example API call:
14 changes: 10 additions & 4 deletions cli/main.py
@@ -27,16 +27,17 @@ class DeploymentName(StrEnum):

class ToolName(StrEnum):
    PythonInterpreter = "Python Interpreter"
    CodeInterpreter = "E2B Code Interpreter"
    TavilyInternetSearch = "Tavily Internet Search"


WELCOME_MESSAGE = r"""
█████╗ █████╗ ██╗ ██╗███████╗██████╗ ███████╗ ████████╗ █████╗ █████╗ ██╗ ██╗ ██╗██╗████████╗
██╔══██╗██╔══██╗██║ ██║██╔════╝██╔══██╗██╔════╝ ╚══██╔══╝██╔══██╗██╔══██╗██║ ██║ ██╔╝██║╚══██╔══╝
██║ ╚═╝██║ ██║███████║█████╗ ██████╔╝█████╗ ██║ ██║ ██║██║ ██║██║ █████═╝ ██║ ██║
██║ ██╗██║ ██║██╔══██║██╔══╝ ██╔══██╗██╔══╝ ██║ ██║ ██║██║ ██║██║ ██╔═██╗ ██║ ██║
╚█████╔╝╚█████╔╝██║ ██║███████╗██║ ██║███████╗ ██║ ╚█████╔╝╚█████╔╝███████╗██║ ╚██╗██║ ██║
╚════╝ ╚════╝ ╚═╝ ╚═╝╚══════╝╚═╝ ╚═╝╚══════╝ ╚═╝ ╚════╝ ╚════╝ ╚══════╝╚═╝ ╚═╝╚═╝ ╚═╝
"""
DATABASE_URL_DEFAULT = "postgresql+psycopg2://postgres:postgres@db:5432"
PYTHON_INTERPRETER_URL_DEFAULT = "http://localhost:8080"
@@ -214,6 +215,11 @@ def show_examples():
            "PYTHON_INTERPRETER_URL",
        ],
    },
    ToolName.CodeInterpreter: {
        "secrets": [
            "E2B_API_KEY",
        ],
    },
    ToolName.TavilyInternetSearch: {
        "secrets": [
            "TAVILY_API_KEY",
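The hunk above maps each tool to the secrets it needs. A minimal sketch of how such a mapping could be used to report unset secrets follows. The names `TOOL_SECRETS` and `missing_secrets` are hypothetical and do not appear in the PR; the enum values and secret names are copied from the diff, and only the two entries shown in full are included.

```python
import os
from enum import Enum
from typing import Dict, List, Mapping


class ToolName(str, Enum):
    # Values copied from the diff; the real code uses StrEnum.
    CodeInterpreter = "E2B Code Interpreter"
    TavilyInternetSearch = "Tavily Internet Search"


# Hypothetical name for the tool-to-secrets mapping shown in the hunk.
TOOL_SECRETS: Dict[ToolName, Dict[str, List[str]]] = {
    ToolName.CodeInterpreter: {"secrets": ["E2B_API_KEY"]},
    ToolName.TavilyInternetSearch: {"secrets": ["TAVILY_API_KEY"]},
}


def missing_secrets(tool: ToolName, env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the secrets for `tool` that are absent or empty in `env`."""
    return [name for name in TOOL_SECRETS[tool]["secrets"] if not env.get(name)]
```

Passing an explicit `env` mapping keeps the helper easy to exercise without touching the real environment.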
6 changes: 3 additions & 3 deletions docs/custom_tool_guides/tool_guide.md
@@ -3,7 +3,7 @@ Follow these instructions to create your own custom tools.

## Step 1: Choose a Tool to Implement

You can take a tool implementation easily from:

- LangChain
- Tools: [Tools | 🦜️🔗 LangChain](https://python.langchain.com/docs/integrations/tools/)
@@ -105,9 +105,9 @@ Note that all Retrievers should return a list of Dicts, and each Dict should con

### Implementing a Function Tool

Add the implementation inside a tool class that inherits from `BaseFunctionTool` and implements the function `def call(self, parameters: str, **kwargs: Any) -> List[Dict[str, Any]]:`

For example, for a calculator:

```python
from typing import Any
70 changes: 65 additions & 5 deletions poetry.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions pyproject.toml
@@ -35,6 +35,7 @@ py-expression-eval = "^0.3.14"
tavily-python = "^0.3.3"
arxiv = "^2.1.0"
xmltodict = "^0.13.0"
e2b-code-interpreter = "^0.0.3"

[tool.poetry.group.dev.dependencies]
pytest = "^7.1.2"
39 changes: 36 additions & 3 deletions src/backend/config/tools.py
@@ -3,14 +3,18 @@
from enum import StrEnum

from backend.schemas.tool import Category, ManagedTool
from backend.tools.function_tools import calculator, python_interpreter
from backend.tools.function_tools import (
    calculator,
    python_interpreter,
    code_interpreter,
)
from backend.tools.retrieval import arxiv, lang_chain, llama_index, pub_med, tavily

"""
List of available tools. Each tool should have a name, implementation, is_visible and category.
They can also have kwargs if necessary.

You can switch the visibility of a tool by changing the is_visible parameter to True or False.
If a tool is not visible, it will not be shown in the frontend.

If you want to add a new tool, check the instructions on how to implement a retriever in the documentation.
@@ -23,6 +27,7 @@ class ToolName(StrEnum):
    File_Upload_Langchain = "File Reader"
    File_Upload_LlamaIndex = "File Reader - LlamaIndex"
    Python_Interpreter = "Python_Interpreter"
    Code_Interpreter = "Code_Interpreter"
    Calculator = "Calculator"
    Tavily_Internet_Search = "Internet Search"
    Arxiv = "Arxiv"
@@ -68,6 +73,20 @@ class ToolName(StrEnum):
        category=Category.Function,
        description="Runs python code in a sandbox.",
    ),
    ToolName.Code_Interpreter: ManagedTool(
        name=ToolName.Code_Interpreter,
        implementation=code_interpreter.CodeInterpreterFunctionTool,
        parameter_definitions={
            "code": {
                "description": "The python code to execute in a single cell",
                "type": "str",
                "required": True,
            }
        },
        is_visible=True,
        category=Category.Function,
        description="Executes Python code in a Jupyter notebook cell and returns any rich data (e.g. charts), stdout, stderr, and error.",
    ),
    ToolName.Calculator: ManagedTool(
        name=ToolName.Calculator,
        implementation=calculator.CalculatorFunctionTool,
@@ -113,6 +132,20 @@ class ToolName(StrEnum):
        is_visible=True,
        description="Runs python code in a sandbox.",
    ),
    ToolName.Code_Interpreter: ManagedTool(
        name=ToolName.Code_Interpreter,
        implementation=code_interpreter.CodeInterpreterFunctionTool,
        parameter_definitions={
            "code": {
                "description": "The python code to execute in a single cell",
                "type": "str",
                "required": True,
            }
        },
        is_visible=True,
        category=Category.Function,
        description="Executes Python code in a Jupyter notebook cell and returns any rich data (e.g. charts), stdout, stderr, and error.",
    ),
    ToolName.Tavily_Internet_Search: ManagedTool(
        name=ToolName.Tavily_Internet_Search,
        implementation=tavily.TavilyInternetSearch,
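The `parameter_definitions` entry above declares a single required `code` parameter of type `str`. As a rough illustration of what such a declaration makes possible, the sketch below checks a parameter dict against it before a tool is called. The helper `validate_parameters` and the constant `CODE_INTERPRETER_PARAMS` are hypothetical; the PR does not show whether or how the toolkit validates parameters.

```python
from typing import Any, Dict, List

# Copied from the Code_Interpreter entry in the diff.
CODE_INTERPRETER_PARAMS: Dict[str, Dict[str, Any]] = {
    "code": {
        "description": "The python code to execute in a single cell",
        "type": "str",
        "required": True,
    }
}


def validate_parameters(
    definitions: Dict[str, Dict[str, Any]], parameters: Dict[str, Any]
) -> List[str]:
    """Return a list of problems; an empty list means the parameters are acceptable."""
    problems: List[str] = []
    for name, spec in definitions.items():
        if name not in parameters:
            # Only complain about absent parameters that are marked required.
            if spec.get("required", False):
                problems.append(f"missing required parameter: {name}")
            continue
        # This sketch only understands the "str" type used in the diff.
        if spec.get("type") == "str" and not isinstance(parameters[name], str):
            problems.append(f"parameter {name} should be a str")
    return problems
```

Returning a list of problems rather than raising lets a caller decide whether to reject the call or pass the message back to the model.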
57 changes: 57 additions & 0 deletions src/backend/tools/function_tools/code_interpreter.py
@@ -0,0 +1,57 @@
import os
from typing import Any

from langchain_core.tools import Tool as LangchainTool
from pydantic.v1 import BaseModel, Field
from e2b_code_interpreter import CodeInterpreter

from backend.tools.function_tools.base import BaseFunctionTool


class LangchainCodeInterpreterToolInput(BaseModel):
    code: str = Field(description="Python code to execute.")


class CodeInterpreterFunctionTool(BaseFunctionTool):
    """
    Runs arbitrary code in a Python Jupyter notebook inside an E2B sandbox.
    Requires the E2B_API_KEY environment variable to create the sandbox.
    """

    def __init__(self):
        # Instantiate the E2B sandbox. This is a long-lived object
        # that pings the E2B cloud to keep the sandbox alive.
        if "E2B_API_KEY" not in os.environ:
            raise Exception(
                "Code Interpreter tool called while the E2B_API_KEY environment variable is not set. Get your E2B API key at https://e2b.dev/docs and set the E2B_API_KEY environment variable."
            )
        self.code_interpreter = CodeInterpreter()

    def call(self, parameters: dict, **kwargs: Any):
        # TODO: E2B supports generating and streaming charts and other rich data
        # because it has a full Jupyter server running inside the sandbox.
        # What's the best way to send this data back to the frontend and render it in chat?

        code = parameters.get("code", "")
        print("Code to run", code)
        execution = self.code_interpreter.notebook.exec_cell(code)
        return {
            "results": execution.results,
            "stdout": execution.logs.stdout,
            "stderr": execution.logs.stderr,
            "error": execution.error,
        }

    # Langchain passes only a code string as the argument, not a dict of parameters
    def langchain_call(self, code: str):
        return self.call({"code": code})

    def to_langchain_tool(self) -> LangchainTool:
        tool = LangchainTool(
            name="code_interpreter",
            description="Executes Python code in a Jupyter notebook cell and returns any rich data (e.g. charts), stdout, stderr, and error.",
            func=self.langchain_call,
        )
        tool.args_schema = LangchainCodeInterpreterToolInput
        return tool
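
The dict-versus-string adapter at the end of this file can be exercised without an E2B account. The sketch below swaps the real `CodeInterpreter` for an in-process stub. `FakeLogs`, `FakeExecution`, `FakeNotebook`, `FakeInterpreter`, and `SketchCodeInterpreterTool` are all invented for illustration and only mimic the attributes this file reads (`notebook.exec_cell`, `results`, `logs.stdout`, `logs.stderr`, `error`). The stub uses plain `exec`, provides no sandboxing, and is not a description of how E2B runs code.

```python
import contextlib
import io
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class FakeLogs:
    stdout: List[str] = field(default_factory=list)
    stderr: List[str] = field(default_factory=list)


@dataclass
class FakeExecution:
    results: List[Any] = field(default_factory=list)
    logs: FakeLogs = field(default_factory=FakeLogs)
    error: Optional[str] = None


class FakeNotebook:
    """Hypothetical stand-in for the sandbox notebook; NOT sandboxed."""

    def exec_cell(self, code: str) -> FakeExecution:
        execution = FakeExecution()
        buffer = io.StringIO()
        try:
            with contextlib.redirect_stdout(buffer):
                exec(code, {})  # illustration only, never run untrusted code this way
        except Exception as exc:
            execution.error = f"{type(exc).__name__}: {exc}"
        execution.logs.stdout.append(buffer.getvalue())
        return execution


class FakeInterpreter:
    def __init__(self) -> None:
        self.notebook = FakeNotebook()


class SketchCodeInterpreterTool:
    """Mirrors the call / langchain_call split from the diff, minus E2B."""

    def __init__(self, interpreter: FakeInterpreter) -> None:
        self.code_interpreter = interpreter

    def call(self, parameters: Dict[str, Any], **kwargs: Any) -> Dict[str, Any]:
        execution = self.code_interpreter.notebook.exec_cell(parameters.get("code", ""))
        return {
            "results": execution.results,
            "stdout": execution.logs.stdout,
            "stderr": execution.logs.stderr,
            "error": execution.error,
        }

    def langchain_call(self, code: str) -> Dict[str, Any]:
        # Wrap the bare string in the dict shape that call() expects.
        return self.call({"code": code})
```

Both entry points return the same four-key dict, so callers can treat the Langchain path and the direct path identically.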