This app will use GPT-3.5, so we'll also need an OpenAI API key.
Create a `.streamlit` folder and a `secrets.toml` file with the following contents:

```toml
openai_key = "<your OpenAI API key here>"
```
Note: If you're using Git, be sure to add the name of this file to your `.gitignore` so you don't accidentally expose your API key. If you plan to deploy this app on Streamlit Community Cloud, add the same contents to your app's secrets via the Community Cloud modal.
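For example, given the secrets file location created above, the `.gitignore` entry would be:

```
.streamlit/secrets.toml
```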
If you're working on your local machine, install the dependencies using pip:

```bash
pip install streamlit openai llama-index nltk
```
If you're planning to deploy this app on Streamlit Community Cloud, create a `requirements.txt` file with the following contents:

```
streamlit
openai
llama-index
nltk
```
The full app is only ~50 lines of code. Let's break down each section:
Import the required Python libraries for this app: streamlit, llama_index, openai, and nltk.

```python
import streamlit as st
import openai
from llama_index import VectorStoreIndex, ServiceContext, Document, SimpleDirectoryReader
from llama_index.llms import OpenAI
```
- Set our OpenAI API key from the app's secrets.
- Add a heading for our app.
- Use session state to keep track of your chatbot's message history.
- Initialize the value of `st.session_state.messages` to include the chatbot's starting message, such as, "Hey, I'm a chatbot! Ask me a question." (A sketch of these setup steps follows this list.)
- We'll store our Knowledge Base files in a folder called `data` within the app.
- We'll define a function `load_data()` to load and index data using LlamaIndex.
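Before we get to `load_data()`, here's a minimal sketch of the setup steps above (the heading text is a placeholder, and the pre-1.0 `openai` module-level API is assumed):

```python
# Set the OpenAI API key from the app's secrets
openai.api_key = st.secrets.openai_key

# Add a heading for the app (placeholder text)
st.title("Chat with your docs 💬")

# Initialize the message history with the chatbot's starting message
if "messages" not in st.session_state:
    st.session_state.messages = [
        {"role": "assistant", "content": "Hey, I'm a chatbot! Ask me a question."}
    ]
```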
The `load_data()` function is responsible for loading and indexing data stored in a specific directory, typically the `data` directory at the base level of your repository. Here's a step-by-step breakdown of what the function does (a sketch of the full function follows this list):
- Using SimpleDirectoryReader: LlamaIndex's `SimpleDirectoryReader` takes the path to your data directory as an argument. It's designed to automatically choose the right file reader based on the file extensions within the directory; in our case, it will select the reader for `.pdf` files. The reader will then load all the files recursively when you call `reader.load_data()`.
- Creating a ServiceContext instance: Construct an instance of LlamaIndex's `ServiceContext`. This encapsulates a collection of resources used during a RAG pipeline's indexing and querying stages. One of the key features of `ServiceContext` is the ability to adjust settings; for instance, you can specify which LLM and embedding model you want to use.
- Setting up VectorStoreIndex: Use LlamaIndex's `VectorStoreIndex` to create an in-memory `SimpleVectorStore`. This structures your data in a manner optimized for quick context retrieval by your model. If you're interested in diving deeper, you can explore more about LlamaIndex's indices and their inner workings.
- Caching: The function is wrapped in Streamlit's caching decorator `st.cache_resource` to minimize the number of times the data is loaded and indexed.
The function returns the `VectorStoreIndex` object upon completion.
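Putting those steps together, a sketch of `load_data()` might look like this (the model name and temperature are illustrative, and the imports from earlier are assumed):

```python
@st.cache_resource(show_spinner=False)
def load_data():
    # SimpleDirectoryReader picks the right reader per file extension (e.g., .pdf)
    reader = SimpleDirectoryReader(input_dir="./data", recursive=True)
    docs = reader.load_data()
    # ServiceContext bundles the resources used for indexing and querying;
    # here we point it at GPT-3.5 (model choice is illustrative)
    service_context = ServiceContext.from_defaults(
        llm=OpenAI(model="gpt-3.5-turbo", temperature=0.5)
    )
    # Build an in-memory vector store index over the loaded documents
    index = VectorStoreIndex.from_documents(docs, service_context=service_context)
    return index
```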
LlamaIndex offers several chat engine modes. Here we'll use the `condense_question` mode, since it always queries the knowledge base, which is optimal for our use case.
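For example, creating the chat engine from the index returned by `load_data()` could look like this:

```python
index = load_data()
chat_engine = index.as_chat_engine(chat_mode="condense_question", verbose=True)
```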
Use Streamlit's UI elements to gather user input and display the chatbot's message history.
After gathering the user's question, pass it to the chat engine to get a response, then display both the question and the answer in a chat-like interface.
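Here's one way that interaction loop might look, using Streamlit's chat elements (`st.chat_input` and `st.chat_message`):

```python
# Gather the user's question and save it to the message history
if prompt := st.chat_input("Your question"):
    st.session_state.messages.append({"role": "user", "content": prompt})

# Display the prior chat messages
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.write(message["content"])

# If the last message is from the user, generate and display a reply
if st.session_state.messages[-1]["role"] == "user":
    with st.chat_message("assistant"):
        with st.spinner("Thinking..."):
            response = chat_engine.chat(st.session_state.messages[-1]["content"])
            st.write(response.response)
            st.session_state.messages.append(
                {"role": "assistant", "content": response.response}
            )
```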
Run the app with `streamlit run app.py`.
Deploy the app on Streamlit Community Cloud following this guide.