LLaMA Server

LLaMA Server combines the power of LLaMA C++ (via PyLLaMACpp) with the beauty of Chatbot UI.

🦙LLaMA C++ (via 🐍PyLLaMACpp) ➕ 🤖Chatbot UI ➕ 🔗LLaMA Server 🟰 😊

UPDATE: Greatly simplified implementation thanks to the awesome Pythonic APIs of PyLLaMACpp 2.0.0!

UPDATE: Now supports better streaming through PyLLaMACpp!

UPDATE: Now supports streaming!

Demo

Better Streaming

better_stream_demo.mov

Streaming

stream_demo.mov

Non-streaming

demo.mov

Setup

Get your favorite LLaMA models:
- Download them from 🤗Hugging Face;
- Or follow the instructions at LLaMA C++;
- Make sure the models are converted and quantized.

Create a models.yml file that sets your model_home directory and lists your favorite South American camelids, e.g.:

model_home: /path/to/your/models
models:
  llama-7b:
    name: LLAMA-7B
    path: 7B/ggml-model-q4_0.bin  # relative to `model_home` or an absolute path

See models.yml for an example.
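
As a rough illustration of how the `path` field in the example above can be read, here is a minimal Python sketch. It is not LLaMA Server's actual loader: `CONFIG` is a hand-written dict mirroring the YAML (in practice you would parse the file with a YAML library), and `resolve_model_path` is a hypothetical helper name.

```python
from pathlib import Path

# Hypothetical in-memory equivalent of the models.yml example above.
CONFIG = {
    "model_home": "/path/to/your/models",
    "models": {
        "llama-7b": {"name": "LLAMA-7B", "path": "7B/ggml-model-q4_0.bin"},
    },
}


def resolve_model_path(config: dict, model_id: str) -> Path:
    """Return the model file path for `model_id` (hypothetical helper)."""
    models = config["models"]
    if model_id not in models:
        known = ", ".join(sorted(models))
        raise KeyError(f"unknown model id {model_id!r}; defined ids: {known}")
    # pathlib drops the left-hand side when the right-hand side is absolute,
    # so one expression covers both relative and absolute `path` values.
    return Path(config["model_home"]) / models[model_id]["path"]


if __name__ == "__main__":
    print(resolve_model_path(CONFIG, "llama-7b"))
```

The same check is a quick way to confirm that the value you pass to `--model-id` is actually defined in your models.yml and points at an existing file before starting the server.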

Set up a Python environment:

conda create -n llama python=3.9
conda activate llama

Install LLaMA Server:

From PyPI:

python -m pip install llama-server

Or from source:

python -m pip install git+https://github.com/nuance1979/llama-server.git

Start LLaMA Server with your models.yml file:

llama-server --models-yml models.yml --model-id llama-7b

Check out my fork of Chatbot UI and start the app:

git clone https://github.com/nuance1979/chatbot-ui
cd chatbot-ui
git checkout llama
npm i
npm run dev

Open http://localhost:3000 in your browser:
- Click "OpenAI API Key" at the bottom left corner and enter your OpenAI API Key;
- Or follow the instructions at Chatbot UI to put your key into a .env.local file and restart:
```
cp .env.local.example .env.local
<edit .env.local to add your OPENAI_API_KEY>
```
Enjoy!

More

Try a larger model if you have it:

llama-server --models-yml models.yml --model-id llama-13b  # or any `model_id` defined in `models.yml`

Try non-streaming mode by restarting Chatbot UI:

export LLAMA_STREAM_MODE=0  # 1 to enable streaming
npm run dev

Fun facts

I am not fluent in JavaScript at all, but I was able to make the changes in Chatbot UI by chatting with ChatGPT; no more StackOverflow.
