GPTJ-Chat

A simple command-line chat program for GPT-J models, written in C++. Based on ggml and gptj.cpp.

GPTJ-Chat demo

Table of contents

GPT-J model
Installation
Usage
Detailed command list
License

GPT-J model

You need to download a GPT-J model first. Direct download links, and more details on GPT-J models, are available at gpt4all.io or the nomic-ai/gpt4all GitHub repository.

The models are around 3.8 GB each. The chat program loads the model into RAM at runtime, so you need enough free memory to run it.
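
As a minimal sketch, you could fetch a model with curl from one of the sources above (the URL below is just a placeholder; the models/ directory matches the program's default model path):

mkdir -p models
curl -L -o models/ggml-gpt4all-j.bin "https://example.com/path/to/ggml-gpt4all-j.bin"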

Installation

Download

git clone --recurse-submodules https://github.com/kuvaus/gptj-chat
cd gptj-chat

Build

mkdir build
cd build
cmake ..
cmake --build . --parallel
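
Optionally, you can request an optimized binary with the standard CMAKE_BUILD_TYPE option (a generic CMake flag, not specific to this project):

cmake .. -DCMAKE_BUILD_TYPE=Release
cmake --build . --parallel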

Usage

After compiling, the binary is located at:

build/bin/chat

You are free to move it anywhere, though. A simple command to get started, using 4 threads:

./chat -m "/path/to/modelfile/ggml-gpt4all-j.bin" -t 4

Happy chatting!

Detailed command list

You can view the help and full parameter list with: ./chat -h

usage: ./bin/chat [options]

A simple chat program for GPT-J based models.
You can set specific initial prompt with the -p flag.
Runs default in interactive and continuous mode.
Type 'quit', 'exit', or 'Ctrl+C' to quit.

options:
  -h, --help            show this help message and exit
  --run-once            disable continuous mode
  --no-interactive      disable interactive mode altogether (uses given prompt only)
  -s SEED, --seed SEED  RNG seed (default: -1)
  -t N, --threads N     number of threads to use during computation (default: 4)
  -p PROMPT, --prompt PROMPT
                        prompt to start generation with (default: empty)
  --random-prompt       start with a randomized prompt.
  -n N, --n_predict N   number of tokens to predict (default: 200)
  --top_k N             top-k sampling (default: 40)
  --top_p N             top-p sampling (default: 0.9)
  --temp N              temperature (default: 0.9)
  -b N, --batch_size N  batch size for prompt processing (default: 8)
  -r N, --remember N    number of chars to remember from start of previous answer (default: 200)
  -j,   --load_json FNAME
                        load options instead from json at FNAME (default: empty/no)
  -m FNAME, --model FNAME
                        model path (current: models/ggml-gpt4all-j.bin)
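
For example, to generate a single answer from a given prompt and then exit, you could combine the documented flags like this (paths are placeholders):

./chat -m "/path/to/modelfile/ggml-gpt4all-j.bin" -p "Tell me about GPT-J." --no-interactive --run-once -n 100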

You can also fetch parameters from a JSON file with the --load_json "/path/to/file.json" flag. The JSON file has to be in the following format:

{"top_p": 0.9, "top_k": 40, "temp": 0.9, "n_batch": 8}

This is useful when you want to store different temperature and sampling settings.
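
For example, assuming the settings above are saved at /path/to/file.json, you could run:

./chat --load_json "/path/to/file.json" -m "/path/to/modelfile/ggml-gpt4all-j.bin"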

License

This project is licensed under the MIT License.