continuedev/ggml-server-example
Step-by-step: run local models with GGML (~5min + download time for model weights)

Set up the Python environment

  1. Clone this repository: git clone https://github.com/continuedev/ggml-server-example
  2. Move into the folder: cd ggml-server-example
  3. Create a virtual environment: python3 -m venv env
  4. Activate the virtual environment: source env/bin/activate on macOS/Linux, env\Scripts\activate.bat on Windows, or source env/bin/activate.fish if you use the fish shell
  5. Install required packages: pip install -r requirements.txt
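
The steps above as a single shell session (macOS/Linux shown; swap in the Windows activation command from step 4 if needed):

    git clone https://github.com/continuedev/ggml-server-example
    cd ggml-server-example
    python3 -m venv env
    source env/bin/activate
    pip install -r requirements.txt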

Download a model

  1. Download a GGML-format model file into the models/ folder (the serve command below assumes models/wizardLM-7B.ggmlv3.q4_0.bin); an example download command is shown below
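
For example, to fetch quantized WizardLM 7B weights matching the filename used in the serve command below. The Hugging Face repository and URL here are illustrative assumptions, not part of this repo; substitute whichever GGML model you actually want:

    mkdir -p models
    # Illustrative source only -- point this at the GGML model file you want to use.
    wget -P models/ https://huggingface.co/TheBloke/WizardLM-7B-GGML/resolve/main/wizardLM-7B.ggmlv3.q4_0.bin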

Serve the model

  1. Run the server with python3 -m llama_cpp.server --model models/wizardLM-7B.ggmlv3.q4_0.bin (adjust the --model path to whichever file you downloaded)
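
If you need the server to listen on a specific address or port, llama-cpp-python's server also accepts host/port flags (it typically defaults to 127.0.0.1:8000; exact flags can vary between llama-cpp-python versions):

    python3 -m llama_cpp.server --model models/wizardLM-7B.ggmlv3.q4_0.bin \
        --host 127.0.0.1 --port 8000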

Use with Continue

  1. To set this as your default model in Continue, open your Continue config file (~/.continue/config.py) either manually or using the /config slash command in Continue. Then import the GGML class (from continuedev.src.continuedev.libs.llm.ggml import GGML), set the default model to GGML(max_context_length=2048), reload your VS Code window, and you're good to go! A sketch of the edit follows.
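
A minimal sketch of the relevant lines in ~/.continue/config.py. Only the import and the GGML(max_context_length=2048) call come from this README; how the default model is wired into the surrounding config object depends on your Continue version:

    # ~/.continue/config.py (open it with the /config slash command in Continue)
    from continuedev.src.continuedev.libs.llm.ggml import GGML

    # Point Continue's default model at the local GGML server started above.
    # 2048 matches the context length suggested in this README.
    default = GGML(max_context_length=2048)
    # How `default` is attached to the rest of the config varies by Continue
    # version and is not shown here.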

Any questions?

Happy to help. Email us at hi@continue.dev.
