🍎 feat: Apple MLX as Known Endpoint #2580

Merged 3 commits on May 1, 2024
9 changes: 9 additions & 0 deletions api/server/services/Config/loadConfigModels.spec.js
@@ -46,6 +46,15 @@ const exampleConfig = {
fetch: false,
},
},
{
name: 'MLX',
apiKey: 'user_provided',
baseURL: 'http://localhost:8080/v1/',
models: {
default: ['Meta-Llama-3-8B-Instruct-4bit'],
fetch: false,
},
},
],
},
};
Binary file added client/public/assets/mlx.png
1 change: 1 addition & 0 deletions client/src/components/Chat/Menus/Endpoints/UnknownIcon.tsx
@@ -9,6 +9,7 @@ const knownEndpointAssets = {
[KnownEndpoints.fireworks]: '/assets/fireworks.png',
[KnownEndpoints.groq]: '/assets/groq.png',
[KnownEndpoints.mistral]: '/assets/mistral.png',
[KnownEndpoints.mlx]: '/assets/mlx.png',
[KnownEndpoints.ollama]: '/assets/ollama.png',
[KnownEndpoints.openrouter]: '/assets/openrouter.png',
[KnownEndpoints.perplexity]: '/assets/perplexity.png',
34 changes: 34 additions & 0 deletions docs/install/configuration/ai_endpoints.md
@@ -271,6 +271,40 @@ Some of the endpoints are marked as **Known,** which means they might have speci

![image](https://github.com/danny-avila/LibreChat/assets/110412045/ddb4b2f3-608e-4034-9a27-3e94fc512034)

## Apple MLX
> MLX API key: ignored - [MLX OpenAI Compatibility](https://github.com/ml-explore/mlx-examples/blob/main/llms/mlx_lm/SERVER.md)

**Notes:**

- **Known:** icon provided.

- The API is mostly strict and may reject unrecognized parameters.
- The server supports only one model at a time; to offer additional models, run separate server instances, each configured as its own endpoint with a different `baseURL`.

```yaml
- name: "MLX"
apiKey: "mlx"
baseURL: "http://localhost:8080/v1/"
models:
default: [
"Meta-Llama-3-8B-Instruct-4bit"
]
fetch: false # fetching list of models is not supported
titleConvo: true
titleModel: "current_model"
summarize: false
summaryModel: "current_model"
forcePrompt: false
modelDisplayLabel: "Apple MLX"
addParams:
max_tokens: 2000
"stop": [
"<|eot_id|>"
]
```
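With the server running, you can sanity-check the OpenAI-compatible endpoint before pointing LibreChat at it. The sketch below (standard-library Python only; the helper name is illustrative, not part of LibreChat) builds the same request body that the `addParams` above produce and posts it to the `baseURL`:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/v1"  # matches the baseURL above


def build_chat_request(prompt: str) -> dict:
    """Mirror the request body LibreChat sends with the addParams above."""
    return {
        "model": "Meta-Llama-3-8B-Instruct-4bit",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 2000,        # from addParams
        "stop": ["<|eot_id|>"],    # Llama-3 end-of-turn token, from addParams
    }


if __name__ == "__main__":
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request("Say hello.")).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer mlx",  # the key value is ignored by the server
        },
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

If the request succeeds here, the same `baseURL` and parameters should work from LibreChat.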

![image](https://github.com/danny-avila/LibreChat/blob/ae9d88b68c95fdb46787bca1df69407d2dd4e8dc/client/public/assets/mlx.png)

## Ollama
> Ollama API key: Required but ignored - [Ollama OpenAI Compatibility](https://github.com/ollama/ollama/blob/main/docs/openai.md)

1 change: 1 addition & 0 deletions docs/install/configuration/custom_config.md
@@ -1213,6 +1213,7 @@ Each endpoint in the `custom` array should have the following structure:
- "Perplexity"
- "together.ai"
- "Ollama"
- "MLX"

### **models**

30 changes: 30 additions & 0 deletions docs/install/configuration/mlx.md
@@ -0,0 +1,30 @@
---
title:  Apple MLX
description: Using LibreChat with Apple MLX
weight: -6
---
## MLX
Use [MLX](https://ml-explore.github.io/mlx/build/html/index.html) for

* Running large language models locally on Apple Silicon hardware (M1, M2, M3 ARM chips with unified CPU/GPU memory)


### 1. Install MLX on macOS
#### Mac M-series only
MLX supports GPU acceleration via the Apple Metal backend through the `mlx-lm` Python package. Follow the instructions at [Install `mlx-lm` package](https://github.com/ml-explore/mlx-examples/tree/main/llms)


### 2. Load Models with MLX
MLX supports common HuggingFace models directly, but it's recommended to use the converted and tested quantized models (choose one suited to your hardware capability) provided by the [mlx-community](https://huggingface.co/mlx-community).


1. Browse the available models on [HuggingFace](https://huggingface.co/models?search=mlx-community)
2. Copy the model identifier `<author>/<model_id>` from the model page (e.g. `mlx-community/Meta-Llama-3-8B-Instruct-4bit`)
3. Check the model size; models that fit entirely in CPU/GPU unified memory perform best.
4. Follow the instructions to launch the model server: [Run OpenAI Compatible Server Locally](https://github.com/ml-explore/mlx-examples/blob/main/llms/mlx_lm/SERVER.md)

```sh
mlx_lm.server --model <author>/<model_id>
```
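For step 3, a back-of-the-envelope estimate helps: a quantized model needs roughly `bits / 8` bytes per parameter for the weights alone, with KV cache and activations coming on top. A quick sketch (the helper name is illustrative; the 4-bit and ~8B-parameter figures are assumptions read off the model name):

```python
def estimated_weight_memory_gb(params_billions: float, bits_per_param: int = 4) -> float:
    """Rough lower bound on memory: weight storage only, in decimal GB.
    KV cache, activations, and runtime overhead add to this."""
    return params_billions * 1e9 * bits_per_param / 8 / 1e9


# e.g. Meta-Llama-3-8B-Instruct-4bit: ~8B parameters at 4-bit quantization
print(estimated_weight_memory_gb(8))  # ~4.0 GB of weights
```

Around 4 GB of weights leaves comfortable headroom in 16 GB of unified memory; a 70B model at 4-bit (~35 GB) would not fit on such a machine.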

### 3. Configure LibreChat
Use the `librechat.yaml` [configuration file (guide here)](./ai_endpoints.md) to add MLX as a separate endpoint; an example using Llama-3 is provided there.
1 change: 1 addition & 0 deletions docs/install/index.md
@@ -24,6 +24,7 @@ weight: 1
* 🤖 [AI Setup](./configuration/ai_setup.md)
* 🚅 [LiteLLM](./configuration/litellm.md)
* 🦙 [Ollama](./configuration/ollama.md)
* 🍎 [Apple MLX](./configuration/mlx.md)
* 💸 [Free AI APIs](./configuration/free_ai_apis.md)
* 🛂 [Authentication System](./configuration/user_auth_system.md)
* 🍃 [Online MongoDB](./configuration/mongodb.md)
1 change: 1 addition & 0 deletions packages/data-provider/src/config.ts
@@ -299,6 +299,7 @@ export enum KnownEndpoints {
fireworks = 'fireworks',
groq = 'groq',
mistral = 'mistral',
mlx = 'mlx',
ollama = 'ollama',
openrouter = 'openrouter',
perplexity = 'perplexity',