
[Feature Request] Function Calling for mlx_lm.server #784

Open
matmult opened this issue May 17, 2024 · 4 comments
Labels: enhancement (New feature or request)

Comments

matmult commented May 17, 2024

Hello, thanks for the amazing repo. I would like to request support for a function-calling feature in mlx_lm.server, similar to OpenAI's implementation.

Please let me know if this is on the roadmap, or if there are good frameworks that already implement this.

awni added the enhancement (New feature or request) label on May 20, 2024

awni (Member) commented May 20, 2024

It would be pretty cool to add this and perhaps not too difficult. I believe function calling requires a few things:

  • A model which supports the function calling prompt format. Do you know of a good open source model for that?
  • Updating the server API to accept the right query and return the right response
  • Converting the HTTP request input into the correct prompt for the model

Marked as an enhancement. I'll leave it open in case someone is interested in working on it.
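
For reference, a minimal sketch of the request and response shapes the second bullet would involve, following OpenAI's documented chat-completions schema. None of this is existing mlx_lm.server code, and the model name is only illustrative:

```python
# Sketch of the OpenAI-style request body the server would need to accept.
request_body = {
    "model": "mistral-7b-instruct-v0.3",  # illustrative model name
    "messages": [
        {"role": "user", "content": "What is the weather like today in San Francisco?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {"type": "string", "description": "City and state"},
                        "format": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                    },
                    "required": ["location", "format"],
                },
            },
        }
    ],
}

# Sketch of the response the server would return when the model decides to call a tool.
response_body = {
    "choices": [
        {
            "finish_reason": "tool_calls",
            "message": {
                "role": "assistant",
                "content": None,
                "tool_calls": [
                    {
                        "id": "call_0",
                        "type": "function",
                        "function": {
                            "name": "get_current_weather",
                            "arguments": '{"location": "San Francisco, CA", "format": "celsius"}',
                        },
                    }
                ],
            },
        }
    ]
}
```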

openjay commented May 21, 2024

Are we able to integrate with open-source frameworks, for example LangChain, AutoGen, etc.?

katopz (Contributor) commented May 25, 2024

(Quoting @awni's list of requirements above.)

Maybe we can take a look at Ollama's mistral:v0.3.

Prompt

[AVAILABLE_TOOLS] [{"type": "function", "function": {"name": "get_current_weather", "description": "Get the current weather", "parameters": {"type": "object", "properties": {"location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"}, "format": {"type": "string", "enum": ["celsius", "fahrenheit"], "description": "The temperature unit to use. Infer this from the users location."}}, "required": ["location", "format"]}}}][/AVAILABLE_TOOLS][INST] What is the weather like today in San Francisco [/INST]

Response

[TOOL_CALLS] [{"name": "get_current_weather", "arguments": {"location": "San Francisco, CA", "format": "celsius"}}]

Any model based on Mistral 0.3 should work the same way.
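
To connect that format back to the server bullets above, here is a rough sketch of the glue code this would imply, assuming the Mistral-v0.3 control tokens shown in the example. The helper names are hypothetical and not part of mlx_lm:

```python
import json
import re

def tools_to_prompt(tools: list, user_message: str) -> str:
    """Fold OpenAI-style tool definitions and a user message into a
    Mistral-v0.3-style prompt using the control tokens shown above."""
    return (
        f"[AVAILABLE_TOOLS] {json.dumps(tools)}[/AVAILABLE_TOOLS]"
        f"[INST] {user_message} [/INST]"
    )

def parse_tool_calls(completion: str):
    """Extract the JSON emitted after [TOOL_CALLS] and map it to
    OpenAI-style tool_calls entries; return None if the model answered normally."""
    match = re.search(r"\[TOOL_CALLS\]\s*(\[.*\])", completion, re.DOTALL)
    if match is None:
        return None
    calls = json.loads(match.group(1))
    return [
        {
            "id": f"call_{i}",
            "type": "function",
            "function": {"name": c["name"], "arguments": json.dumps(c["arguments"])},
        }
        for i, c in enumerate(calls)
    ]

# Example: parse the completion shown above into an OpenAI-style tool_calls list.
completion = (
    '[TOOL_CALLS] [{"name": "get_current_weather", '
    '"arguments": {"location": "San Francisco, CA", "format": "celsius"}}]'
)
print(parse_tool_calls(completion))
```

Other tool-calling models use different control tokens, so in practice this mapping would presumably live in each model's chat template rather than in hard-coded strings.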

otriscon commented

I wrote a library that constrains LLM output to a JSON schema in a performant way, and implemented a function calling/tools server example for MLX with it. I find that it works quite well even with models that have not been fine-tuned for function calling specifically.

You can check it out here: https://github.com/otriscon/llm-structured-output

If you want to give it a try, I'm happy to answer any questions and open to suggestions for improvement.
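
Without reproducing that library's actual API here, the underlying idea is that a tool definition's parameters block is itself a JSON schema, so the model's generated arguments can be constrained during decoding, or at minimum validated after the fact. A minimal post-hoc validation sketch using the generic jsonschema package (not llm-structured-output):

```python
import json
from jsonschema import validate, ValidationError  # pip install jsonschema

# The "parameters" block of a tool definition is itself a JSON schema.
weather_schema = {
    "type": "object",
    "properties": {
        "location": {"type": "string"},
        "format": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["location", "format"],
}

# Arguments produced by the model (e.g. parsed from a [TOOL_CALLS] block).
model_arguments = json.loads('{"location": "San Francisco, CA", "format": "celsius"}')

try:
    validate(instance=model_arguments, schema=weather_schema)
    print("arguments conform to the tool schema")
except ValidationError as err:
    print(f"model output does not match the schema: {err.message}")
```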
