How to deploy my own model using ggml framework #762

Open
Francis235 opened this issue Mar 12, 2024 · 3 comments

@Francis235

How should I convert my model(e.g. .onnx format) to .gguf format and perform inference under the ggml inference framework? How should I implement it step by step?

@slaren
Collaborator

slaren commented Mar 12, 2024

It would be easier to start from a TensorFlow or PyTorch model than from ONNX. ONNX operations are lower level than most ggml operations.

@Francis235
Author

So how do I convert my PyTorch model to .gguf format and perform inference under the ggml inference framework? Is there any tutorial that can guide me through this step by step? I don't know where to start.

@slaren
Collaborator

slaren commented Mar 13, 2024

There isn't a step-by-step guide. You would have to write a program to convert the weights to a format that ggml can understand (ideally GGUF), and then you would need to look at the Python inference code and convert it to ggml operations. The examples show how to do this, but they are not explained step by step; you would have to fill in the blanks.
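For the weight-conversion step, a minimal sketch (not an official tool from this repo) could use the `gguf` Python package that ships with llama.cpp (`pip install gguf`). The checkpoint path, the "mymodel" architecture name, and the metadata key below are placeholders for illustration:

```python
# Minimal sketch: dump a PyTorch checkpoint into a GGUF file with the
# `gguf` Python package (pip install gguf). "model.pt", "mymodel", and the
# metadata key are placeholders; the ggml-side loader must look tensors up
# by the same names the writer uses here.
import torch
from gguf import GGUFWriter

state_dict = torch.load("model.pt", map_location="cpu")

writer = GGUFWriter("model.gguf", "mymodel")      # output path, architecture name
writer.add_uint32("mymodel.hidden_size", 256)     # example hyperparameter metadata

for name, tensor in state_dict.items():
    # GGUF tensors are written from numpy arrays; store everything as f32 here
    writer.add_tensor(name, tensor.detach().cpu().to(torch.float32).numpy())

writer.write_header_to_file()
writer.write_kv_data_to_file()
writer.write_tensors_to_file()
writer.close()
```

Whatever tensor names and metadata keys the writer emits are what the C/ggml side has to look up, so keep them consistent with the inference code you port to ggml operations. For that second step, the examples in this repository (for instance the gpt-2 example) are the closest thing to a template.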
