Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/fix metal #148

Merged
merged 7 commits into from
Jun 3, 2024
Merged

Feature/fix metal #148

merged 7 commits into from
Jun 3, 2024

Conversation

hlhr202
Copy link
Contributor

@hlhr202 hlhr202 commented May 6, 2024

This is a naive fix for metal inference and log trampoline

sys/build.rs Outdated Show resolved Hide resolved
@hlhr202
Copy link
Contributor Author

hlhr202 commented May 16, 2024

@tazz4843 reformatted code

@tazz4843
Copy link
Owner

See review comments

@hlhr202
Copy link
Contributor Author

hlhr202 commented May 17, 2024

@tazz4843 Hey I have found a new possible way here.

whisper.cpp has now supported WHISPER_METAL_EMBED_LIBRARY build options that enable us to embed metal lib string into the build output. But we need to upgrade the whisper.cpp branch to a newer version.
See the CMakeLists in whisper.cpp here
https://github.com/ggerganov/whisper.cpp/blob/08981d1bacbe494ff1c943af6c577c669a2d9f4d/CMakeLists.txt#L78C12-L78C39
ggerganov/whisper.cpp#2110

maybe we should consider finalize #142 first then I can start implement a new build option here?

@tazz4843
Copy link
Owner

#142 has been merged, took me super long, sorry about that. Embedding the library is a much better idea imo and we should favour that.

@thewh1teagle
Copy link
Contributor

I tried it using WHISPER_METAL_EMBED_LIBRARY=ON. it works, I can see in the logs that metal framework loaded (and without it doesn't)
and it works much faster.
I believe we should enable WHISPER_METAL_EMBED_LIBRARY by default if metal feature enabled.

@hlhr202
Copy link
Contributor Author

hlhr202 commented May 29, 2024

updated. please help check the new build config. also I v updated the metal log callback setup function.
@tazz4843

@tazz4843
Copy link
Owner

I don't have macOS to test on, will wait for a positive test from someone with macOS before merging

@hlhr202
Copy link
Contributor Author

hlhr202 commented May 31, 2024

I don't have macOS to test on, will wait for a positive test from someone with macOS before merging

I have self-tested it since I m using it for a private project. But its okay if one more tester passed it.

@uohzxela
Copy link

uohzxela commented Jun 3, 2024

I have tested @hlhr202's latest changes by adding whisper-rs = { git = "https://github.com/hlhr202/whisper-rs.git", branch = "feature/fix-metal", features = ["metal"] } to my Cargo.toml and it works on my M1 Pro.

whisper_init_with_params_no_state: use gpu    = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw        = 0
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51864
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head  = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 768
whisper_model_load: n_text_head   = 12
whisper_model_load: n_text_layer  = 12
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 3 (small)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: n_langs       = 99
whisper_backend_init: using Metal backend
whisper_model_load:    Metal total size =   487.00 MB
whisper_model_load: model size    =  487.00 MB
whisper_backend_init: using Metal backend
whisper_init_state: kv self size  =   56.62 MB
whisper_init_state: kv cross size =   56.62 MB
whisper_init_state: kv pad  size  =    4.72 MB
whisper_init_state: compute buffer (conv)   =   22.54 MB
whisper_init_state: compute buffer (encode) =  284.81 MB
whisper_init_state: compute buffer (cross)  =    6.31 MB
whisper_init_state: compute buffer (decode) =   97.40 MB

Would be great if we can merge this, and thanks @hlhr202 for your fix!

@tazz4843 tazz4843 merged commit f1030ef into tazz4843:master Jun 3, 2024
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants