
feat: Apple Silicon support for Chipper Model #239

dsanmart opened this issue Oct 2, 2023 · 4 comments

dsanmart commented Oct 2, 2023

As mentioned by @ajjimeno, the encoder is not available on MPS, but the decoder is the bottleneck and can be run through a CUDA or MPS backend for GPU acceleration. The MPS backend is supported by the PyTorch framework (PyTorch backend support docs).

It would just be a matter of checking whether MPS is available, detaching the encoder and decoder when MPS is detected instead of running model.generate, and mapping the computational graph of the decoder onto the mps device (Hugging Face example on the MPS backend).
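A rough sketch of the idea, assuming a Hugging Face-style encoder-decoder; model, pixel_values, and the token-id config names here are placeholders rather than the actual Chipper API:

    import torch

    device = "mps" if torch.backends.mps.is_available() else "cpu"

    # Encoder stays on CPU since it is not available on MPS;
    # only the decoder's graph is mapped onto the mps device.
    encoder_outputs = model.encoder(pixel_values)
    model.decoder.to(device)
    hidden = encoder_outputs.last_hidden_state.to(device)

    # Greedy decoding loop in place of model.generate
    ids = torch.tensor([[model.config.decoder_start_token_id]], device=device)
    for _ in range(128):
        logits = model.decoder(input_ids=ids, encoder_hidden_states=hidden).logits
        next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=-1)
        if next_id.item() == model.config.eos_token_id:
            break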

ajjimeno commented Oct 4, 2023

We are still integrating the new changes for Chipper, but I tried to see what could be done for MPS. After decoupling the encoder and the decoder, it seems additional changes are needed on the decoder side: I get a warning indicating that some tensors need to be mapped from int64 to int32, which makes greedy decoding as slow as plain CPU, and even slower than CPU when a beam search size of 3 is used. This seems to be an issue in PyTorch's MPS integration, even in the latest PyTorch version. One option would be to modify the generator and test int32 where LongTensor is currently used, or to check PyTorch's support for LongTensor under MPS.
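For reference, the fallback is easy to reproduce in isolation (a minimal probe, assuming an Apple Silicon machine with an MPS-enabled PyTorch build):

    import torch

    # Older MPS builds may warn that int64 is unsupported and silently
    # cast to int32 on ops like this reduction.
    x = torch.arange(1024, dtype=torch.int64, device="mps")
    print(x.sum())  # watch stderr for an int64 -> int32 cast warning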

dsanmart commented Oct 4, 2023

Judging by the Apple forums, int64 operations are supported by the GPU accelerator.

Have you tried using the latest PyTorch nightly build? This issue was previously raised in PyTorch and also in other repos. Perhaps your PyTorch version doesn't have the LongTensor ops enabled on MPS? Could you please share the warning message you are getting?
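For example, to confirm the build you are on actually has MPS compiled in and available:

    import torch

    print(torch.__version__)                  # a nightly shows a .dev suffix
    print(torch.backends.mps.is_built())      # compiled with MPS support?
    print(torch.backends.mps.is_available())  # MPS device usable at runtime?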

If this was not the issue, how would you approach the first option you proposed? Would it be possible to convert the input sequence to int32 before passing it to the decoder, and then convert it back to int64 to avoid encountering bugs later? It would look something like this:

    # cast decoder inputs down to int32 for the MPS kernels...
    input_seq = torch.as_tensor(input_seq, dtype=torch.int32)
    output_seq = decoder(input_seq)
    # ...and restore int64 so downstream code still sees LongTensors
    output_seq = output_seq.to(torch.int64)

ajjimeno commented Oct 4, 2023 via email

dsanmart commented

I imagine you are testing this on the new code for #232? What encoder and decoder architectures are you using in chipper-fast-fine-tuning?

I saw that you are using LongTensors in the prediction for the logits_processor. Is converting these to int32 what you tried?
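If not, a minimal version of that swap might look like this (hypothetical names; the point is only the dtype at allocation time):

    import torch

    # Where the generator allocates the ids fed to the logits_processor,
    # try int32 instead of the default LongTensor (int64).
    batch_size = 1
    ids = torch.zeros((batch_size, 1), dtype=torch.int32, device="mps")
    print(ids.dtype)  # torch.int32; compare decode speed against int64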
