Fix a shape annotation and typos in `mamba` slow forward #30691

vasqu · 2024-05-07T09:42:39Z

What does this PR do?

It only addresses typos and a wrong shape annotation in the comments of mamba's slow forward call. There's no change in the logic or anything.

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@ArthurZucker

ArthurZucker

Ok good nit! 🤗 (does jamba also have this typo?)

vasqu · 2024-05-08T16:02:26Z

Yup, same typo. The shape annotation is correct tho.

Another thing I've noticed in Jamba are these lines:

transformers/src/transformers/models/jamba/modeling_jamba.py

Lines 916 to 920 in 5962d62

    
    if self.training:
        # In training mode, we don't want to perform in-place operations on ssm_state so we can compute the backwards pass
        ssm_state = cache_params.ssm_states[self.layer_idx].clone()
    else:
        ssm_state = cache_params.ssm_states[self.layer_idx]

If you remember my issue from the past (#29526), I added something similar to base mamba, but without differentiating between training and eval. It might be worth changing base mamba to match.
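For context, here is a minimal sketch of why the clone matters when gradients are needed. It is a toy recurrence, not the actual mamba or jamba code: `toy_scan`, `backward_succeeds`, and `state_cache` are hypothetical names, with `state_cache` standing in for `cache_params.ssm_states[self.layer_idx]`. The write-back is detached to keep the example small, which is a simplification of my own.

```python
import torch


def toy_scan(a, bu, state_cache, clone_state):
    """Toy recurrence state = a[t] * state + bu[t], then update the cache in place."""
    # state_cache is a hypothetical stand-in for cache_params.ssm_states[layer_idx].
    state = state_cache.clone() if clone_state else state_cache
    outputs = []
    for t in range(a.shape[0]):
        # Autograd saves `state` here: it is needed for the gradient w.r.t. a[t].
        state = a[t] * state + bu[t]
        outputs.append(state)
    # In-place write-back of the final state (detached to keep the sketch simple).
    state_cache.copy_(state.detach())
    return torch.stack(outputs)


def backward_succeeds(clone_state):
    a = torch.rand(3, 4, requires_grad=True)
    bu = torch.rand(3, 4)
    state_cache = torch.ones(4)
    out = toy_scan(a, bu, state_cache, clone_state)
    try:
        out.sum().backward()
    except RuntimeError:
        # Raised when a tensor saved for backward was modified in place.
        return False
    return True
```

Without the clone, the first multiplication saves the cache tensor itself for the backward pass, and the later `copy_` changes it in place, so `backward()` raises a `RuntimeError`. With the clone, the in-place update touches only the cache, not the saved tensor. In eval mode there is usually no backward pass, so skipping the clone there avoids an extra copy. Whether that is measurable would need a benchmark.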

ArthurZucker · 2024-05-20T11:55:54Z

Down to change, but with a bench / something that shows it does produce improvements!

fix typos and one shape comment

a8e12bf

ArthurZucker approved these changes May 8, 2024

fix intermediade typo in jamba

190c2d9

ArthurZucker approved these changes May 20, 2024

ArthurZucker merged commit 76e0530 into huggingface:main May 20, 2024
17 checks passed

vasqu deleted the fix-mamba-comments branch May 20, 2024 12:27
