Issue #37: WIP - M1 NCCL Error - Utilizing Llama2 M1 Bug Fix #44
base: main
Conversation
Hey, thanks for the work so far, @JamesHighsmith. I have downloaded the llama-3 files, cloned the repo, and tried to use the models in a very plain and simple way. Running this in the terminal:

```shell
torchrun --nproc_per_node 1 example_text_completion.py \
    --ckpt_dir Meta-Llama-3-8B/ \
    --tokenizer_path Meta-Llama-3-8B/tokenizer.model \
    --max_seq_len 128 --max_batch_size 4
```

resulted in an error. After cloning your branch and running the same command, this setting resulted in the following:
Description
This is an initial attempt to fix the M1 NCCL error from issue #37 by applying the Llama 2 M1 bug fix. The changes introduced in this PR are not yet fully functional, and further work is required to resolve the issue.
The main changes include:

- the `build` function in `llama/generation.py`, to handle different device types and set the default tensor type accordingly;
- the `apply_rotary_emb` and `repeat_kv` functions in `llama/model.py`, to move tensors to the appropriate device;
- the `forward` method in `llama/model.py`, to handle device-specific operations for the attention mask;
- the `decode` method in `llama/tokenizer.py`, to filter out invalid token IDs (-1) from the decoded output.

Resources:
Please note that this is a work in progress, and the changes introduced in this PR may not completely resolve the issue or introduce new bugs. Further testing and debugging are required.
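The `decode` change described above (dropping invalid `-1` token IDs before detokenizing) amounts to a simple filter; the sketch below uses the hypothetical name `filter_invalid_ids`, not the actual method from `llama/tokenizer.py`.

```python
def filter_invalid_ids(token_ids: list[int]) -> list[int]:
    # Drop the -1 sentinel ids that pad unfinished generations, so the
    # tokenizer's decode() only ever sees valid vocabulary ids.
    return [t for t in token_ids if t != -1]
```

For example, `filter_invalid_ids([128000, -1, 42, -1])` returns `[128000, 42]`.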
Following common practice at Meta/Facebook, this is opened as a draft pull request first to solicit feedback from the community before finalizing the changes. This allows collaborative problem-solving and helps ensure the proposed solution aligns with the project's goals and coding standards.
Attempt to fix: #37