Fixing minor issues in llama2 7b repro (#2926)
Summary:
Pull Request resolved: #2926

Fixing issues we've seen in #2907 and #2805

bypass-github-export-checks
bypass-github-pytorch-ci-checks
bypass-github-executorch-ci-checks

Reviewed By: iseeyuan, cccclai

Differential Revision: D55893925

fbshipit-source-id: c6e0264d868cb487faf02f95ff1bd223cbcc97ac
(cherry picked from commit 6db9d72)
mergennachin committed Apr 9, 2024
1 parent 002ae53 commit fa4d88d
Showing 1 changed file with 8 additions and 1 deletion: examples/models/llama2/README.md

@@ -61,10 +61,17 @@ You can export and run the original Llama2 7B model.

1. Llama2 pretrained parameters can be downloaded from [Meta's official website](https://ai.meta.com/resources/models-and-libraries/llama-downloads/) or from [Hugging Face](https://huggingface.co/meta-llama/Llama-2-7b).

2. Edit `params.json` file. Replace `"vocab_size": -1` with `"vocab_size": 32000`. This is a short-term workaround.
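
   The manual edit in step 2 can also be scripted. The sketch below is not part of the PR; it assumes `params.json` is a flat JSON object with a top-level `"vocab_size"` key, as described in the step above, and the function name is hypothetical:

   ```python
   # Sketch: replace the "vocab_size": -1 placeholder in params.json with 32000.
   # Assumes params.json is a flat JSON object; 32000 is the value the README
   # step specifies for Llama2 7B. Function name is illustrative only.
   import json


   def patch_vocab_size(path: str, vocab_size: int = 32000) -> dict:
       with open(path) as f:
           params = json.load(f)
       # Only patch the -1 placeholder; leave an already-set value alone.
       if params.get("vocab_size") == -1:
           params["vocab_size"] = vocab_size
       with open(path, "w") as f:
           json.dump(params, f, indent=2)
       return params
   ```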

3. Export model and generate `.pte` file:
```
python -m examples.models.llama2.export_llama --checkpoint <checkpoint.pth> --params <params.json> -kv --use_sdpa_with_kv_cache -X -qmode 8da4w --group_size 128 -d fp32
```
4. Create `tokenizer.bin`:

```
python -m examples.models.llama2.tokenizer.tokenizer -t tokenizer.model -o tokenizer.bin
```

### Option B: Download and export stories110M model

