
Enable llama2 benchmarking with Turbine #2050

Open
3 of 7 tasks
kuhar opened this issue Dec 29, 2023 · 1 comment
Assignees
Labels
enhancement, performance

Comments

@kuhar
Member

kuhar commented Dec 29, 2023

This is an extension of the main Turbine refactoring work: #1931. To enable future performance-related work, we should recreate the 1.0 benchmarking mode from vicuna.py:

Enablement

  • Allow llama2 to be run with a single prompt using a CLI script (@raikonenfnu)
  • Port the benchmarking/statistics options from vicuna.py (e.g., setting the prompts, generating exactly K output tokens, running multiple iterations and reporting the averages, etc.)
  • Add a README with benchmarking instructions
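The benchmarking options ported from vicuna.py could look something like the following minimal sketch. This is an illustration only, not the actual Turbine API: `generate` is a hypothetical callable standing in for the llama2 runner, and the statistics reported (average latency, tokens/second) are the kind of figures the issue asks for.

```python
import time


def benchmark(generate, prompt, max_new_tokens=128, iters=5):
    """Run `generate` several times and report averaged decode statistics.

    `generate(prompt, max_new_tokens)` is a hypothetical stand-in for the
    Turbine llama2 runner; it is assumed to return the generated tokens.
    Generating exactly `max_new_tokens` tokens per run keeps iterations
    comparable, as the issue describes.
    """
    times = []
    tokens = []
    for _ in range(iters):
        start = time.perf_counter()
        tokens = generate(prompt, max_new_tokens)
        times.append(time.perf_counter() - start)
    avg = sum(times) / len(times)
    return {
        "avg_seconds": avg,
        "tokens_per_second": len(tokens) / avg,
    }
```

A CLI wrapper (e.g. via `argparse`) would then expose the prompt, token count, and iteration count as flags, matching the single-prompt CLI script mentioned above.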

Correctness

  • Make sure the output is human-readable with 7b/13b/70b on the targets of interest (gfx9, gfx11, and others)

Performance

@kuhar
Member Author

kuhar commented Dec 29, 2023

cc: @antiagainst @harsh-nod

Development

No branches or pull requests

3 participants