mlx-community/Meta-Llama-3-8B-Instruct-4bit #1013

Closed · Answered by awni
aPaleBlueDot asked this question in Q&A

Just ran it on an 8GB M2 mini at 18.5 toks-per-sec.

python -m mlx_lm.generate --model mlx-community/Meta-Llama-3-8B-Instruct-4bit --prompt "Write a story about Einstein" --temp 0.0 --max-tokens 256

Replies: 2 comments

Answer selected by aPaleBlueDot
2 participants