add distributed llama on docker container test #11

Open
wants to merge 2 commits into
base: main

@weedge commented Mar 6, 2024

# 1 worker + inference
make docker-1-worker-inference
# 3 workers + inference like this:
make docker-3-worker-inference WORKERS="172.18.0.2:9997 172.18.0.3:9997 172.18.0.4:9997"

My local test on Docker containers (using the default checkpoint, stories42M.bin):

  1. 1 worker (1 thread) + inference (1 thread)
💡 dim: 512
💡 hiddenDim: 1376
💡 nLayers: 8
💡 nHeads: 8
💡 nKvHeads: 8
💡 vocabSize: 32000
💡 seqLen: 1024
💡 nSlices: 2
⏩ Loaded 232556544 bytes
🔶 G   38 ms I   38 ms T    0 ms S  49477 kB R     61 kB Hello
🔶 G   42 ms I   39 ms T    2 ms S     69 kB R     61 kB  was
🔶 G   44 ms I   42 ms T    1 ms S     69 kB R     61 kB  in
🔶 G   44 ms I   39 ms T    5 ms S     69 kB R     61 kB  the
🔶 G   42 ms I   42 ms T    0 ms S     69 kB R     61 kB  park
🔶 G   47 ms I   45 ms T    2 ms S     69 kB R     61 kB .
🔶 G   44 ms I   41 ms T    2 ms S     69 kB R     61 kB  It
🔶 G   43 ms I   40 ms T    3 ms S     69 kB R     61 kB  was
🔶 G   42 ms I   39 ms T    3 ms S     69 kB R     61 kB  a
🔶 G   40 ms I   39 ms T    1 ms S     69 kB R     61 kB  beautiful
🔶 G   42 ms I   38 ms T    4 ms S     69 kB R     61 kB  day
🔶 G   43 ms I   40 ms T    2 ms S     69 kB R     61 kB ,
🔶 G   43 ms I   39 ms T    3 ms S     69 kB R     61 kB  and
🔶 G   41 ms I   39 ms T    1 ms S     69 kB R     61 kB  the
🔶 G   47 ms I   40 ms T    6 ms S     69 kB R     61 kB  sun
🔶 G   45 ms I   41 ms T    4 ms S     69 kB R     61 kB  was
Generated tokens:    16
Avg generation time: 42.94 ms
Avg inference time:  40.06 ms
Avg transfer time:   2.44 ms
  2. 3 workers (1 thread) + inference (1 thread)
💡 dim: 512
💡 hiddenDim: 1376
💡 nLayers: 8
💡 nHeads: 8
💡 nKvHeads: 8
💡 vocabSize: 32000
💡 seqLen: 1024
💡 nSlices: 4
⏩ Loaded 232556544 bytes
🔶 G   41 ms I   34 ms T    7 ms S  74352 kB R     92 kB Hello
🔶 G   48 ms I   42 ms T    5 ms S    240 kB R     92 kB  was
🔶 G   65 ms I   45 ms T   18 ms S    240 kB R     92 kB  in
🔶 G   45 ms I   34 ms T   10 ms S    240 kB R     92 kB  the
🔶 G   35 ms I   33 ms T    2 ms S    240 kB R     92 kB  park
🔶 G   38 ms I   34 ms T    3 ms S    240 kB R     92 kB .
🔶 G   43 ms I   35 ms T    8 ms S    240 kB R     92 kB  It
🔶 G   47 ms I   38 ms T    8 ms S    240 kB R     92 kB  was
🔶 G   41 ms I   34 ms T    7 ms S    240 kB R     92 kB  a
🔶 G   45 ms I   38 ms T    6 ms S    240 kB R     92 kB  beautiful
🔶 G   37 ms I   35 ms T    2 ms S    240 kB R     92 kB  day
🔶 G   36 ms I   33 ms T    3 ms S    240 kB R     92 kB .
🔶 G   40 ms I   35 ms T    5 ms S    240 kB R     92 kB  There
🔶 G   40 ms I   35 ms T    5 ms S    240 kB R     92 kB  was
🔶 G   36 ms I   33 ms T    2 ms S    240 kB R     92 kB  a
🔶 G   41 ms I   33 ms T    8 ms S    240 kB R     92 kB  bird
Generated tokens:    16
Avg generation time: 42.38 ms
Avg inference time:  35.69 ms
Avg transfer time:   6.19 ms
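
For anyone who wants to re-check the averages or compare runs, here is a minimal Python sketch that parses the per-token lines from the logs above and recomputes the means. It is not part of this PR or of the project. The names `LINE_RE`, `parse_token_lines`, and `averages` are hypothetical. It assumes G, I, and T correspond to the generation, inference, and transfer times reported in the summary, and it treats S and R as plain kB counters without interpreting them.

```python
import re

# Matches the numeric fields of one per-token log line. The leading emoji
# marker and the trailing token text are deliberately ignored.
LINE_RE = re.compile(
    r"G\s+(\d+) ms I\s+(\d+) ms T\s+(\d+) ms S\s+(\d+) kB R\s+(\d+) kB"
)


def parse_token_lines(log_text):
    """Return one dict per token line; lines that do not match are skipped."""
    rows = []
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            g, i, t, s, r = map(int, match.groups())
            rows.append({"G": g, "I": i, "T": t, "S": s, "R": r})
    return rows


def averages(rows):
    """Mean G, I, and T in ms, rounded to two decimals like the summary."""
    if not rows:
        raise ValueError("no token lines found")
    return {
        key: round(sum(row[key] for row in rows) / len(rows), 2)
        for key in ("G", "I", "T")
    }
```

Usage: save each run's output to a file, then call `averages(parse_token_lines(open(path, encoding="utf-8").read()))` for each one and compare the results. In the two runs above, the 3-worker setup lowers average inference time (40.06 ms to 35.69 ms) but raises average transfer time (2.44 ms to 6.19 ms), so average generation time is nearly unchanged (42.94 ms vs 42.38 ms).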

Signed-off-by: weedge <weege007@gmail.com>
@zhengpeirong

Cool! This is valuable for research. However, the project has changed a lot since your last commit. Could you please update this PR to work with the latest version?
