Releases: pjlab-sys4nlp/llama-moe

v1.0.0-publish

25 Dec 03:02
bc0a350

Everything seems to be ready

v0.3.2-cpt-configs_and_scripts

22 Nov 07:24
f569ea8
  • add final data portion of sheared llama
  • in config: add gate_network_type and moe_calculator_score_scale_factor, and update the prob_map arguments
  • add exec scripts

v0.3.1-cpt-dynamic_batch_loading: Llama2 CPT with Dynamic Batch Loading

17 Nov 09:09
d6a3780
  • Llama2 CPT with 4096 context length training.
  • Dynamic batch loading, adapted from the ShearedLlama implementation.
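
For readers unfamiliar with dynamic batch loading, here is a minimal sketch of the domain-weight update as described for Sheared LLaMA: domains whose current loss exceeds a reference loss get a larger share of the next batches. It assumes this release follows that rule. The function name, the three example domains, and all numeric values are hypothetical and not taken from this repository.

```python
import math


def update_domain_weights(prev_weights, current_losses, reference_losses):
    """Return new sampling weights for each data domain.

    Hypothetical helper, not the repository's API. Each weight is scaled by
    exp(excess loss), where excess loss = max(current - reference, 0),
    and the result is renormalized to sum to 1.
    """
    # Domains already at or below their reference loss get no boost.
    excess = [
        max(cur - ref, 0.0)
        for cur, ref in zip(current_losses, reference_losses)
    ]
    # Exponentiated update of the previous weights.
    unnormalized = [w * math.exp(d) for w, d in zip(prev_weights, excess)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Illustrative values only: the first domain lags its reference loss by 0.2.
weights = update_domain_weights(
    prev_weights=[0.5, 0.3, 0.2],
    current_losses=[2.0, 1.5, 1.8],
    reference_losses=[1.8, 1.6, 1.8],
)
```

In this example only the first domain is above its reference loss, so its weight grows and the other two shrink after renormalization.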

v0.2.1-cpt-13b: Fix 13B CPT bugs

07 Oct 05:30
4bff10e
Merge pull request #31 from pjlab-sys4nlp/scaling_13b

CPT: fix TensorBoard logging, fix gradient checkpointing, speed up data loading