Issues: hpcaitech/ColossalAI
Issues list
[BUG]: TypeError: LlamaInferenceForwards.llama_causal_lm_forward() got an unexpected keyword argument 'shard_config'
bug
Something isn't working
#5729
opened May 17, 2024 by
hiprince
1 task done
[BUG]: No module named 'dropout_layer_norm'
bug
Something isn't working
#5726
opened May 17, 2024 by
apachemycat
1 task done
[BUG]: TypeError: _gen_python_code() got an unexpected keyword argument 'verbose'
bug
Something isn't working
#5673
opened Apr 29, 2024 by
Xingzhi107
[BUG]: GROK-1 does not support do_sample
bug
Something isn't working
#5672
opened Apr 28, 2024 by
vsmelov
[PROPOSAL]: Fix potential github action smells
enhancement
New feature or request
#5667
opened Apr 28, 2024 by
ceddy4395
1 task done
[BUG]: ColossalMoE Train: AssertionError: Parameters are expected to have the same dtype `torch.bfloat16`, but got `torch.float32`
bug
Something isn't working
#5664
opened Apr 26, 2024 by
Camille7777
[BUG]: re-join str type error_msgs using `\n\t` in general_checkpoint_io
bug
Something isn't working
#5615
opened Apr 21, 2024 by
ericxsun
[BUG] [Shardformer]: Error in blip2 testing with half precision
bug
Something isn't working
#5600
opened Apr 15, 2024 by
insujang
[BUG]: pretraining llama2 using "gemini" plugin, cannot resume from saved checkpoints
bug
Something isn't working
#5597
opened Apr 15, 2024 by
jiejie1993
[BUG]: Running ColossalAI in H800 with torch 2.0
bug
Something isn't working
#5594
opened Apr 13, 2024 by
wxthu
[DOC]: What is the dataset used to train the Colossal-Llama-2?
documentation
Improvements or additions to documentation
#5587
opened Apr 11, 2024 by
ello0211
[BUG]: OOM when saving 70B model
bug
Something isn't working
#5585
opened Apr 11, 2024 by
jiejie1993
[FEATURE]: Support qwen2 model
enhancement
New feature or request
#5573
opened Apr 9, 2024 by
wangbluo
[BUG]: AttributeError: type object 'ColoParameter' has no attribute 'from_torch_tensor' when running hybrid_parallel example
bug
Something isn't working
#5571
opened Apr 8, 2024 by
ztorchan
[BUG]: ValueError: mutable default <class 'colossalai.legacy.tensor.distspec._DistSpec'> for field dist_attr is not allowed: use default_factory
bug
Something isn't working
#5564
opened Apr 7, 2024 by
fangbrodie
TypeError: LlamaRotaryEmbedding.forward() got an unexpected keyword argument 'seq_len'
bug
Something isn't working
#5555
opened Apr 4, 2024 by
alphanlp
[BUG]: OOM during llama2 pretraining with flashattention and PP
bug
Something isn't working
#5549
opened Apr 3, 2024 by
insujang
[FEATURE]: pretrain data example
enhancement
New feature or request
#5542
opened Apr 1, 2024 by
alphanlp
[BUG]: HybridParallelOptimizer holds unsharded model parameters after sharding
bug
Something isn't working
#5539
opened Mar 31, 2024 by
insujang
[BUG]: Cannot build extensions when no gpu device exists
bug
Something isn't working
#5534
opened Mar 28, 2024 by
ccoulombe
[BUG]: Coati Lora incompatible with Gemini & HybridParallel(pp=1), but runs well with HybridParallel(tp>=2)
bug
Something isn't working
#5507
opened Mar 26, 2024 by
Fallqs
[FEATURE]: Upgrade the transformers version from 4.33.0 to 4.36.0 for Shardformer.
enhancement
New feature or request
#5505
opened Mar 25, 2024 by
wangbluo
[FEATURE]: support dit in Shardformer
enhancement
New feature or request
#5494
opened Mar 23, 2024 by
likelyzhao
[BUG]: Size mismatch is ignored when loading checkpoint
bug
Something isn't working
#5492
opened Mar 22, 2024 by
KimbingNg