Issues: dvlab-research/LongLoRA
Issues list
When I set per_device_train_batch_size=2, the S2-Attn would not shift as expected
#182 opened Mar 1, 2024 by linhaojia13
merge_lora_weights_and_save_hf_model.py Error while deserializing header: HeaderTooLarge
#172 opened Jan 23, 2024 by Spongeorge
For the evaluation results in the paper, was the attention used at inference shifted sparse attention or full attention?
#170 opened Jan 19, 2024 by zhangxiann
The loss value is too unstable when supervised fine-tuning the 7b-100k-ft model
#168 opened Jan 18, 2024 by seanxuu
bug report : RuntimeError: probability tensor contains either inf, nan or element < 0
#165 opened Jan 18, 2024 by seanxuu
Configs in inference.py necessary for context length expansion in model serving?
#157 opened Dec 13, 2023 by spring1915