(rwkv5_py310) root@autodl-container-f97d11abac-813971fc:~/autodl-tmp/RWKV-LM-main/RWKV-v5# ./demo-training-run.sh
INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb972vb43
INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb972vb43/_remote_module_non_scriptable.py
INFO:pytorch_lightning.utilities.rank_zero:########## work in progress ##########
/root/miniconda3/envs/rwkv5_py310/lib/python3.10/site-packages/pydantic/_internal/_config.py:321: UserWarning: Valid config keys have changed in V2:
'allow_population_by_field_name' has been renamed to 'populate_by_name'
'validate_all' has been renamed to 'validate_default'
warnings.warn(message, UserWarning)
/root/miniconda3/envs/rwkv5_py310/lib/python3.10/site-packages/pydantic/_internal/fields.py:149: UserWarning: Field "model_persistence_threshold" has conflict with protected namespace "model".
You may be able to resolve this warning by setting model_config['protected_namespaces'] = ().
warnings.warn(
/root/miniconda3/envs/rwkv5_py310/lib/python3.10/site-packages/pydantic/_internal/_config.py:321: UserWarning: Valid config keys have changed in V2:
'validate_all' has been renamed to 'validate_default'
warnings.warn(message, UserWarning)
Files in model/0.1-1: ['.ipynb_checkpoints']
Traceback (most recent call last):
  File "/root/autodl-tmp/RWKV-LM-main/RWKV-v5/train.py", line 165, in <module>
    max_p = list_p[-1]
IndexError: list index out of range
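The IndexError almost certainly means train.py found no checkpoint files to resume from: `model/0.1-1` contains only `.ipynb_checkpoints`, so the list of saved epoch numbers is empty and `list_p[-1]` fails. A minimal sketch of that failure mode (the filename pattern `rwkv-<N>.pth` and the helper below are assumptions for illustration, not the actual train.py code):

```python
import re

def latest_epoch(filenames):
    """Return the highest epoch number among rwkv-<N>.pth checkpoints.

    Hypothetical reconstruction of the checkpoint scan; the real
    train.py may parse filenames differently.
    """
    list_p = sorted(
        int(m.group(1))
        for f in filenames
        if (m := re.fullmatch(r"rwkv-(\d+)\.pth", f))
    )
    if not list_p:
        # The situation in the log above: the proj_dir holds only
        # '.ipynb_checkpoints', so list_p is empty and indexing
        # list_p[-1] raises IndexError.
        raise IndexError("list index out of range")
    return list_p[-1]
```

The likely fix is to make sure the proj_dir (`BASE_NAME`) contains an initial checkpoint before starting the run, e.g. by running the repo's prepare script (or copying the base model into the directory) so at least one `rwkv-*.pth` file exists.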
#!/bin/bash
BASE_NAME="model/0.1-1"
N_LAYER="32"
N_EMBD="2560"
M_BSZ="16" # takes 16G VRAM (reduce this to save VRAM)
LR_INIT="1e-5"
LR_FINAL="1e-5"
GRAD_CP=0 # set to 1 to save VRAM (will be slower)
EPOCH_SAVE=10
magic_prime = the largest 3n+2 prime smaller than datalen/ctxlen-1 (= 1498226207/512-1 = 2926222.06 in this case)
use https://www.dcode.fr/prime-numbers-search
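Instead of searching by hand, the magic_prime rule above (the largest prime p with p % 3 == 2 that is smaller than datalen/ctxlen − 1) can be computed directly; a minimal sketch, assuming plain trial-division primality is fast enough at this scale:

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test; fine for n up to a few million."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True

def magic_prime(datalen: int, ctxlen: int) -> int:
    """Largest prime p with p % 3 == 2 and p < datalen/ctxlen - 1."""
    p = int(datalen / ctxlen - 1)
    while p > 2:
        if p % 3 == 2 and is_prime(p):
            return p
        p -= 1
    raise ValueError("no 3n+2 prime found")
```

For the values in the script below (--my_exit_tokens 20021619, --ctx_len 4096) this yields 4877, matching --magic_prime 4877.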
python train.py --load_model "/root/autodl-tmp/RWKV-LM-main/RWKV-v5/rwkv-5-World-3B-v2-20231113-ctx4096.pth" --wandb "RWKV-5-Test" --proj_dir $BASE_NAME \
 --ctx_len 4096 --my_pile_stage 3 --epoch_count 999999 --epoch_begin 0 \
 --data_file "text" --my_exit_tokens 20021619 --magic_prime 4877 \
 --num_nodes 1 --micro_bsz $M_BSZ --n_layer $N_LAYER --n_embd $N_EMBD --pre_ffn 0 --head_qk 0 \
 --lr_init $LR_INIT --lr_final $LR_FINAL --warmup_steps 10 --beta1 0.9 --beta2 0.99 --adam_eps 1e-8 --my_pile_edecay 0 --data_type "binidx" --vocab_size 65536 \
 --weight_decay 0.001 --epoch_save $EPOCH_SAVE --head_size_a 64 \
 --accelerator gpu --devices 1 --precision bf16 --strategy deepspeed_stage_2 --grad_cp $GRAD_CP --enable_progress_bar True --ds_bucket_mb 200
The environment was installed following this:
pip install torch==1.13.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117
pip install pytorch-lightning==1.9.5 deepspeed==0.7.0 wandb ninja