Problem encountered when using iTransformer #1000

Open
JKYtydt opened this issue May 10, 2024 · 5 comments
@JKYtydt

JKYtydt commented May 10, 2024

What happened + What you expected to happen

Traceback (most recent call last):
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 541, in _run_script
exec(code, module.__dict__)
File "/sdc/jky/llm_demo/pages/1_Training.py", line 90, in <module>
Y_hat_insample = nf.predict_insample(step_size=pred_len)
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/neuralforecast/core.py", line 1175, in predict_insample
fcsts[:, col_idx : (col_idx + output_length)] = model_fcsts
ValueError: could not broadcast input array from shape (1404,1) into shape (1440,1)
The error occurs during in-sample prediction, but I could not find any problem in the code I wrote.

Versions / Dependencies

python=3.8
neuralforecast=1.7.2

Reproduction script

nf = NeuralForecast(
    models=[model],
    freq=freq
)

nf.fit(df=train_df, val_size=pred_len)

if not os.path.exists(save_path):
    os.makedirs(save_path)

nf.save(path=save_path,
        model_index=None,
        overwrite=True,
        save_dataset=True)

Y_hat_insample = nf.predict_insample(step_size=pred_len)
print('Y_hat_insample', Y_hat_insample.head(), Y_hat_insample.shape)
Y_hat_insample = pd.DataFrame(Y_hat_insample)

Issue Severity

None

@JKYtydt JKYtydt added the bug label May 10, 2024
@elephaint
Contributor

Hi, thanks for using neuralforecast.

Can you include a fully reproducible example of code (a piece of code that I can copy-paste and run standalone on my machine), demonstrating the error? Otherwise it's very difficult for me to help.

@JKYtydt
Author

JKYtydt commented May 13, 2024

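Hello, thank you for using neuralforecast.

Could you provide a fully reproducible code example (a piece of code that I can copy-paste and run standalone on my machine) demonstrating the error? Otherwise it is very difficult for me to help.

Hello, here is the complete code. Thank you for your help.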
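import pandas as pd
from neuralforecast import NeuralForecast
from neuralforecast.models import iTransformer
from neuralforecast.losses.pytorch import MSE
# Assumption: mae and mse come from utilsforecast.losses (the import was not shown)
from utilsforecast.losses import mae, mse

pred_len = 12  # not defined in the original snippet; assumed equal to h and val_size
save_path = 'checkpoints/'  # placeholder; the original path was not shown

train_df = pd.read_csv('/sdc/jky/llm_demo/coin_BinanceCoin.csv', encoding='utf-8')
print('etth1', train_df.head())

model = iTransformer(
    h=12,
    input_size=36,
    n_series=8,
    hidden_size=128,
    n_heads=8,
    e_layers=2,
    d_layers=1,
    d_ff=4,
    factor=1,
    dropout=0.1,
    use_norm=True,
    loss=MSE(),
    valid_loss=MSE(),
    early_stop_patience_steps=3,
    batch_size=24,
    max_steps=5,
)

nf = NeuralForecast(
    models=[model],
    freq='D'
)

nf.fit(df=train_df, val_size=12)

nf.save(path=save_path,
        model_index=None,
        overwrite=True,
        save_dataset=True)
print('Model saved')

# This is the call that raises the ValueError reported above
Y_hat_insample = nf.predict_insample(step_size=pred_len)
print('Y_hat_insample', Y_hat_insample.head(), Y_hat_insample.shape)
Y_hat_insample = pd.DataFrame(Y_hat_insample)

Y_hat_insample['unique_id'] = Y_hat_insample.index
# Results are stored under new names so the imported mae/mse functions are not shadowed
mae_df = mae(Y_hat_insample, models=['iTransformer'], id_col='unique_id')
mse_df = mse(Y_hat_insample, models=['iTransformer'], id_col='unique_id')
validation_df = pd.DataFrame(data={'MAE': mae_df['iTransformer'], 'MSE': mse_df['iTransformer']})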
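
The two shapes in the ValueError differ by 1440 - 1404 = 36 rows, which happens to equal input_size=36 in the script above. This thread does not establish whether that is the cause. Before digging into predict_insample, it may be worth ruling out a data problem: the sketch below counts rows per series and compares the number of distinct ids with n_series. It assumes the CSV uses the long format (unique_id, ds, y) that NeuralForecast expects. The helper name summarize_series is hypothetical and the data is a toy example, not the reporter's file.

```python
from collections import Counter


def summarize_series(unique_ids, n_series):
    """Hypothetical helper: count rows per series and compare with n_series."""
    counts = Counter(unique_ids)
    return {
        "n_unique_ids": len(counts),
        # True when the data holds exactly as many series as the model was told
        "matches_n_series": len(counts) == n_series,
        # True when every series has the same number of rows
        "equal_lengths": len(set(counts.values())) <= 1,
        "rows_per_series": dict(counts),
    }


# Toy ids, not the reporter's CSV: series "b" is one row shorter than "a"
ids = ["a"] * 5 + ["b"] * 4
report = summarize_series(ids, n_series=2)
print(report)
```

With the reporter's frame, the equivalent call would presumably be summarize_series(train_df['unique_id'], n_series=8).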

@JKYtydt
Author

JKYtydt commented May 13, 2024

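Sorry, I have run into another problem. When predicting with iTransformer, I found that the torch version cannot be lower than 2.1.0; with a lower version I get the following error: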
Traceback (most recent call last):
File "/data/anaconda3/envs/time_py/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 600, in _run_script
exec(code, module.__dict__)
File "/sdc/jky/llm_demo/pages/2_Forecast.py", line 40, in <module>
nf2 = NeuralForecast.load(path=load_path)
File "/data/anaconda3/envs/time_py/lib/python3.9/site-packages/neuralforecast/core.py", line 1357, in load
loaded_model = MODEL_FILENAME_DICT[model_class_name].load(
File "/data/anaconda3/envs/time_py/lib/python3.9/site-packages/neuralforecast/common/_base_model.py", line 351, in load
model.load_state_dict(content["state_dict"], strict=True, assign=True)
TypeError: load_state_dict() got an unexpected keyword argument 'assign'
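However, if I update to torch=2.1.0, I can no longer train iTransformer. In other words, training requires a torch version no higher than 2.0.1; with a higher version I get the following error: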
Traceback (most recent call last):
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 600, in _run_script
exec(code, module.__dict__)
File "/sdc/jky/llm_demo/pages/1_Training.py", line 80, in <module>
nf.fit(df=train_df, val_size=pred_len)
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/neuralforecast/core.py", line 462, in fit
self.models[i] = model.fit(
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/neuralforecast/common/_base_multivariate.py", line 537, in fit
return self._fit(
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/neuralforecast/common/_base_model.py", line 219, in _fit
trainer.fit(model, datamodule=datamodule)
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 544, in fit
call._call_and_handle_interrupt(
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 44, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 580, in _fit_impl
self._run(model, ckpt_path=ckpt_path)
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 987, in _run
results = self._run_stage()
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1033, in _run_stage
self.fit_loop.run()
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/pytorch_lightning/loops/fit_loop.py", line 205, in run
self.advance()
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/pytorch_lightning/loops/fit_loop.py", line 363, in advance
self.epoch_loop.run(self._data_fetcher)
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 140, in run
self.advance(data_fetcher)
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 250, in advance
batch_output = self.automatic_optimization.run(trainer.optimizers[0], batch_idx, kwargs)
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 190, in run
self._optimizer_step(batch_idx, closure)
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 268, in _optimizer_step
call._call_lightning_module_hook(
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 157, in _call_lightning_module_hook
output = fn(*args, **kwargs)
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/pytorch_lightning/core/module.py", line 1303, in optimizer_step
optimizer.step(closure=optimizer_closure)
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/pytorch_lightning/core/optimizer.py", line 152, in step
step_output = self._strategy.optimizer_step(self._optimizer, closure, **kwargs)
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 239, in optimizer_step
return self.precision_plugin.optimizer_step(optimizer, model=model, closure=closure, **kwargs)
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/pytorch_lightning/plugins/precision/precision.py", line 122, in optimizer_step
return optimizer.step(closure=closure, **kwargs)
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/torch/optim/lr_scheduler.py", line 68, in wrapper
return wrapped(*args, **kwargs)
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/torch/optim/optimizer.py", line 373, in wrapper
out = func(*args, **kwargs)
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/torch/optim/optimizer.py", line 76, in _use_grad
ret = func(self, *args, **kwargs)
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/torch/optim/adam.py", line 143, in step
loss = closure()
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/pytorch_lightning/plugins/precision/precision.py", line 108, in _wrap_closure
closure_result = closure()
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 144, in __call__
self._result = self.closure(*args, **kwargs)
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 138, in closure
self._backward_fn(step_output.closure_loss)
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 239, in backward_fn
call._call_strategy_hook(self.trainer, "backward", loss, optimizer)
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 309, in _call_strategy_hook
output = fn(*args, **kwargs)
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 213, in backward
self.precision_plugin.backward(closure_loss, self.lightning_module, optimizer, *args, **kwargs)
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/pytorch_lightning/plugins/precision/precision.py", line 72, in backward
model.backward(tensor, *args, **kwargs)
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/pytorch_lightning/core/module.py", line 1090, in backward
loss.backward(*args, **kwargs)
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/torch/_tensor.py", line 492, in backward
torch.autograd.backward(
File "/data/anaconda3/envs/time_py/lib/python3.8/site-packages/torch/autograd/__init__.py", line 251, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: GET was unable to find an engine to execute this computation
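
The first TypeError above is consistent with the assign keyword of torch.nn.Module.load_state_dict having been introduced in PyTorch 2.1: an older torch simply does not know the argument. A minimal sketch for checking this without triggering the error is to inspect the function signature. The helper accepts_keyword and the two stand-in functions below are hypothetical; they only imitate the old and new signatures so the example runs without torch installed.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if func can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    param = params.get(name)
    if param is not None and param.kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    ):
        return True
    # A **kwargs catch-all also accepts arbitrary keyword names
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Stand-ins that imitate the signature before and after `assign` was added
def load_old(state_dict, strict=True):
    return None


def load_new(state_dict, strict=True, assign=False):
    return None


print(accepts_keyword(load_old, "assign"))
print(accepts_keyword(load_new, "assign"))
```

In the affected environment, the corresponding check would be accepts_keyword(torch.nn.Module.load_state_dict, "assign").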

@elephaint
Contributor

elephaint commented May 22, 2024

Can you upgrade your PyTorch version to the latest version? I think you might have PyTorch 2.0.0, and there seems to be an issue with our code on that version, so upgrading to 2.1+ should fix the issue.
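
The tracebacks in this thread show both python3.8 and python3.9 paths under the same environment name (time_py), so it may help to confirm which interpreter and package versions each Streamlit page actually runs with. The following is a minimal sketch using only the standard library. The helper name installed_versions is hypothetical, and the package list is an assumption about which distributions are relevant here.

```python
import sys
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(["torch", "pytorch-lightning", "neuralforecast"])
# The interpreter path shows which environment is really in use
print("python", sys.version.split()[0], sys.executable)
for name, version in report.items():
    print(name, version or "not installed")
```

Running this at the top of both the training page and the forecast page would show whether they use the same torch build.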

@JKYtydt
Author

JKYtydt commented May 23, 2024

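Can you upgrade your PyTorch version to the latest version? I think you might be using PyTorch 2.0.0, and there seems to be an issue with the code in that version. So upgrading to 2.1+ should fix the problem.

1. Hello, I have already tried everything you suggested. If I update to 2.1 or later, training fails; I pasted the error message in my comment above.
2. Regarding the first problem, I still cannot run in-sample prediction.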
