[BUG] AttributeError: 'list' object has no attribute 'output_node' #768

Open
mrcmoresi opened this issue Jan 29, 2024 · 3 comments
Labels
bug Something isn't working status/needs-triage

mrcmoresi commented Jan 29, 2024

Bug description

I'm trying to run the example notebook with synthetic data, getting-started-session-based, and I'm getting the following error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[21], line 57
     53 dataset = nvt.Dataset(df)
     55 # Generate statistics for the features and export parquet files
     56 # this step will generate the schema file
---> 57 workflow.fit_transform(dataset).to_parquet(os.path.join(INPUT_DATA_DIR, "processed_nvt"))

File /anaconda/envs/t4rec/lib/python3.10/site-packages/nvtabular/workflow/workflow.py:264, in Workflow.fit_transform(self, dataset)
    244 def fit_transform(self, dataset: Dataset) -> Dataset:
    245     """Convenience method to both fit the workflow and transform the dataset in a single
    246     call. Equivalent to calling ``workflow.fit(dataset)`` followed by
    247     ``workflow.transform(dataset)``
   (...)
    262     transform
    263     """
--> 264     self.fit(dataset)
    265     return self.transform(dataset)

File /anaconda/envs/t4rec/lib/python3.10/site-packages/nvtabular/workflow/workflow.py:228, in Workflow.fit(self, dataset)
    224 if not current_phase:
    225     # this shouldn't happen, but lets not infinite loop just in case
    226     raise RuntimeError("failed to find dependency-free StatOperator to fit")
--> 228 self.executor.fit(ddf, current_phase)
    230 # Remove all the operators we processed in this phase, and remove
    231 # from the dependencies of other ops too
    232 for node in current_phase:

File /anaconda/envs/t4rec/lib/python3.10/site-packages/merlin/dag/executors.py:439, in DaskExecutor.fit(self, dataset, graph, refit)
    437 def fit(self, dataset: Dataset, graph: Graph, refit=True):
    438     if refit:
--> 439         clear_stats(graph)
    441     if not graph.output_schema:
    442         graph.construct_schema(dataset.schema)

File /anaconda/envs/t4rec/lib/python3.10/site-packages/merlin/dag/executors.py:562, in clear_stats(graph)
    555 def clear_stats(graph):
    556     """Removes calculated statistics from each node in the workflow graph
    557 
    558     See Also
    559     --------
    560     nvtabular.ops.stat_operator.StatOperator.clear
    561     """
--> 562     for stat in Graph.get_nodes_by_op_type([graph.output_node], StatOperator):
    563         stat.op.clear()

AttributeError: 'list' object has no attribute 'output_node'

Steps/Code to reproduce bug

  1. I created the environment with conda using this
  2. Cloned the repo and just ran it

Any idea what could be wrong?

Environment details

  • Transformers4Rec version: 23.4.0+3.g911355f4
  • Platform: x86_64 GNU/Linux
  • Python version: 3.10.13
  • Huggingface Transformers version:
  • PyTorch version (GPU?): 2.1.2
  • Tensorflow version (GPU?): not installed

Additional context
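
Reading the traceback above: `Workflow.fit` in the installed NVTabular passes `current_phase` (which it then iterates node by node, so presumably a list of nodes) to `self.executor.fit`, while the installed `DaskExecutor.fit` names its second parameter `graph` and forwards it to `clear_stats`, which reads `graph.output_node`. That pattern is consistent with the two packages coming from different releases, although the traceback alone does not prove it. A minimal sketch of the failure mode, using hypothetical stand-ins (`FakeGraph`, `fake_clear_stats`) rather than the real Merlin code:

```python
class FakeGraph:
    """Hypothetical stand-in for a graph object exposing ``output_node``."""

    def __init__(self, output_node):
        self.output_node = output_node


def fake_clear_stats(graph):
    """Mimics only the attribute access seen in the traceback (not the real function)."""
    return [graph.output_node]


# A callee written for a graph object works when it is given one.
assert fake_clear_stats(FakeGraph("node_a")) == ["node_a"]

# A caller that still passes a plain list of nodes fails on the attribute access.
try:
    fake_clear_stats(["node_a", "node_b"])
except AttributeError as exc:
    message = str(exc)
    print(message)
```

The stand-ins only illustrate why a list reaching `clear_stats` produces this kind of `AttributeError`; they say nothing about which release introduced the change.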

mrcmoresi added the bug and status/needs-triage labels Jan 29, 2024
Contributor

rnyak commented Jan 29, 2024

@mrcmoresi can you please use the 23.08 version of the repos? You can use our Docker image, which is recommended: nvcr.io/nvidia/merlin/merlin-pytorch:23.08

You can then rerun the example. If you still have any issues, you can share them here.
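
Before rerunning, it may help to confirm which versions are actually active in the notebook kernel. The sketch below uses only the standard library; the distribution names in `PACKAGES` are assumptions and may need adjusting for a given environment.

```python
from importlib import metadata

# Distribution names are assumptions; adjust them to match your environment.
PACKAGES = ["transformers4rec", "nvtabular", "merlin-core", "transformers", "torch"]


def installed_version(name):
    """Return the installed version string for a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def report(packages=PACKAGES):
    """Map each distribution name to its version string, or 'not installed'."""
    return {name: installed_version(name) or "not installed" for name in packages}


if __name__ == "__main__":
    for name, version in report().items():
        print(f"{name}: {version}")
```

Running this in the same kernel as the notebook shows whether the Merlin packages come from the same release.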

Author

mrcmoresi commented

Hi @rnyak, thanks for your answer. I installed Transformers4Rec and NVTabular 23.08, and now I'm getting a different error when I try to run the following cell:

start_time_window_index = start_window_index
final_time_window_index = final_window_index
#Iterating over days of one week
for time_index in range(start_time_window_index, final_time_window_index):
    # Set data 
    time_index_train = time_index
    time_index_eval = time_index + 1
    train_paths = glob.glob(os.path.join(OUTPUT_DIR, f"{time_index_train}/train.parquet"))
    eval_paths = glob.glob(os.path.join(OUTPUT_DIR, f"{time_index_eval}/valid.parquet"))
    print(train_paths)
    
    # Train on day related to time_index 
    print('*'*20)
    print("Launch training for day %s are:" %time_index)
    print('*'*20 + '\n')
    trainer.train_dataset_or_path = train_paths
    trainer.reset_lr_scheduler()
    trainer.train()
    trainer.state.global_step +=1
    print('finished')
    
    # Evaluate on the following day
    trainer.eval_dataset_or_path = eval_paths
    train_metrics = trainer.evaluate(metric_key_prefix='eval')
    print('*'*20)
    print("Eval results for day %s are:\t" %time_index_eval)
    print('\n' + '*'*20 + '\n')
    for key in sorted(train_metrics.keys()):
        print(" %s = %s" % (key, str(train_metrics[key]))) 
    wipe_memory()

I'm getting the following error

--------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
File /anaconda/envs/t4rec/lib/python3.10/site-packages/transformers4rec/torch/trainer.py:398, in Trainer._use_cuda_amp(self)
    397 try:
--> 398     return self.use_cuda_amp
    399 except AttributeError:

AttributeError: 'Trainer' object has no attribute 'use_cuda_amp'

During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)
Cell In[14], line 24
     22 # Evaluate on the following day
     23 trainer.eval_dataset_or_path = eval_paths
---> 24 train_metrics = trainer.evaluate(metric_key_prefix='eval')
     25 print('*'*20)
     26 print("Eval results for day %s are:\t" %time_index_eval)

File /anaconda/envs/t4rec/lib/python3.10/site-packages/transformers/trainer.py:3085, in Trainer.evaluate(self, eval_dataset, ignore_keys, metric_key_prefix)
   3082 start_time = time.time()
   3084 eval_loop = self.prediction_loop if self.args.use_legacy_prediction_loop else self.evaluation_loop
-> 3085 output = eval_loop(
   3086     eval_dataloader,
   3087     description="Evaluation",
   3088     # No point gathering the predictions if there are no metrics, otherwise we defer to
   3089     # self.args.prediction_loss_only
   3090     prediction_loss_only=True if self.compute_metrics is None else None,
   3091     ignore_keys=ignore_keys,
   3092     metric_key_prefix=metric_key_prefix,
   3093 )
   3095 total_batch_size = self.args.eval_batch_size * self.args.world_size
   3096 if f"{metric_key_prefix}_jit_compilation_time" in output.metrics:

File /anaconda/envs/t4rec/lib/python3.10/site-packages/transformers4rec/torch/trainer.py:502, in Trainer.evaluation_loop(self, dataloader, description, prediction_loss_only, ignore_keys, metric_key_prefix)
    495 if (
    496     metric_key_prefix == "train"
    497     and self.args.eval_steps_on_train_set > 0
    498     and step + 1 > self.args.eval_steps_on_train_set
    499 ):
    500     break
--> 502 loss, preds, labels, outputs = self.prediction_step(
    503     model,
    504     inputs,
    505     prediction_loss_only,
    506     ignore_keys=ignore_keys,
    507     testing=testing,
    508 )
    510 # Updates metrics
    511 # TODO: compute metrics each N eval_steps to speedup evaluation
    512 metrics_results_detailed = None

File /anaconda/envs/t4rec/lib/python3.10/site-packages/transformers4rec/torch/trainer.py:363, in Trainer.prediction_step(self, model, inputs, prediction_loss_only, ignore_keys, training, testing)
    361 inputs, targets = inputs
    362 with torch.no_grad():
--> 363     if self._use_cuda_amp:
    364         with autocast():
    365             outputs = model(inputs, targets=targets, training=training, testing=testing)

File /anaconda/envs/t4rec/lib/python3.10/site-packages/transformers4rec/torch/trainer.py:400, in Trainer._use_cuda_amp(self)
    398     return self.use_cuda_amp
    399 except AttributeError:
--> 400     return self.use_amp

AttributeError: 'Trainer' object has no attribute 'use_amp'
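
The chained traceback shows `Trainer._use_cuda_amp` trying `self.use_cuda_amp`, falling back to `self.use_amp`, and finding neither attribute on the object. One plausible reading is that the installed `transformers` base class no longer sets either attribute. The sketch below reproduces that fallback pattern with hypothetical stand-in classes; `DefensiveTrainer` is an illustration only, not the library's fix.

```python
class LegacyStyleTrainer:
    """Hypothetical stand-in; it only mirrors the try/except shown in the traceback."""

    @property
    def _use_cuda_amp(self):
        try:
            return self.use_cuda_amp
        except AttributeError:
            return self.use_amp  # raises again if this attribute is missing too


class DefensiveTrainer:
    """Hypothetical variant that falls back to False instead of raising."""

    @property
    def _use_cuda_amp(self):
        return getattr(self, "use_cuda_amp", getattr(self, "use_amp", False))


try:
    LegacyStyleTrainer()._use_cuda_amp
except AttributeError as exc:
    # __context__ holds the first AttributeError raised inside the try block.
    chained_from = type(exc.__context__).__name__
    error_text = str(exc)

amp_enabled = DefensiveTrainer()._use_cuda_amp
```

Defaulting to `False` here is a design assumption: it treats missing attributes as "mixed precision disabled", which may or may not match the behavior the library intends.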

Contributor

rnyak commented Jan 30, 2024

This might be related to the Transformers version. Please check all the requirements and install the libraries accordingly.
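
One way to follow this advice is to read the requirements that the installed Transformers4Rec distribution itself declares and compare them with what is installed. A minimal sketch using the standard library; the distribution names `transformers4rec` and `transformers` are assumptions, and the declared metadata may not list every constraint the project documents.

```python
from importlib import metadata


def declared_requirements(dist_name, keyword):
    """Return requirement strings declared by dist_name that mention keyword."""
    try:
        requirements = metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return []
    return [req for req in requirements if keyword.lower() in req.lower()]


if __name__ == "__main__":
    # "transformers4rec" and "transformers" are assumed distribution names.
    for req in declared_requirements("transformers4rec", "transformers"):
        print("declared:", req)
```

An empty result means either that the distribution is missing or that it declares no matching requirement, so the project's requirements files remain the reference.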
