New Own Dataset #296

Okeke-Stephen · 2023-08-28T23:48:58Z

Hi there
Thank you for this great work.
I have been trying to use my own data to fine-tune models with the data loader below:
from datasets import load_dataset

#data = load_dataset("Abirate/english_quotes")

# Load dataset (you can process it here)
data = load_dataset("csv", data_files="train.csv", split="train")

# Shuffle the dataset

data = data.map(lambda samples: tokenizer(samples["text"]), batched=True)
data = data.shuffle(seed=42)
data = data.select(range(50))

Then I pass train_dataset=data["text"] to the trainer.

However, I keep getting errors, and the trainer does not recognize train_dataset=data["text"].
The code sample only seems to fine-tune models with the sample dataset you provided. Can this be resolved?

Regards,
Srt
