prompt_style #1416

fireyanci · 2024-05-13T16:16:10Z

I don't want to use the dataset styles listed in prompt styles: Dict. I want to use my own dataset style. How can I build my own dataset style to use with finetune/lora? My dataset style is:
{
  "conversation": [
    {
      "system": "This is like an instruction",
      "input": "",
      "output": ""
    }
  ]
}

fireyanci · 2024-05-14T08:42:10Z

Because I want to use multi-round conversation data.

rasbt · 2024-05-20T22:32:03Z

I think the easiest way here would be to use one of the existing datasets as a template. I remember that Deita had multi-turn questions in the dataset, so I added this as an option. Maybe this is helpful as a template for building your own dataset:

litgpt/litgpt/data/deita.py

Line 29 in cbbe9cd

include_multiturn_conversations: bool = False

But note that LitGPT otherwise doesn't do anything special for multi-turn data. It basically treats each multi-turn example as another regular input example during training.
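
To make the point above concrete, here is a minimal sketch of how the "conversation" format from the question could be flattened into regular single-turn examples. It is plain Python and does not use any LitGPT API. The function name `flatten_conversations`, the output keys `instruction`/`input`/`output` (Alpaca-style), the mapping of "system" to "instruction", and the rule that a "system" value carries over to later turns are all assumptions made for illustration. The `include_multiturn` flag only mirrors the idea behind `include_multiturn_conversations`; it is not the code from deita.py.

```python
import json


def flatten_conversations(records, include_multiturn=True):
    """Turn {"conversation": [...]} records into flat, single-turn samples.

    Hypothetical helper: the output keys follow an Alpaca-style layout,
    which is an assumption and not something the thread specifies.
    """
    samples = []
    for record in records:
        turns = record.get("conversation", [])
        if not include_multiturn:
            # Keep only the first turn of each conversation.
            turns = turns[:1]
        system = ""
        for turn in turns:
            # Assumption: a "system" value stays in effect until a later
            # turn provides a new one.
            system = turn.get("system", system)
            samples.append(
                {
                    "instruction": system,
                    "input": turn.get("input", ""),
                    "output": turn.get("output", ""),
                }
            )
    return samples


example = [
    {
        "conversation": [
            {
                "system": "This is like an instruction",
                "input": "Hi",
                "output": "Hello!",
            },
            {"input": "How are you?", "output": "Fine, thanks."},
        ]
    }
]

flat = flatten_conversations(example)
first_only = flatten_conversations(example, include_multiturn=False)
print(json.dumps(flat, indent=2))
```

Each turn becomes an independent sample here, and earlier turns are not prepended as context. If the model should see the conversation history, it would have to be concatenated into the input field explicitly.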

fireyanci · 2024-05-21T11:48:11Z

Thank you very much for your reply. I've read your explanation of DoRA, and it's excellent. Thank you. I hope to use it in the LitGPT project.

rasbt · 2024-05-21T14:18:14Z

Glad to hear you found it useful! There are currently so many todos, but yeah, adding DoRA to LitGPT some time would be great.

rasbt closed this as completed May 21, 2024
