Erroneous examples in Unnatural Instruction data #25

Open
qmdnls opened this issue May 16, 2023 · 0 comments
unnatural_instruction_gpt4_data.json contains many duplicate examples (same instruction and input) whose output and label fields appear mismatched. For example:

  {
    "instruction": "Write a function that takes in a list of integers and outputs the sum of all the integers.",
    "input": "List of integers: [1, 2, 3].",
    "output": "No.",
    "label": "Yes"
  },
  {
    "instruction": "Write a function that takes in a list of integers and outputs the sum of all the integers.",
    "input": "List of integers: [1, 2, 3].",
    "output": "['What color was the lamb's fleece?'] - 5",
    "label": "6\n4"
  },

and

  {
    "instruction": "You need to read a piece of text and then answer the questions based on what is stated in the text. The questions will ask you about specific details in the text, and your answers must be drawn directly from the text itself.",
    "input": "Text: In 1789, George Washington was inaugurated as the first president of United States of America. He was 70 years old at that time. John Adams was his vice president.",
    "output": "1. How old was George Washington when he was inaugurated as the first president of the United States in 1789?\n2. Who was his vice president at that time?\n3. What was the relationship between John Adams and George Washington during their time as president and vice president?",
    "label": "1. What year was George Washington inaugurated as the first president of the United States of America?\n\n1789"
  },
  {
    "instruction": "You need to read a piece of text and then answer the questions based on what is stated in the text. The questions will ask you about specific details in the text, and your answers must be drawn directly from the text itself.",
    "input": "Text: In 1789, George Washington was inaugurated as the first president of United States of America. He was 70 years old at that time. John Adams was his vice president.",
    "output": "['Karen', 'Mark'].",
    "label": "Hazel and Lauren went out for lunch together. Hazel had been wanting to try this new restaurant for months."
  },

In the first case, both examples are mismatched. In the second case, the first occurrence seems to be correct and the second seems to be wrong. There are many cases like this in the file.
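
A minimal sketch of how such cases could be listed, assuming the file is a single JSON array of objects with the `instruction`, `input`, `output` and `label` fields shown above. The function names `load_records` and `find_conflicting_duplicates` are hypothetical and not part of the repository.

```python
import json
from collections import defaultdict


def load_records(path):
    """Load the dataset, assuming it is one JSON array of objects."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def find_conflicting_duplicates(records):
    """Return groups of records that share the same (instruction, input)
    pair but carry more than one distinct output."""
    groups = defaultdict(list)
    for record in records:
        key = (record.get("instruction", ""), record.get("input", ""))
        groups[key].append(record)

    conflicts = {}
    for key, group in groups.items():
        distinct_outputs = {r.get("output") for r in group}
        # A group is suspicious only if it repeats AND the outputs disagree.
        if len(group) > 1 and len(distinct_outputs) > 1:
            conflicts[key] = group
    return conflicts


# Example usage (path assumed to be relative to the repository root):
# records = load_records("unnatural_instruction_gpt4_data.json")
# conflicts = find_conflicting_duplicates(records)
# print(len(conflicts), "instruction/input pairs have conflicting outputs")
```

This only flags duplicates whose outputs disagree. It does not decide which occurrence, if any, is the correct one.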

Possibly related to #2?
