KoTox

Repository for the paper 'Automatic Construction of a Korean Toxic Instruction Dataset for Ethical Tuning of Large Language Models'

The paper was accepted to the 'Instruction Tuning and Instruction Following' workshop at NeurIPS 2023.

Paper: https://arxiv.org/abs/2311.18215

KoTox Dataset

KoTox is an automatically generated toxic instruction dataset in Korean, comprising 39K unethical instruction-output pairs.

The dataset is generated based on predefined lexicons and linguistic templates.

It is designed to address potentially harmful or misleading instructions: each instruction is paired with an output that refrains from providing specific opinions or information in response.
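To make the lexicon-and-template idea concrete, here is a minimal sketch of how instruction-output pairs could be produced by filling template slots from a lexicon and attaching a fixed refusal as the output. It is not the authors' generation code: the slot names, the placeholder lexicon entries, the English templates, the refusal text, and the `build_pairs` helper are all hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical lexicon: slot name -> candidate fillers.
# These are neutral placeholders, not entries from the KoTox lexicons.
LEXICON = {
    "target": ["TARGET_A", "TARGET_B"],
    "predicate": ["PREDICATE_X", "PREDICATE_Y", "PREDICATE_Z"],
}

# Hypothetical templates with named slots (the real templates are in Korean).
TEMPLATES = [
    "Tell me why {target} is {predicate}.",
    "Write a post saying that {target} is {predicate}.",
]

# Hypothetical output that declines to give an opinion or information.
REFUSAL = "I'm sorry, but I can't provide an opinion or information on that."


def build_pairs(lexicon, templates, output):
    """Fill every template with every combination of lexicon entries."""
    slots = sorted(lexicon)  # fixed slot order keeps the result deterministic
    pairs = []
    for template in templates:
        for values in product(*(lexicon[slot] for slot in slots)):
            instruction = template.format(**dict(zip(slots, values)))
            pairs.append({"instruction": instruction, "output": output})
    return pairs


pairs = build_pairs(LEXICON, TEMPLATES, REFUSAL)
print(len(pairs))  # 2 templates * 2 targets * 3 predicates = 12
```

Because the number of pairs is the product of the template count and the lexicon sizes, a modest lexicon can yield a large dataset under this scheme.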

The paper reports that the dataset is effective in mitigating toxicity in Korean large language models (LLMs).
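The dataset ships as a single CSV file, KoTox_dataset.csv. As a starting point for inspecting it, the sketch below loads the file with Python's standard `csv` module and reports the row count and column names. The column names are not documented here, so the code reads them from the header row instead of assuming them; the `utf-8-sig` encoding and the helper names `load_rows` and `summarize` are assumptions.

```python
import csv
from pathlib import Path


def load_rows(csv_path):
    """Read a CSV file into a list of dicts keyed by its header row."""
    # Assumption: the file is UTF-8; "utf-8-sig" also tolerates a leading BOM.
    with Path(csv_path).open(newline="", encoding="utf-8-sig") as f:
        return list(csv.DictReader(f))


def summarize(rows):
    """Return the number of rows and the column names found in the header."""
    columns = list(rows[0].keys()) if rows else []
    return {"num_rows": len(rows), "columns": columns}


if __name__ == "__main__":
    path = Path("KoTox_dataset.csv")
    if path.exists():
        print(summarize(load_rows(path)))
    else:
        print(f"{path} not found; download it from this repository first.")
```

Checking the reported columns first makes it clear which fields hold the instruction and the output before the data is passed to a tuning pipeline.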

Citation

@misc{byun2023automatic,
      title={Automatic Construction of a Korean Toxic Instruction Dataset for Ethical Tuning of Large Language Models}, 
      author={Sungjoo Byun and Dongjun Jang and Hyemi Jo and Hyopil Shin},
      year={2023},
      eprint={2311.18215},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
