byunsj/KoTox-Korean-Toxic-Instruction-Dataset

KoTox

Repository for the paper 'Automatic Construction of a Korean Toxic Instruction Dataset for Ethical Tuning of Large Language Models'

The paper was accepted to the 'Instruction Tuning and Instruction Following' workshop at NeurIPS 2023.

Paper: https://arxiv.org/abs/2311.18215

KoTox Dataset

KoTox is an automatically generated toxic instruction dataset in Korean, comprising 39K unethical instruction-output pairs.

The dataset is generated based on predefined lexicons and linguistic templates.

It is designed to address potentially harmful or misleading instructions: each instruction is paired with an output that refrains from providing specific opinions or information in response.
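The template-and-lexicon construction described above can be illustrated with a minimal sketch. It is not the authors' generation code: the slot names, the placeholder lexicon entries, the English templates, and the refusal string are all hypothetical stand-ins chosen for illustration, and the actual KoTox lexicons and Korean templates are not reproduced here.

```python
from itertools import product

# Hypothetical placeholder lexicons (assumption): the real KoTox lexicons
# contain Korean terms and are not reproduced in this sketch.
LEXICONS = {
    "target": ["<TARGET_A>", "<TARGET_B>"],
    "predicate": ["<PREDICATE_X>", "<PREDICATE_Y>"],
}

# Hypothetical linguistic templates (assumption) with named slots.
TEMPLATES = [
    "Explain why {target} is {predicate}.",
    "Write a post saying {target} is {predicate}.",
]

# Hypothetical output (assumption) that declines to give an opinion or information.
REFUSAL = "I'm sorry, but I can't provide an opinion or information on that."


def build_pairs(templates, lexicons, output):
    """Fill every template with every combination of lexicon entries."""
    slots = sorted(lexicons)  # fixed slot order so the result is reproducible
    pairs = []
    for template in templates:
        for values in product(*(lexicons[slot] for slot in slots)):
            instruction = template.format(**dict(zip(slots, values)))
            pairs.append({"instruction": instruction, "output": output})
    return pairs


pairs = build_pairs(TEMPLATES, LEXICONS, REFUSAL)
```

With two templates and two entries per slot, the sketch yields eight instruction-output pairs. Enumerating the full Cartesian product is the simplest choice; a real pipeline would likely also filter out combinations that are ungrammatical or implausible.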

The dataset has been shown to be effective in mitigating toxicity in Korean large language models (LLMs).

Citation

@misc{byun2023automatic,
      title={Automatic Construction of a Korean Toxic Instruction Dataset for Ethical Tuning of Large Language Models}, 
      author={Sungjoo Byun and Dongjun Jang and Hyemi Jo and Hyopil Shin},
      year={2023},
      eprint={2311.18215},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
