ReflectionLLMMT

TasTe: Teaching Large Language Models to Translate through Self-Reflection

📣 News

🔗 Quick Links

🤖 About TasTe

TasTe (short for Teaching Large Language Models to Translate through Self-Reflection) is a framework designed as a two-stage inference process to enhance the translation quality of MT-LLMs. It consists of the following two stages:

  • Stage 1: Generate a preliminary translation (i.e., a draft) and simultaneously self-assess its quality.
  • Stage 2: Refine the preliminary translation according to the predicted quality level to obtain the final output (an illustrative sketch of both stages follows).
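
To make the flow concrete, here is a rough sketch of one input passing through both stages. The prompt wording and the quality label are purely illustrative, not the actual templates used by the data and scripts in this repo:

Stage 1 input:   Translate the source text and assess the quality of your translation: <source sentence>
Stage 1 output:  <draft translation> [quality: medium]
Stage 2 input:   Refine the draft according to its predicted quality: <source sentence> ||| <draft translation> [quality: medium]
Stage 2 output:  <final translation>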

To ensure sufficient capability across the entire self-reflective translation process, the LLMs are fine-tuned on a multi-task training set. The dataset consists of the following three parts (an illustrative instance of each is sketched after the list):

  • Basic Translation: A common parallel corpus that provides the LLMs with correct multilingual knowledge.
  • Quality Prediction: Source sentences and translation candidates paired with their evaluated COMET scores, equipping the LLMs with knowledge about translation quality and the ability to assess translations.
  • Draft Refinement: Preliminary translations with their COMET scores and enhanced translations, teaching the LLMs to refine drafts according to their quality scores.
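
For intuition, one training instance from each part might look roughly like the following. Field names, label formats, and scores are illustrative only; see data/train_tc.json, data/train_qe.json, and data/train_mt.json for the actual schema:

Basic Translation:  {"input": "Translate: <source>", "output": "<reference translation>"}
Quality Prediction: {"input": "Translate and assess: <source>", "output": "<candidate> [COMET: 75.3]"}
Draft Refinement:   {"input": "Refine: <source> ||| <draft> [COMET: 75.3]", "output": "<improved translation>"}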

[Figure: The framework of TasTe]

📜 File Structure

Directory     Contents
checkpoints/  Fine-tuned model checkpoints
data/         Experimental data
infer/        Testing scripts
results/      Testing outputs
train/        Fine-tuning scripts

🛠️ Requirements

🚀 Quick Start

Installation

git clone https://github.com/YutongWang1216/ReflectionLLMMT.git
cd ReflectionLLMMT
pip install -r requirements.txt
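
A quick import check can confirm that the environment is ready. This assumes the pinned requirements include PyTorch and Transformers, which fine-tuning scripts of this kind typically rely on; adjust to whatever requirements.txt actually pins:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import transformers; print(transformers.__version__)"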

Fine-tuning for TasTe models

(1) FixEmb: Tuning with Embedding Layers Fixed

(2) Full: Tuning with Full Parameters

Make sure to fill in the following parameters before running:

work_dir=/path/to/ReflectionLLMMT      # path to the ReflectionLLMMT root directory
model_name=name_of_your_model          # name your model, e.g. bloom_fixemb
settings=tc                            # training settings, choices=[tc, qe, mt]
premodel=/path/to/original/checkpoint  # path to the pretrained model checkpoint directory
GPU_NUM=8                              # number of available GPUs
GPU=0,1,2,3,4,5,6,7                    # GPU ids

There are three choices of training settings, corresponding to three different training sets (launch commands are sketched after the list):

  1. tc - Fine-tune with data in data/train_tc.json to get a TasTe model in Text Classification style.
  2. qe - Fine-tune with data in data/train_qe.json to get a TasTe model in Quality Estimation style.
  3. mt - Fine-tune with data in data/train_mt.json to get an MT-baseline model.
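
Once the parameters are filled in, launching a run is a single command. The script names below are hypothetical; check train/ for the actual entry points of the two tuning modes:

bash train/train_fixemb.sh    # (1) FixEmb: embedding layers fixed (hypothetical name)
bash train/train_full.sh      # (2) Full: all parameters tuned (hypothetical name)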

Evaluating TasTe models

Make sure to fill in the following parameters before running:

work_dir=/path/to/ReflectionLLMMT  # path to the ReflectionLLMMT root directory
lang=zh-en                         # language pair to be tested in, choices=['zh-en', 'en-zh', 'de-en', 'en-de']
test_model=name_of_model           # the name you gave your fine-tuned model
settings=tc                        # model settings, choices=[tc, qe, mt]
GPU_NUM=8                          # number of available GPUs

There are also three choices of testing settings, corresponding to the three training settings (a launch command is sketched after the list):

  1. tc - Test a TasTe model in Text Classification style.
  2. qe - Test a TasTe model in Quality Estimation style.
  3. mt - Test an MT-baseline model.
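
As with training, evaluation is launched with a single command once the parameters are set. The script name below is hypothetical; check infer/ for the actual entry point. Outputs are written under results/:

bash infer/run_test.sh    # hypothetical name; evaluates the chosen model on the chosen language pair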

📝 Citation

If you find this repo useful, please cite our paper as: