Skip to content

Latest commit

 

History

History
21 lines (17 loc) · 1.55 KB

changelog.md

File metadata and controls

21 lines (17 loc) · 1.55 KB

Change info

v0.6

  • [2024-04-19] 👉 Added LLaMa3 support and first models. 👈
  • [2024-04-12] Added ORPO training as an RLHF substitute.

v0.5

  • [2024-02-08] Added DPO training as an RLHF substitute.
  • [2024-02-08] Added more translation methods like Tower Instruct (LLM) and Google Gemini via API (no GPU required)

v0.4

  • [2024-01-18] The create threads script has been removed. We now directly use the chat template provided by the base model's tokenizer, thus supporting mutliple chat/instruct/prompt templates.

v0.3

  • [2024-01-12] You can now benchmark different translation models using benchmark.py.
  • [2024-01-09] We have significantly refactored the translation process. Please follow the readme carefully if you come from v0.2.
  • [2024-01-09] We now support translation through M2M.
  • [2024-01-04] We now support translation through MADLAD. Especially for models where Helsinki has a low BLEU score (less than 40), MADLAD (or the faster M2M) is preferred. Using MADLAD drastically slows down training time, especially if you quantize (4 bit is even slower than 8 bit).
  • [2024-01-04] We now use argparser to parse command line arguments. Make sure you update your calls to our scripts accordingly. Use -h on all scripts to get help.

v0.2

  • [2023-12-29] We now batch translations in translate.py for a 30-60% speed increase. If you have checkpoints from before this date, you can not continue using the main branch but instead must use the v0.1 branch.