human-feedback

The ParroT framework to enhance and regulate the Translation Abilities during Chat based on open-sourced LLMs (e.g., LLaMA-7b, Bloomz-7b1-mt) and human written translation and evaluation data.

machine-translation llama lora contrastive gpt-4 chatgpt human-feedback instruction-tuning bloomz error-guided

Updated Oct 12, 2023
Python

conceptofmind / LaMDA-rlhf-pytorch

Star

Open-source pre-training implementation of Google's LaMDA in PyTorch. Adding RLHF similar to ChatGPT.

machine-learning reinforcement-learning deep-learning transformers artificial-intelligence attention-mechanism human-feedback

Updated Feb 24, 2024
Python

opendilab / awesome-RLHF

Star

A curated list of reinforcement learning with human feedback resources (continually updated)

reinforcement-learning deep-learning deep-reinforcement-learning large-language-models human-feedback rlhf

Updated May 10, 2024

lucidrains / PaLM-rlhf-pytorch

Star

Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM

reinforcement-learning deep-learning transformers artificial-intelligence attention-mechanisms human-feedback

Updated Jan 14, 2024
Python

Improve this page

Add a description, image, and links to the human-feedback topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the human-feedback topic, visit your repo's landing page and select "manage topics."

Learn more

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

human-feedback

Here are 15 public repositories matching this topic...

01Kevin01 / awesome-RLHF-Turkish

ZiyiZhang27 / tdpo

victor-iyi / rlhf-trl

gao-g / prelude

AlaaLab / pathologist-in-the-loop

HannahKirk / prism-alignment

PKU-Alignment / beavertails

yk7333 / d3po

trubrics / trubrics-sdk

huggingface / data-is-better-together

xrsrke / instructGOOSE

wxjiao / ParroT

conceptofmind / LaMDA-rlhf-pytorch

opendilab / awesome-RLHF

lucidrains / PaLM-rlhf-pytorch

Improve this page

Add this topic to your repo