rlhf

Here are 109 public repositories matching this topic...

MOONLAPSED / cognOS

Python package for the cognosis kb, syntax, and markup language. Under construction.

agent rlhf local-llm llama2

Updated Apr 8, 2024
Python

AugustasMacijauskas / mlmi-thesis

Code for my thesis titled "Eliciting latent knowledge from language reward models" for the MPhil in Machine Learning and Machine Intelligence at the University of Cambridge

alignment interpretability rlhf

Updated Oct 5, 2023
Jupyter Notebook

This project fine-tunes an LLM (FLAN-T5) for a text summarisation task using a PEFT approach. Evaluation is based on ROUGE scores, and LoRA is used for fine-tuning.

lora ppo peft ppo-agent huggingface-transformers rlhf flan-t5 llm-training

Updated Aug 8, 2023
Jupyter Notebook

colehaus / social-choice-rlhf

An alternative RLHF reward model formulation from a social choice perspective

rlhf

Updated Apr 7, 2024
Python

jianzhnie / awesome-open-chatgpt

Open efforts to implement ChatGPT-like models and beyond.

llama gpt4 chatgpt rlhf instruct-gpt

Updated May 10, 2023

FareedKhan-dev / Improve-Weak-LLM-Using-SPIN-Technique

After RLHF and SFT showed promising results, a new technique named SPIN was introduced in 2024.

gemini finetuning large-language-models llm rlhf

Updated Jan 17, 2024

vualidon / rewrite_retrieve_read_law

RAG law system based on Google Search and Gemini Pro

law rag google-search-api llm rlhf gemini-pro

Updated Mar 14, 2024
Python

himanshuvnm / Foundation-Model-Large-Language-Model-FM-LLM

This repository was committed while carrying out important tasks on which modern generative AI concepts are built. In particular, we focussed on three coding tasks with large language models. Further necessary details are given in the README.md file.

aws python3 pytorch lora rnn-pytorch attention-is-all-you-need fine-tuning hate-speech-detection huggingface huggingface-transformers foundation-models large-language-models generative-ai rlhf flan-t5 peft-fine-tuning-llm ml-m5-2xlarge low-rank-ada

Updated Mar 28, 2024
Jupyter Notebook

AMfeta99 / NLP_LLM

This repository is dedicated to small projects and some theoretical material that I used to get into NLP and LLM in a practical and efficient way.

Updated May 6, 2024
Jupyter Notebook

akain0 / Reinforcement-Learning-

Projects and Models built in Python leveraging PyTorch, implementing Reinforcement Learning algorithms for reward-based tasks.

reinforcement-learning reinforcement-learning-algorithms a3c lstm-neural-networks bellman-equation rlhf

Updated May 7, 2024
Jupyter Notebook

jddunn / rlhf

Library built on TextRL for easy training and usage of fine-tuned models using RLHF, a reward model, and PPO

ppo rlhf reward-model textrl

Updated Feb 28, 2024
Python

lyndskg / ChatGPT4Me

A program that enhances and customizes ChatGPT's underlying pre-trained LLM with a transformer architecture. Based on OpenAI's beta InstructGPT fine-tuned model.

supervised-learning gpt fine-tuning gpt-3 llm chatgpt chatgpt-api chatgpt3 rlhf instructgpt