⚗️ distilabel is a framework for synthetic data and AI feedback for AI engineers who require high-quality outputs, full data ownership, and overall efficiency.
Updated May 10, 2024 - Python
Unify Efficient Fine-Tuning of 100+ LLMs
Xtreme1 is an all-in-one data labeling and annotation platform for multimodal data training and supports 3D LiDAR point cloud, image, and LLM.
A curated list of reinforcement learning from human feedback resources (continually updated)
RewardBench: the first evaluation tool for reward models.
Argilla is a collaboration platform for AI engineers and domain experts who require high-quality outputs, full data ownership, and overall efficiency.
Implementation of ChatGPT-style RLHF (Reinforcement Learning from Human Feedback) on any generation model in Hugging Face's transformers (bloomz-176B/BLOOM/GPT/BART/T5/MetaICL)
Robust recipes to align language models with human and AI preferences
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Tracking instruction-tuned LLM openness. Paper: Liesenfeld, Andreas, Alianda Lopez, and Mark Dingemanse. 2023. “Opening up ChatGPT: Tracking Openness, Transparency, and Accountability in Instruction-Tuned Text Generators.” In Proceedings of the 5th International Conference on Conversational User Interfaces. doi:10.1145/3571884.3604316.
Projects and Models built in Python leveraging PyTorch, implementing Reinforcement Learning algorithms for reward-based tasks.
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
This repository contains small projects and some theoretical material that I used to get into NLP and LLMs in a practical and efficient way.
Codebase and experiments for large language modeling (LLM)
Python client library for improving your LLM app accuracy
Train a tiny LLaMA model from scratch to repeat your words using Reinforcement Learning from Human Feedback (RLHF)