🐢 Open-Source Evaluation & Testing framework for LLMs and ML models
Updated May 10, 2024 - Python
LLM App templates for RAG, knowledge mining, and stream analytics. Ready to run with Docker, ⚡ in sync with your data sources.
The Security Toolkit for LLM Interactions
⚡ Vigil ⚡ Detect prompt injections, jailbreaks, and other potentially risky Large Language Model (LLM) inputs
An easy-to-use Python framework to generate adversarial jailbreak prompts.
Papers and resources related to the security and privacy of LLMs 🤖
Prompt injection attacks and defenses in LLM-integrated applications
The fastest && easiest LLM security and privacy guardrails for GenAI apps.
AI-driven Threat modeling-as-a-Code (TaaC-AI)
Agentic LLM Vulnerability Scanner
LLM Platform Security: Applying a Systematic Evaluation Framework to OpenAI's ChatGPT Plugins
This project investigates the security of large language models by performing binary classification on a set of input prompts to detect malicious ones. Several approaches are analyzed: classical ML algorithms, a trained LLM, and a fine-tuned LLM.
Risks and targets for assessing LLMs & LLM vulnerabilities
Framework for LLM evaluation, guardrails and security
LLM security and privacy
A benchmark for prompt injection detection systems.
SecGPT: An execution isolation architecture for LLM-based systems
Whispers in the Machine: Confidentiality in LLM-integrated Systems
MER is software that identifies and highlights manipulative communication in text from human conversations and AI-generated responses. MER benchmarks language models for manipulative expressions, fostering the development of transparency and safety in AI. It also supports victims of manipulation by detecting manipulative patterns in human communication.