prompt guard Download - prompt guard Source code download

English

中文(简体) 中文(繁体) 한국어 日本語 English Português Español Русский العربية Indonesia Deutsch Français ภาษาไทย

prompt guard

AI Source Code

1.0.0

Download

prompt-guard

Prompt Guard is a classifier model by Meta, trained on a large corpus of attacks, capable of detecting both explicitly malicious prompts (jailbreaks) as well as data that contains injected inputs (prompt injections). Upon analysis, it returns one or more of the following verdicts, along with a confidence score for each:

INJECTION
JAILBREAK
BENIGN

This repository contains a Streamlit app for testing Prompt Guard. Note that you'll need an HuggingFace access token to access the model. For a more detailed writeup, see this blog post.

Here's a sample response by Prompt Guard upon detecting a prompt injection attempt.

prompt-guard-injection