LLM SECURITY
1.0.0
Links to articles, tools, papers, books, etc. that contain useful educational materials relevant to the LLM Security project.
Publication | Author | Date | Title and Link |
---|---|---|---|
WithSecure Labs | Benjamin Hull, Donato Capitella | 08-Apr-24 | Domain-specific prompt injection detection with BERT classifier |
WithSecure Labs | Donato Capitella | 21-Feb-24 | Should you let ChatGPT control your browser? / YouTube Video |
Prompt Injection Explanation with video examples | Arnav Bathla | 12-Dec-23 | Prompt Injection Explanation with video examples |
WithSecure Labs | Donato Capitella | 04-Dec-23 | A Case Study in Prompt Injection for ReAct LLM Agents/ YouTube Video |
Cyber Security Against AI Wiki | Aditya Rana | 04-Dec-23 | Cyber Security AI Wiki |
iFood Cybersec Team | Emanuel Valente | 04-Sep-23 | Prompt Injection: Exploring, Preventing & Identifying Langchain Vulnerabilities |
Sandy Dunn | 15-Oct-23 | AI Threat Mind Map | |
Medium | Ken Huang | 11-Jun-23 | LLM-Powered Applications’ Architecture Patterns and Security Controls |
Medium | Avinash Sinha | 02-Feb-23 | AI-ChatGPT-Decision Making Ability- An Over Friendly Conversation with ChatGPT |
Medium | Avinash Sinha | 06-Feb-23 | AI-ChatGPT-Decision Making Ability- Hacking the Psychology of ChatGPT- ChatGPT Vs Siri |
Wired | Matt Burgess | 13-Apr-23 | The Hacking of ChatGPT Is Just Getting Started |
The Math Company | Arjun Menon | 23-Jan-23 | Data Poisoning and Its Impact on the AI Ecosystem |
IEEE Spectrum | Payal Dhar | 24-Mar-23 | Protecting AI Models from “Data Poisoning” |
AMB Crypto | Suzuki Shillsalot | 30-Apr-23 | Here’s how anyone can Jailbreak ChatGPT with these top 4 methods |
Techopedia | Kaushik Pal | 22-Apr-23 | What is Jailbreaking in AI models like ChatGPT? |
The Register | Thomas Claburn | 26-Apr-23 | How prompt injection attacks hijack today's top-end AI – and it's tough to fix |
Itemis | Rafael Tappe Maestro | 14-Feb-23 | The Rise of Large Language Models ~ Part 2: Model Attacks, Exploits, and Vulnerabilities |
Hidden Layer | Eoin Wickens, Marta Janus | 23-Mar-23 | The Dark Side of Large Language Models: Part 1 |
Hidden Layer | Eoin Wickens, Marta Janus | 24-Mar-23 | The Dark Side of Large Language Models: Part 2 |
Embrace the Red | Johann Rehberger (wunderwuzzi) | 29-Mar-23 | AI Injections: Direct and Indirect Prompt Injections and Their Implications |
Embrace the Red | Johann Rehberger (wunderwuzzi) | 15-Apr-23 | Don't blindly trust LLM responses. Threats to chatbots |
MufeedDVH | Mufeed | 9-Dec-22 | Security in the age of LLMs |
danielmiessler.com | Daniel Miessler | 15-May-23 | The AI Attack Surface Map v1.0 |
Dark Reading | Gary McGraw | 20-Apr-23 | Expert Insight: Dangers of Using Large Language Models Before They Are Baked |
Honeycomb.io | Phillip Carter | 25-May-23 | All the Hard Stuff Nobody Talks About when Building Products with LLMs |
Wired | Matt Burgess | 25-May-23 | The Security Hole at the Heart of ChatGPT and Bing |
BizPacReview | Terresa Monroe-Hamilton | 30-May-23 | ‘I was unaware’: NY attorney faces sanctions after using ChatGPT to write brief filled with ‘bogus’ citations |
Washington Post | Pranshu Verma | 18-May-23 | A professor accused his class of using ChatGPT, putting diplomas in jeopardy |
Kudelski Security Research | Nathan Hamiel | 25-May-23 | Reducing The Impact of Prompt Injection Attacks Through Design |
AI Village | GTKlondike | 7-June-23 | Threat Modeling LLM Applications |
Embrace the Red | Johann Rehberger | 28-Mar-23 | ChatGPT Plugin Exploit Explained |
NVIDIA Developer | Will Pearce, Joseph Lucas | 14-Jun-23 | NVIDIA AI Red Team: An Introduction |
Kanaries | Naomi Clarkson | 7-Apr-23 | Google Bard Jailbreak |
Institution | Date | Title and Link |
---|---|---|
NIST | 8-March-2023 | White Paper NIST AI 100-2e2023 (Draft) |
UK Information Commisioner's Office (ICO) | 3-April-2023 | Generative AI: eight questions that developers and users need to ask |
UK National Cyber Security Centre (NCSC) | 2-June-2023 | ChatGPT and large language models: what's the risk? |
UK National Cyber Security Centre (NCSC) | 31 August 2022 | Principles for the security of machine learning |
European Parliament | 31 August 2022 | EU AI Act: first regulation on artificial intelligence |
Publication | Author | Date | Title and Link |
---|---|---|---|
Arxiv | Samuel Gehman, et al | 24-Sep-20 | REALTOXICITYPROMPTS: Evaluating Neural Toxic Degeneration in Language Models |
Arxiv | Fabio Perez, Ian Ribeiro | 17-Nov-22 | Ignore Previous Prompt: Attack Techniques For Language Models |
Arxiv | Nicholas Carlini, et al | 14-Dec-20 | Extracting Training Data from Large Language Models |
NCC Group | Chris Anley | 06-Jul-22 | Practical Attacks on Machine Learning Systems |
NCC Group | Jose Selvi | 5-Dec-22 | Exploring Prompt Injection Attacks |
Arxiv | Varshini Subhash | 22-Feb-2023 | Can Large Language Models Change User Preference Adversarially? |
? | Jing Yang et al | 23 May 2023 | A Systematic Literature Review of Information Security in Chatbots |
Arxiv | Isaac et al | 18 April 2023 | AI Product Security: A Primer for Developers |
OpenAI | OpenAI | 15-Mar-23 | GPT-4 Technical Report |
Arxiv | Kai Greshake, et al | 05-May-23 | Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection |
Arxiv | Alexander Wan, et al | 01-May-23 | Poisoning Language Models During Instruction Tuning |
Arxiv | Leon Derczynski, et al | 31-Mar-23 | Assessing Language Model Deployment with Risk Cards |
Arxiv | Jan von der Assen, et al | 11-Mar-24 | Asset-driven Threat Modeling for AI-based Systems |
Publication | Author | Date | Title and Link |
---|---|---|---|
Deloitte | Deloitte AI Institute | 13-Mar-23 | A new frontier in artificial intelligence - Implications of Generative AI for businesses |
Team8 | Team8 CISO Village | 18-Apr-23 | Generative AI and ChatGPT Enterprise Risks |
Trail of Bits | Heidy Khlaaf | 7-Mar-23 | Toward Comprehensive Risk Assessments and Assurance of AI-Based Systems |
Security Implications of ChatGPT | Cloud Security Alliance (CSA) | 23-Apr-2023 | Security Implications of ChatGPT |
Service | Channel | Date | Title and Link |
---|---|---|---|
YouTube | LLM Chronicles | 29-Mar-24 | Prompt Injection in LLM Browser Agents |
YouTube | Layerup | 03-Mar-24 | GenAI Worms Explained: The Emerging Cyber Threat to LLMs |
YouTube | RALFKAIROS | 05-Feb-23 | ChatGPT for Attack and Defense- AI Risks: Privacy, IP, Phishing, Ransomware-By Avinash Sinha |
YouTube | AI Explained | 25-Mar-23 | 'Governing Superintelligence' - Synthetic Pathogens, The Tree of Thoughts Paper and Self-Awareness |
YouTube | LiveOverflow | 14-Apr-23 | 'Attacking LLM - Prompt Injection' |
YouTube | LiveOverflow | 27-Apr-23 | 'Accidental LLM Backdoor - Prompt Tricks' |
YouTube | LiveOverflow | 11-May-23 | 'Defending LLM - Prompt Injection' |
YouTube | Cloud Security Podcast | 30-May-23 | 'CAN LLMs BE ATTACKED!' |
YouTube | API Days | 28-Jun-23 | Language AI Security at the API level: Avoiding Hacks, Injections and Breaches |
Service | Channel | Date | Title and Link |
---|---|---|---|
YouTube | API Days | 28-Jun-23 | Securing LLM and NLP APIs: A Journey to Avoiding Data breaches, Attacks and More |
Name | Type | Note | Link |
---|---|---|---|
SecDim | Attack and Defence | An attack and defence challenge where players should protect their chatbot secret phrase while attacking other players chatbot to exfiltrate theirs. | https://play.secdim.com/game/ai-battle |
GPT Prompt Attack | Attack | Goal of this game is to come up with the shortest user input that tricks the system prompt into returning the secret key back to you. | https://ggpt.43z.one |
Gandalf | Attack | Your goal is to make Gandalf reveal the secret password for each level. However, Gandalf will level up each time you guess the password, and will try harder not to give it away | https://gandalf.lakera.ai |