Defending LLM - Prompt Injection

Channel:
Subscribers:
921,000
Published on ● Video Link: https://www.youtube.com/watch?v=VbNPZ1n6_vY



Duration: 17:12
45,820 views
2,541


After we explored attacking LLMs, in this video we finally talk about defending against prompt injections. Is it even possible?

Buy my shitty font (advertisement): shop.liveoverflow.com

Watch the complete AI series:
https://www.youtube.com/playlist?list=PLhixgUqwRTjzerY4bJgwpxCLyfqNYwDVB

Language Models are Few-Shot Learners: https://arxiv.org/pdf/2005.14165.pdf
A Holistic Approach to Undesired Content Detection in the Real World: https://arxiv.org/pdf/2208.03274.pdf

Chapters:
00:00 - Intro
00:43 - AI Threat Model?
01:51 - Inherently Vulnerable to Prompt Injections
03:00 - It's not a Bug, it's a Feature!
04:49 - Don't Trust User Input
06:29 - Change the Prompt Design
08:07 - User Isolation
09:45 - Focus LLM on a Task
10:42 - Few-Shot Prompt
11:45 - Fine-Tuning Model
13:07 - Restrict Input Length
13:31 - Temperature 0
14:35 - Redundancy in Critical Systems
15:29 - Conclusion
16:21 - Checkout LiveOverfont

Hip Hop Rap Instrumental (Crying Over You) by christophermorrow
https://soundcloud.com/chris-morrow-3 CC BY 3.0
Free Download / Stream: http://bit.ly/2AHA5G9
Music promoted by Audio Library https://youtu.be/hiYs5z4xdBU

=[ ❤️ Support ]=

→ per Video: https://www.patreon.com/join/liveoverflow
→ per Month: https://www.youtube.com/channel/UClcE-kVhqyiHCcjYwcpfj9w/join

2nd Channel: https://www.youtube.com/LiveUnderflow

=[ 🐕 Social ]=

→ Twitter: https://twitter.com/LiveOverflow/
→ Streaming: https://twitch.tvLiveOverflow/
→ TikTok: https://www.tiktok.com/@liveoverflow_
→ Instagram: https://instagram.com/LiveOverflow/
→ Blog: https://liveoverflow.com/
→ Subreddit: https://www.reddit.com/r/LiveOverflow/
→ Facebook: https://www.facebook.com/LiveOverflow/







Tags:
Live Overflow
liveoverflow
hacking tutorial
how to hack
exploit tutorial
prompt injection
openai
open ai API
llm
llms
llama
palm
llm defense
secure AI
securing llm
large language model
attacking llm
attacking ai
ai security
stop prompt injection