Accidental LLM Backdoor - Prompt Tricks

Channel: LiveOverflow

Subscribers: 921,000

Published on April 27, 2023 3:48:34 PM ● Video Link: https://www.youtube.com/watch?v=h74oXb4Kk8k

Duration: 12:07

139,111 views

7,632

In this video we explore various prompt tricks to manipulate the AI to respond in ways we want, even when the system instructions want something else. This can help us better understand the limitations of LLMs.

Get my font (advertisement): https://shop.liveoverflow.com

Watch the complete AI series:
https://www.youtube.com/playlist?list=PLhixgUqwRTjzerY4bJgwpxCLyfqNYwDVB

The Game: https://gpa.43z.one
The OpenAI API costs are fairly high, so if you want to play the game, use the OpenAI Playground with your own account: https://platform.openai.com/playground?mode=chat

Chapters:
00:00 - Intro
00:39 - Content Moderation Experiment with Chat API
02:19 - Learning to Attack LLMs
03:06 - Attack 1: Single Symbol Differences
03:51 - Attack 2: Context Switch to Write Stories
05:20 - Attack 3: Large Attacker Inputs
06:31 - Attack 4: TLDR Backdoor
08:27 - "This is just a game"
08:56 - Attack 5: Different Languages
09:19 - Attack 6: Translate Text
10:30 - Quote about LLM Based Games
11:11 - advertisement shop.liveoverflow.com

=[ ❤️ Support ]=

→ per Video: https://www.patreon.com/join/liveoverflow
→ per Month: https://www.youtube.com/channel/UClcE-kVhqyiHCcjYwcpfj9w/join

2nd Channel: https://www.youtube.com/LiveUnderflow

=[ 🐕 Social ]=

→ Twitter: https://twitter.com/LiveOverflow/
→ Streaming: https://twitch.tv/LiveOverflow/
→ TikTok: https://www.tiktok.com/@liveoverflow_
→ Instagram: https://instagram.com/LiveOverflow/
→ Blog: https://liveoverflow.com/
→ Subreddit: https://www.reddit.com/r/LiveOverflow/
→ Facebook: https://www.facebook.com/LiveOverflow/

Other Videos By LiveOverflow

2023-08-18	The Discovery of Zenbleed ft. Tavis Ormandy
2023-08-01	Asking Android Developers About Security at Droidcon Berlin
2023-07-22	Local Root Exploit in HospitalRun Software
2023-07-13	Android App Bug Bounty Secrets
2023-07-03	Generic HTML Sanitizer Bypass Investigation
2023-06-22	Hacking Google Cloud?
2023-06-11	Trying to Find a Bug in WordPress
2023-05-31	Authentication Bypass Using Root Array
2023-05-22	My YouTube Financials - The Future of LiveOverflow
2023-05-11	Defending LLM - Prompt Injection
2023-04-27	Accidental LLM Backdoor - Prompt Tricks
2023-04-14	Attacking LLM - Prompt Injection
2023-04-01	Our Future As Hackers Is At Stake!
2023-03-29	Cyber Security Challenge Germany (2023)
2023-03-20	Cybercrime is Not Hacking!
2023-03-11	Attacking Language Server JSON RPC
2023-03-03	Advanced Teleport Hack (stolen from cheaters)
2023-02-17	VPNs, Proxies and Secure Tunnels Explained (Deepdive)
2023-01-31	Velocity Exploit on Paper?
2023-01-12	I’m moving, no videos sorry
2023-01-01	Computer Networking (Deepdive)

Tags:

Live Overflow

liveoverflow

hacking tutorial

how to hack

exploit tutorial

prompt engineer

openai

gpt-3

gpt-4

chatgpt

openai api

prompt hacking

prompt injection

prompt tricks

tldr

ai backdoor

gpt backdoor

llm

neural network

backdooring
