TY - CPAPER
T1 - What Was Your Prompt? A Remote Keylogging Attack on AI Assistants
AU - Weiss, Roy
AU - Ayzenshteyn, Daniel
AU - Amit, Guy
AU - Mirsky, Yisroel
N1 - Publisher Copyright:
© USENIX Security Symposium 2024. All rights reserved.
PY - 2024/1/1
Y1 - 2024/1/1
N2 - AI assistants are becoming an integral part of society, used for asking advice or help with personal and confidential issues. In this paper, we unveil a novel side-channel that can be used to read encrypted responses from AI assistants over the web: the token-length side-channel. The side-channel reveals the character lengths of a response's tokens (akin to word lengths). We found that many vendors, including OpenAI and Microsoft, had this side-channel prior to our disclosure. However, inferring a response's content with this side-channel is challenging, because even with knowledge of token lengths, a response can have hundreds of words, resulting in millions of grammatically correct sentences. In this paper, we show how this can be overcome by (1) utilizing the power of a large language model (LLM) to translate these token-length sequences, (2) providing the LLM with inter-sentence context to narrow the search space, and (3) performing a known-plaintext attack by fine-tuning the model on the target model's writing style. Using these methods, we were able to accurately reconstruct 27% of an AI assistant's responses and successfully infer the topic from 53% of them. To demonstrate the threat, we performed the attack on OpenAI's ChatGPT-4 and Microsoft's Copilot on both browser and API traffic.
UR - http://www.scopus.com/inward/record.url?scp=85204958095&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85204958095
T3 - Proceedings of the 33rd USENIX Security Symposium
SP - 3367
EP - 3384
BT - Proceedings of the 33rd USENIX Security Symposium
PB - USENIX Association
T2 - 33rd USENIX Security Symposium, USENIX Security 2024
Y2 - 14 August 2024 through 16 August 2024
ER -