From Similarity to Vulnerability: Key Collision Attack on LLM...
Semantic caching has emerged as a pivotal technique for scaling LLM applications, widely adopted by major providers including AWS and Microsoft. By utilizing semantic embedding vectors as cache...
arxiv.org