Ziyang You and colleagues have published a paper on arXiv titled "Seed Hijacking of LLM Sampling and Quantum Random Number Defense". The study examines a vulnerability in large language models (LLMs) arising from their reliance on pseudorandom number generators (PRNGs) for autoregressive sampling.
The research identifies a critical supply-chain attack surface that existing defenses have overlooked. The authors introduce SeedHijack, a backdoor attack that manipulates PRNG outputs so that attacker-specified tokens are selected without modifying the model's logits.
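To see why controlling the PRNG suffices, note that stochastic decoding draws the next token from the softmax distribution using a random number stream; whoever controls that stream can steer the draw. The sketch below is a toy illustration of the principle (it is not the paper's SeedHijack implementation): with logits held fixed, an attacker searches for a seed under which the sampler emits a chosen low-probability token. The function names and the brute-force seed search are illustrative assumptions.

```python
import torch

def sample_token(logits: torch.Tensor, seed: int) -> int:
    """Sample one token from softmax(logits) using a seeded PRNG."""
    gen = torch.Generator().manual_seed(seed)
    probs = torch.softmax(logits, dim=-1)
    return int(torch.multinomial(probs, num_samples=1, generator=gen).item())

def hijack_seed(logits: torch.Tensor, target: int, max_tries: int = 10_000):
    """Toy attack: find a seed that makes the sampler emit `target`."""
    for seed in range(max_tries):
        if sample_token(logits, seed) == target:
            return seed
    return None

logits = torch.tensor([2.0, 1.0, 0.5, -1.0])  # toy 4-token vocabulary
target = 3                                    # low-probability token (~3%)
seed = hijack_seed(logits, target)
if seed is not None:
    # Logits are untouched; only the randomness source changed.
    assert sample_token(logits, seed) == target
```

The point of the sketch is that the attack leaves the model's outputs (logits) bit-for-bit identical, which is why logit-level defenses miss it.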
The attack was tested on GPT-2 (124M) across 540 trials, achieving 99.6% exact token injection across nine sampling configurations. It also reached 100% success on four aligned models (1.5B-7B, covering RLHF, SFT, and reasoning distillation), bypassing every alignment method tested in the study.
The paper also proposes a defense based on a hardware quantum random number generator (QRNG), which neutralizes the attack within the evaluated threat model. The QRNG-based defense showed negligible median overhead: a +0.6% latency increase and +7.7 MB of additional memory.
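The defense works because a hardware entropy source cannot be pre-computed or replayed the way a seedable software PRNG can. A minimal sketch of the idea, using OS entropy (`os.urandom`) as a stand-in for the paper's hardware QRNG, which is an assumption for illustration only:

```python
import os
import torch

def sample_token_hw(logits: torch.Tensor) -> int:
    """Sample a token using fresh OS entropy (stand-in for a hardware QRNG)."""
    # 8 fresh bytes per token: there is no seed for an attacker to hijack.
    u = int.from_bytes(os.urandom(8), "big") / 2**64  # uniform in [0, 1)
    probs = torch.softmax(logits, dim=-1)
    cdf = torch.cumsum(probs, dim=-1)
    # Inverse-CDF sampling; clamp guards against float rounding at u ~ 1.
    idx = int(torch.searchsorted(cdf, torch.tensor(u)).item())
    return min(idx, logits.numel() - 1)

logits = torch.tensor([2.0, 1.0, 0.5, -1.0])
token = sample_token_hw(logits)  # non-reproducible by design
```

Because each draw consumes fresh, unpredictable entropy, the seed-search attack sketched above has no fixed seed to target, at the cost of losing run-to-run reproducibility.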
The study's abstract highlights a critical sampling-layer vulnerability and offers a practical, deployable QRNG-based defense. The paper is categorized under Cryptography and Security, Artificial Intelligence, and Machine Learning.
The paper is available on arXiv under the identifier 2605.08313.
The authors are Ziyang You, Xiaoke Yang, Zhanling Fan, Feng Guo, Xiaogen Zhou, and Xuxing Lu.