This paper studies how a jailbreak can propagate across multi-agent systems and proposes a foresight-guided defense to stop the spread early. It matters for builders shipping agent swarms, where one compromised agent can cascade into a system-wide failure.
arXiv:2605.01758v1. Abstract: Large multimodal model-based Multi-Agent Systems (MASs) enable collaborative complex problem solving through specialized agents. However, MASs are vulnerable to infectious jailbreak, where compromising a single agent can spread to others, leading to widespread compromise. Existing defenses counter this by training a more contagious cure factor, biasing agents to retrieve it over virus adversarial examples (VirAEs). However, this homogenizes agent…