Researchers at ETH Zurich created a jailbreak attack that bypasses AI guardrails
Published 10 months ago
Artificial intelligence models that rely on human feedback to ensure their outputs are harmless and helpful may be universally vulnerable to so-called ‘poisoning’ attacks.