Alexander Wei

Researcher at UC Berkeley working on large language model safety, adversarial robustness, and understanding failure modes of safety training in neural networks.

Sources in this wiki

Jailbroken: How Does LLM Safety Training Fail?

Topics

LLM Safety and Adversarial Robustness, Adversarial Attacks