-
Stealing finetuning data with corrupted models
Can corrupted transformer models steal sensitive finetuning data through maliciously inserted "data traps"?
-
AI’s role in cybersecurity
How will Artificial Intelligence change cybersecurity, and what are the implications for Europe?
-
Are aligned neural networks adversarially aligned?
Notes on "Are aligned neural networks adversarially aligned?" by Carlini, N., and others (NeurIPS 2023).
-
Happy New Year! Freud x Barto
Can Freud's tripartite personality model help design better reinforcement learning agents?
-
Sherlock Holmes is as good as 48% dead when his train pulls out from Victoria Station
How did von Neumann derive this claim using game theory?