Alexander Pan on the MACHIAVELLI benchmark

I talked to Alexander Pan, a first-year student at Berkeley working with Jacob Steinhardt, about his paper "Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark", accepted as an oral presentation at ICML. YouTube: https://youtu.be/MjkSETpoFlY Paper: https://arxiv.org/abs/2304.03279

About the Podcast

The goal of this podcast is to create a place where people discuss their inside views about existential risk from AI.