RLHF: A thin line between useful and lobotomized

Many, many signs of life for preference fine-tuning beyond spoofing chat evaluation tools.This is AI generated audio with Python and 11Labs.Source code: https://github.com/natolambert/interconnects-toolsOriginal post: https://www.interconnects.ai/p/how-rlhf-works-200:00 How RLHF works, part 2: A thin line between useful and lobotomized04:27 The chattiness paradox08:09 The mechanism for making models chattier10:42 Next steps for RLHF researchFig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/rlhf/img_012.webpFig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/rlhf/img_018.pngFig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/rlhf/img_025.png Get full access to Interconnects at www.interconnects.ai/subscribe

Om Podcasten

Audio essays about the latest developments in AI and interviews with leading scientists in the field. Breaking the hype, understanding what's under the hood, and telling stories. www.interconnects.ai