OpenAI's Model (behavior) Spec, RLHF transparency, and personalization questions

Now we will have some grounding for when weird ChatGPT behaviors are intended or side-effects -- shrinking the Overton window of RLHF bugs.This is AI generated audio with Python and 11Labs.Source code: https://github.com/natolambert/interconnects-toolsOriginal post: https://www.interconnects.ai/p/openai-rlhf-model-spec00:00 OpenAI's Model (behavior) Spec, RLHF transparency, and personalization questions02:56 Reviewing the Model Spec08:26 Where RLHF can fail OpenAI12:23 From Model Spec's to personalizationFig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-spec/img_027.pngFig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-spec/img_029.pngFig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-spec/img_033.pngFig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-spec/img_034.pngFig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-spec/img_041.webpFig 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-spec/img_046.webp This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

Om Podcasten

Audio essays about the latest developments in AI and interviews with leading scientists in the field. Breaking the hype, understanding what's under the hood, and telling stories. www.interconnects.ai