COLLABLLM: Turning LLMs From Passive Responders Into Active Collaborators

The source introduces COLLABLLM, a training approach that transforms Large Language Models (LLMs) from passive responders into active collaborators in multi-turn conversations. Current LLMs often fall short on complex, open-ended tasks because their training optimizes single-turn responses, leading to user frustration and wasted effort when an initial request is imprecise. COLLABLLM addresses this with "Multiturn-aware Rewards" (MR): forward sampling through a user simulator estimates the long-term impact of each candidate response on the rest of the conversation, so the model is rewarded for responses that make the whole interaction more effective and efficient. A large user study with 201 judges found that COLLABLLM significantly improved user satisfaction and reduced the time users spent on tasks, demonstrating its generalizability and practical benefit in real-world human-LLM collaboration. The paper also reports detailed experimental setups, ablation studies, and safety evaluations, supporting the method's robust performance and safe application.
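The Multiturn-aware Reward idea can be illustrated with a minimal sketch: score a candidate response not by itself, but by averaging a task reward over simulated continuations of the conversation. This is an assumption-laden illustration, not the paper's implementation; `user_simulator`, `assistant_policy`, and `task_reward` are hypothetical placeholders for the components the paper describes.

```python
# Hypothetical sketch of a Multiturn-aware Reward (MR) estimate.
# A candidate response is scored by rolling the conversation forward
# with a user simulator and an assistant policy, then averaging a
# task-level reward over the sampled futures. All names below are
# placeholders, not the paper's actual API.

def multiturn_aware_reward(history, response, user_simulator,
                           assistant_policy, task_reward,
                           num_rollouts=3, horizon=4):
    """Estimate the long-term value of `response` via forward sampling."""
    total = 0.0
    for _ in range(num_rollouts):
        # Start each rollout from the conversation so far plus the
        # candidate response being evaluated.
        convo = history + [("assistant", response)]
        for _ in range(horizon):
            convo.append(("user", user_simulator(convo)))
            convo.append(("assistant", assistant_policy(convo)))
        # Score the whole simulated conversation, e.g. task success
        # minus some measure of user effort.
        total += task_reward(convo)
    return total / num_rollouts
```

In this framing, a response that asks a useful clarifying question can earn a higher MR than a longer immediate answer, because the simulated futures it leads to score better on the task.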

About the Podcast

Cut through the noise. We curate and break down the most important AI papers so you don’t have to.