CodeTool: Process Supervision for Enhanced LLM Tool Invocation

We discuss CodeTool, a novel framework that improves how large language models invoke external tools by generating and supervising code execution step by step. The approach combines two process rewards: an "On-the-spot Reward" that checks the immediate correctness of each executed code step, and a "Latent Reward," estimated by a trained model, that steers generation toward promising problem-solving paths. Because code is verifiable, CodeTool obtains reliable feedback at every stage, overcoming the limitations of text- or JSON-based tool invocation on complex tasks. Experiments on benchmark datasets show that CodeTool outperforms existing methods, yielding more accurate and efficient tool use.
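To make the step-level supervision concrete, here is a minimal Python sketch of how the two rewards might be combined when choosing the next code step. This is an illustration under our own assumptions, not the paper's actual implementation: the function names, the `alpha` weighting, and the greedy candidate selection are all hypothetical.

```python
import subprocess
import sys
from typing import Callable, List


def on_the_spot_reward(code: str) -> float:
    """Run a candidate code step and reward immediate executability:
    1.0 if the snippet runs without error, else 0.0. A simplified
    stand-in for the paper's correctness signal."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code], capture_output=True, timeout=10
        )
    except subprocess.TimeoutExpired:
        return 0.0
    return 1.0 if result.returncode == 0 else 0.0


def select_next_step(
    candidates: List[str],
    latent_reward_model: Callable[[str], float],
    alpha: float = 0.5,  # hypothetical weight between the two rewards
) -> str:
    """Score each candidate step by combining the on-the-spot reward
    with a learned latent reward, then greedily pick the best one."""
    def score(code: str) -> float:
        return alpha * on_the_spot_reward(code) + (1 - alpha) * latent_reward_model(code)

    return max(candidates, key=score)


# Usage: a stub latent-reward model standing in for the trained estimator.
best = select_next_step(
    ["print(sum(range(10)))", "prnt(oops)"],
    latent_reward_model=lambda code: 1.0 / (1 + len(code)),
)
print(best)  # -> "print(sum(range(10)))"
```

In this sketch the on-the-spot reward prunes steps that fail outright, while the latent reward (here a trivial stub) is what a trained model would supply to rank the surviving candidates by long-term promise.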
