Enough Coin Flips Can Make LLMs Act Bayesian

This paper investigates whether Large Language Models (LLMs) use in-context learning (ICL) to perform reasoning consistent with a Bayesian framework. Using a simplified setting of biased coin flips and dice rolls, the authors analyze how LLMs update their predicted probabilities as in-context examples accumulate. They find that LLMs often start with miscalibrated priors (inherent biases) but, given sufficient evidence through ICL, update their estimates in a broadly Bayesian fashion. The study indicates that deviations from true Bayesian inference stem primarily from poor initial priors rather than from a flawed updating mechanism. Furthermore, the results suggest that attention magnitude has minimal impact on the Bayesian inference process in these models, and that instruction-tuned models may exhibit shorter temporal horizons in their updates.
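To make the reference point concrete, here is a minimal Python sketch of the textbook Bayesian updater the paper's framing implies: a Beta-Bernoulli conjugate update on a biased coin. The prior values and flip sequence below are illustrative assumptions, not the paper's experimental setup, and this is not the authors' code.

```python
def beta_posterior_mean(heads: int, tails: int,
                        alpha: float = 1.0, beta: float = 1.0) -> float:
    """Posterior mean of P(heads) under a Beta(alpha, beta) prior
    after observing the given counts of heads and tails."""
    return (alpha + heads) / (alpha + beta + heads + tails)

# A miscalibrated prior (hypothetical): Beta(8, 2) encodes a strong
# initial belief that the coin favors heads.
prior_alpha, prior_beta = 8.0, 2.0

# Evidence from in-context examples: a coin that actually lands
# heads about 30% of the time (illustrative data).
flips = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]  # 1 = heads, 0 = tails
heads = sum(flips)
tails = len(flips) - heads

print(f"Prior mean:     {prior_alpha / (prior_alpha + prior_beta):.3f}")
print(f"Posterior mean: {beta_posterior_mean(heads, tails, prior_alpha, prior_beta):.3f}")
# With enough flips, the posterior mean is pulled toward the empirical
# rate (~0.3) no matter how poorly calibrated the prior was -- mirroring
# the paper's finding that deviations come from bad priors, not from
# the update step itself.
```

Running this with longer flip sequences shows the prior's influence washing out, which is the behavior the authors test LLMs against as in-context evidence grows.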
