Junmo Cho, Suhan Kim, Sangjune An, Minsu Kim, Dong Bok Lee, Heejun Lee, Sung Ju Hwang, Hae Beom Lee
GFlowPO is a new framework that casts language model prompt optimization as probabilistic inference and pairs it with dynamic memory updates, outperforming prior methods across a range of language tasks.
Optimizing prompts for language models is challenging: the space of possible prompts is vast, and evaluating each candidate is expensive. GFlowPO is a new method that recasts prompt optimization as probabilistic inference, treating prompts as latent variables to be inferred. It proceeds in two steps: first, it fine-tunes a model to explore the prompt space efficiently, reusing past evaluations stored in memory; second, it dynamically updates the search strategy to concentrate on the most promising prompts. This approach outperforms existing methods on tasks such as text classification and question answering.