Peter Holderrieth, Douglas Chen, Luca Eyring, Ishin Shah, Giri Anantharaman, Yutong He, Zeynep Akata, Tommi Jaakkola, Nicholas Matthew Boffi, Max Simchowitz
Diamond Maps are a new generative model that aligns efficiently with user preferences by enabling fast adaptation to reward functions at inference time.
Generative models excel at producing high-quality outputs, but adapting a trained model to specific user preferences or constraints, a process called reward alignment, is typically difficult and inefficient. The authors propose Diamond Maps, a model designed to handle reward alignment more effectively: it supports fast, accurate adaptation to different user-defined objectives during generation, making it more versatile and powerful than existing methods.
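The summary above does not detail how Diamond Maps perform reward alignment, so the following is not the paper's method. As a generic illustration of the underlying idea of inference-time reward alignment (steering a fixed generator toward a user-defined reward without retraining), here is a minimal best-of-n sketch; all function names and the toy generator and reward are hypothetical stand-ins:

```python
import random


def generate_candidates(n, seed=0):
    # Stand-in for a pretrained generative model: returns n candidate
    # outputs. Here each candidate is just a random float; in practice
    # it would be an image, a molecule, a text sample, etc.
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def reward(x):
    # A user-defined reward; this toy version prefers values near 0.5.
    return -abs(x - 0.5)


def best_of_n(n=16, seed=0):
    # The simplest form of inference-time reward alignment: sample n
    # candidates from the frozen generator and keep the one the reward
    # scores highest. No model weights are updated.
    candidates = generate_candidates(n, seed)
    return max(candidates, key=reward)


print(best_of_n())
```

Best-of-n is crude (its cost grows with n for every query), which is exactly the inefficiency the paper says reward alignment usually suffers from; Diamond Maps aim to make this kind of adaptation fast and accurate instead.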