Haozhen Zhang, Haodong Yue, Tao Feng, Quanyu Long, Jianzhu Bao, Bowen Jin, Weizhi Zhang, Xiao Li, Jiaxuan You, Chengwei Qin, Wenya Wang
BudgetMem is a runtime memory framework for LLMs that optimizes performance-cost trade-offs using query-aware budget-tier routing across memory modules.
BudgetMem is an approach for improving how large language model (LLM) agents use memory when the information they need exceeds a single context window. Traditional methods prepare memory in advance, without regard to the specific query that will later be asked. BudgetMem instead manages memory at runtime and adapts it to the query at hand. Memory processing is organized into modules, each available at several budget tiers, so that performance can be traded against cost. A neural network is trained to route each query through these modules and tiers, which lets BudgetMem improve accuracy while keeping cost under control.
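To make the idea of query-aware budget-tier routing concrete, here is a minimal sketch in Python. It is not the authors' implementation: the tier names, the cost and quality numbers, the module names, and the hand-written `query_difficulty` heuristic are all assumptions for illustration, and the heuristic stands in for the trained neural router described above.

```python
from dataclasses import dataclass

# Hypothetical tiers with made-up relative costs and quality levels.
TIER_COST = {"low": 1.0, "mid": 3.0, "high": 9.0}
TIER_QUALITY = {"low": 1.0, "mid": 2.0, "high": 3.0}
GAIN_SCALE = 10.0  # assumed scale that turns quality into expected benefit


@dataclass
class Route:
    module: str
    tier: str
    cost: float


def query_difficulty(query: str) -> float:
    """Toy stand-in for a learned router input: longer queries score higher, in [0, 1]."""
    return min(len(query.split()) / 20.0, 1.0)


def route(query: str, modules: list[str], cost_weight: float) -> list[Route]:
    """Pick one tier per module by maximizing estimated benefit minus weighted cost."""
    difficulty = query_difficulty(query)

    def score(tier: str) -> float:
        benefit = difficulty * GAIN_SCALE * TIER_QUALITY[tier]
        return benefit - cost_weight * TIER_COST[tier]

    plan = []
    for module in modules:
        # In this toy version every module sees the same score;
        # a learned router could decide per module.
        best = max(TIER_COST, key=score)
        plan.append(Route(module, best, TIER_COST[best]))
    return plan


def total_cost(plan: list[Route]) -> float:
    return sum(r.cost for r in plan)


if __name__ == "__main__":
    modules = ["filter", "summarize"]  # hypothetical module names
    for q in ["Who is Bob?", " ".join(["word"] * 25)]:
        plan = route(q, modules, cost_weight=1.0)
        print([(r.module, r.tier) for r in plan], total_cost(plan))
```

Raising `cost_weight` pushes the router toward cheaper tiers, and lowering it favors quality. That single knob is one simple way to expose a performance-cost trade-off, though the source does not say how BudgetMem itself parameterizes it.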