Kasun Dewage, Marianna Pensky, Suranadi De Silva, Shankadeep Mondal
CRAFT is a parameter-efficient fine-tuning method that applies Tucker decomposition to pre-trained attention weights, achieving competitive performance with a minimal number of adaptation parameters.
The paper introduces CRAFT, a method for fine-tuning large language models more efficiently. CRAFT applies Tucker decomposition, a technique that factorizes a tensor into a small core tensor and a set of factor matrices, to a model's pre-trained attention weights. The resulting factors are then adjusted through small trainable matrices, so fine-tuning updates only a small fraction of the model's parameters. On a standard suite of language tasks, the method is shown to perform well while requiring fewer resources than traditional fine-tuning.
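To make the mechanism concrete, below is a minimal, self-contained sketch of the general recipe the summary describes: Tucker-decompose stacked attention weights, freeze the resulting core and factor matrices, and train only small delta matrices on the factors. This is not the authors' implementation; the HOSVD routine `tucker_hosvd`, the `TuckerAdaptedQKV` module, the (3, d, d) stacking of Q/K/V projections, the rank choices, and the zero-initialized deltas are all illustrative assumptions.

```python
import torch


def tucker_hosvd(tensor, ranks):
    """Approximate Tucker decomposition via truncated higher-order SVD.

    Returns a core tensor and one factor matrix per mode such that
    tensor ~= core x_0 U0 x_1 U1 x_2 U2 (mode-n products).
    """
    factors = []
    for mode, rank in enumerate(ranks):
        # Mode-n unfolding: flatten every mode except `mode` into columns.
        unfolding = tensor.movedim(mode, 0).reshape(tensor.shape[mode], -1)
        U, _, _ = torch.linalg.svd(unfolding, full_matrices=False)
        factors.append(U[:, :rank])
    # Project the tensor onto the factor bases to obtain the small core.
    core = tensor.clone()
    for mode, U in enumerate(factors):
        core = torch.tensordot(U.T, core.movedim(mode, 0), dims=1).movedim(0, mode)
    return core, factors


class TuckerAdaptedQKV(torch.nn.Module):
    """Frozen Tucker factors of stacked Q/K/V weights plus small trainable deltas.

    `qkv_weights` is a (3, d, d) tensor holding the pre-trained query, key,
    and value projection matrices. The Tucker pieces are frozen; only the
    zero-initialized delta matrices on each factor are trained.
    """

    def __init__(self, qkv_weights, ranks=(3, 32, 32)):
        super().__init__()
        core, factors = tucker_hosvd(qkv_weights, list(ranks))
        self.core = torch.nn.Parameter(core, requires_grad=False)
        self.factors = torch.nn.ParameterList(
            [torch.nn.Parameter(U, requires_grad=False) for U in factors]
        )
        # Trainable perturbations of the (tall, skinny) factor matrices;
        # zero init means training starts exactly at the pre-trained weights.
        self.deltas = torch.nn.ParameterList(
            [torch.nn.Parameter(torch.zeros_like(U)) for U in factors]
        )

    def weights(self):
        """Reconstruct adapted (3, d, d) Q/K/V weights from core and factors."""
        w = self.core
        for mode in range(3):
            U = self.factors[mode] + self.deltas[mode]
            w = torch.tensordot(U, w.movedim(mode, 0), dims=1).movedim(0, mode)
        return w


d = 256
qkv = torch.randn(3, d, d)  # stand-in for pre-trained Q/K/V weight matrices
layer = TuckerAdaptedQKV(qkv, ranks=(3, 32, 32))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"trainable adapter parameters: {trainable} vs full weights: {qkv.numel()}")
```

With d = 256 and ranks (3, 32, 32), the sketch trains roughly 16K parameters against about 197K in the full stacked weights, illustrating the kind of parameter saving this style of adaptation targets; zero-initializing the deltas ensures fine-tuning starts from the unmodified pre-trained weights.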