Ziliang Zhao, Bi Xue, Emma Lin, Mengjiao Zhou, Kaustubh Vartak, Shakhzod Ali-Zade, Carson Lu, Tao Li, Bin Kuang, Rui Jian, Bin Wen, Dennis van der Staay, Yixin Bao, Eddy Li, Chao Deng, Songbin Liu, Qifan Wang, Kai Ren
Multi-Probe Zero Collision Hash (MPZCH) prevents embedding collisions in large-scale recommendation systems, improving model freshness and recommendation quality.
In large-scale recommendation systems, embedding tables map a very large number of unique IDs to dense vectors that the model can learn from. Traditional methods often suffer from "collisions", where different IDs end up sharing the same vector, which reduces the system's effectiveness. MPZCH addresses this problem with a multi-probe hashing technique that virtually eliminates these collisions and keeps the model up to date. The approach improves recommendation quality without slowing down the system, and it is available in the TorchRec library.
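To make the collision problem concrete, the following is a minimal sketch, not the TorchRec implementation. It assumes that "multi-probe" means checking a bounded number of consecutive slots (linear probing) and claiming the first free one; the names `modulo_slot`, `MultiProbeTable`, and `max_probe` are hypothetical and chosen only for illustration.

```python
from typing import List, Optional


def modulo_slot(raw_id: int, table_size: int) -> int:
    """Plain modulo hashing: distinct IDs can land on the same row."""
    return raw_id % table_size


class MultiProbeTable:
    """Toy ID-to-row map that probes several slots before giving up.

    Illustrative only: the real MPZCH kernels and API may differ.
    """

    def __init__(self, table_size: int, max_probe: int) -> None:
        self.table_size = table_size
        self.max_probe = max_probe
        # slots[i] holds the raw ID that owns embedding row i, or None if free.
        self.slots: List[Optional[int]] = [None] * table_size

    def lookup_or_insert(self, raw_id: int) -> Optional[int]:
        start = raw_id % self.table_size
        for step in range(self.max_probe):
            slot = (start + step) % self.table_size
            owner = self.slots[slot]
            if owner == raw_id:
                return slot  # ID already owns this row
            if owner is None:
                self.slots[slot] = raw_id  # claim the free row
                return slot
        return None  # every probed row belongs to another ID


ids = [3, 11, 19, 4]
# Modulo hashing sends 3, 11, and 19 to the same row of an 8-row table.
print([modulo_slot(i, 8) for i in ids])
# With probing, each ID gets its own row as long as free rows are in reach.
table = MultiProbeTable(table_size=8, max_probe=4)
print([table.lookup_or_insert(i) for i in ids])
```

In this sketch an ID that finds no free row within `max_probe` steps gets `None`. How a production system handles that case (for example, by evicting an older ID) is a design choice the sketch deliberately leaves out.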