Linwei Zhai, Han Ding, Mingzhi Lin, Cui Zhao, Fei Wang, Ge Wang, Wang Zhi, Wei Xi
VP-VAE replaces the discrete codebook of Vector Quantized Variational Autoencoders with adaptive vector perturbation, improving training stability and sidestepping codebook collapse.
Vector Quantized Variational Autoencoders (VQ-VAEs) are widely used building blocks for generative models, but their reliance on a discrete codebook makes training unstable and prone to codebook collapse. The proposed VP-VAE removes the codebook entirely, replacing the quantization step with vector perturbation: controlled disturbances are injected into the model's latent space, so the decoder learns to be robust to small latent displacements and training proceeds without the usual failure modes. The authors also develop a simplified variant, FSP, which further improves performance on image and audio tasks.
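The summary above does not give the paper's exact perturbation rule, so the following is only a minimal sketch of the general idea, assuming a bounded Gaussian perturbation of the latent vectors in place of a codebook lookup; the function name `perturb_latents` and the `sigma`/`clip` hyperparameters are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def perturb_latents(z, sigma=0.1, clip=0.5, rng=rng):
    """Inject a bounded random perturbation into latent vectors.

    Stands in for a discrete codebook lookup: instead of snapping z to
    the nearest code, we add controlled noise so the decoder learns to
    tolerate small latent displacements. sigma and clip are illustrative
    hyperparameters, not values reported in the paper.
    """
    noise = rng.normal(scale=sigma, size=z.shape)
    noise = np.clip(noise, -clip, clip)  # bound the disturbance
    return z + noise

# Toy usage: a batch of 4 latent vectors of dimension 8.
z = rng.normal(size=(4, 8))
z_tilde = perturb_latents(z)
print(z_tilde.shape)                       # same shape as the input
print(np.max(np.abs(z_tilde - z)) <= 0.5)  # perturbation stays bounded
```

Because the perturbation is continuous and differentiable in `z`, gradients flow through it directly, which is one plausible reason a codebook-free scheme trains more stably than the straight-through estimator used in standard VQ-VAEs.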