Towards Better Disentanglement in Non-Autoregressive Zero-Shot Expressive Voice Conversion

Source: arXiv

Seymanur Akti, Tuan Nam Nguyen, Alexander Waibel

cs.AI | Jun 4, 2025

One-line Summary

The study improves expressive voice conversion by enhancing style transfer and reducing source timbre leakage using a non-autoregressive framework with a conditional variational autoencoder.

Plain-language Overview

This research focuses on improving technology that changes the voice in a recording so that it sounds like another speaker while also adopting that speaker's emotional tone. The authors developed a method that better separates the original voice's characteristics from the desired new voice and style. They achieved this by representing the content and style of speech separately, which makes the voice conversion more accurate and expressive. The results show that their method transfers emotions and speaker identity better than previous approaches.

Technical Details