Structured Document Translation via Format Reinforcement Learning

Source: arXiv

Haiyue Song, Johannes Eschbach-Dymanus, Hour Kaing, Sumire Honda, Hideki Tanaka, Bianka Buschbeck, Masao Utiyama

cs.AI | Dec 4, 2025

One-line Summary

The paper introduces Format Reinforcement Learning (FormatRL) to improve structured document translation by optimizing structure-aware rewards, achieving better results on SAP documentation benchmarks.

Plain-language Overview

This research tackles the challenge of translating structured documents such as XML or HTML, which carry markup and nested formatting in addition to the sentences themselves. The authors propose Format Reinforcement Learning (FormatRL), which uses reinforcement learning with structure-aware rewards so that the translated document retains both the structure and the meaning of the original. By optimizing for translation quality and structural similarity together, the approach shows significant improvements over existing methods, especially on technical documentation.
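
To make the idea of a structure-aware reward concrete, here is a minimal Python sketch. It is not the paper's reward: the summary does not specify the actual metrics, so the tag-sequence comparison, the function names (`tag_sequence`, `structure_similarity`, `combined_reward`), and the `weight` parameter are all illustrative assumptions. The translation-quality score is taken as a given number in [0, 1] rather than computed.

```python
from difflib import SequenceMatcher
from typing import List
import xml.etree.ElementTree as ET


def tag_sequence(xml_text: str) -> List[str]:
    """Return the element tags of an XML string in pre-order."""
    root = ET.fromstring(xml_text)
    return [elem.tag for elem in root.iter()]


def structure_similarity(source_xml: str, translated_xml: str) -> float:
    """Hypothetical structure score in [0, 1].

    Compares the tag sequences of the two documents. Returns 0.0 if
    either document fails to parse (for example, an unclosed tag).
    """
    try:
        src_tags = tag_sequence(source_xml)
        hyp_tags = tag_sequence(translated_xml)
    except ET.ParseError:
        return 0.0
    return SequenceMatcher(None, src_tags, hyp_tags).ratio()


def combined_reward(quality: float, structure: float, weight: float = 0.5) -> float:
    """Assumed weighted mix of a quality score and a structure score."""
    return weight * quality + (1.0 - weight) * structure


source = "<p>Hello <b>world</b></p>"
good = "<p>Hallo <b>Welt</b></p>"      # markup preserved
broken = "<p>Hallo <b>Welt</p>"        # unclosed <b>, not well-formed

print(structure_similarity(source, good))
print(structure_similarity(source, broken))
```

In this sketch a translation that keeps the markup intact scores higher on structure than one that drops or breaks tags, so a reinforcement learning objective built on `combined_reward` would favor outputs that are both well translated and structurally faithful.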

Technical Details