PaperPulse - AI/ML Summarization Platform

arXiv, Jun 24, 2025

Reconsidering Explicit Longitudinal Mammography Alignment for Enhanced Breast Cancer Risk Prediction

Solveig Thrun, Stine Hansen et al.

TLDR: This study shows that image-level alignment of mammograms improves breast cancer risk prediction over representation-level alignment, optimizing both alignment quality and predictive accuracy.

arXiv, Jun 24, 2025

Self-Paced Collaborative and Adversarial Network for Unsupervised Domain Adaptation

Weichen Zhang, Dong Xu et al.

TLDR: The paper introduces the Collaborative and Adversarial Network (CAN) for unsupervised domain adaptation, achieving state-of-the-art results by combining domain-collaborative and adversarial learning strategies.

arXiv, Jun 24, 2025

Bind-Your-Avatar: Multi-Talking-Character Video Generation with Dynamic 3D-mask-based Embedding Router

Yubo Huang, Weiqiang Wang et al.

TLDR: Bind-Your-Avatar introduces a novel framework for generating videos with multiple talking characters in the same scene, using a dynamic 3D-mask embedding router to control audio-to-character correspondence and a new dataset for training and benchmarking.

arXiv, Jun 24, 2025

Sampling Matters in Explanations: Towards Trustworthy Attribution Analysis Building Block in Visual Models through Maximizing Explanation Certainty

Róisín Luo, James McDermott et al.

TLDR: This paper presents a semi-optimal sampling approach for image attribution analysis that improves explanation certainty by aligning sample distributions with natural image distributions, outperforming state-of-the-art methods.

arXiv, Jun 24, 2025

Segment Any 3D-Part in a Scene from a Sentence

Hongyu Wu, Pengwan Yang et al.

TLDR: The paper introduces a new dataset and framework for segmenting any 3D part of a scene using natural language descriptions, overcoming data and annotation challenges and demonstrating superior performance in open-vocabulary 3D scene understanding.

arXiv, Jun 24, 2025

Explicit Residual-Based Scalable Image Coding for Humans and Machines

Yui Tatsumi, Ziyue Zeng et al.

TLDR: The paper introduces two methods for scalable image compression, enhancing efficiency and flexibility for both human and machine vision, achieving significant performance improvements over previous models.

arXiv, Jun 24, 2025

Adaptive Domain Modeling with Language Models: A Multi-Agent Approach to Task Planning

Harisankar Babu, Philipp Schillinger et al.

TLDR: TAPAS is a multi-agent framework that combines language models with symbolic planning to dynamically solve complex tasks without manual environment models.

arXiv, Jun 24, 2025

CronusVLA: Transferring Latent Motion Across Time for Multi-Frame Prediction in Manipulation

Hao Li, Shuai Yang et al.

TLDR: CronusVLA enhances vision-language-action models by efficiently incorporating multi-frame motion data, achieving state-of-the-art performance in manipulation tasks.

arXiv, Jun 24, 2025

Assessing Risk of Stealing Proprietary Models for Medical Imaging Tasks

Ankita Raj, Harsh Swaika et al.

TLDR: The study shows that proprietary medical imaging models are vulnerable to model stealing attacks, even with limited resources, and introduces a new method called QueryWise to enhance these attacks.

arXiv, Jun 24, 2025

Angio-Diff: Learning a Self-Supervised Adversarial Diffusion Model for Angiographic Geometry Generation

Zhifeng Wang, Renjiao Yi et al.

TLDR: The Angio-Diff model uses a self-supervised diffusion approach to transform non-angiographic X-rays into high-quality angiographic images, addressing data shortages and improving vascular image synthesis accuracy.

arXiv, Jun 24, 2025

Trajectory Prediction in Dynamic Object Tracking: A Critical Study

Zhongping Dong, Liming Chen et al.

TLDR: This study critically analyzes current methodologies in dynamic object tracking and trajectory prediction, highlighting their applications, challenges, and future research directions.

arXiv, Jun 24, 2025

Online camera-pose-free stereo endoscopic tissue deformation recovery with tissue-invariant vision-biomechanics consistency

Jiahe Chen, Naoki Tomii et al.

TLDR: The study presents a method for recovering tissue deformation from stereo endoscopic images without needing to estimate camera pose, achieving high accuracy even in challenging conditions like occlusion.

arXiv, Jun 24, 2025

Progressive Modality Cooperation for Multi-Modality Domain Adaptation

Weichen Zhang, Dong Xu et al.

TLDR: The paper introduces Progressive Modality Cooperation (PMC), a framework for multi-modality domain adaptation that effectively transfers knowledge across domains by leveraging multiple modalities, even when some are missing in the target domain.

arXiv, Jun 24, 2025

Airway Skill Assessment with Spatiotemporal Attention Mechanisms Using Human Gaze

Jean-Paul Ainam, Rahul et al.

TLDR: This paper introduces a machine learning approach using human gaze data to objectively assess airway management skills, improving accuracy and efficiency over traditional methods.

arXiv, Jun 24, 2025

On the necessity of adaptive regularisation: Optimal anytime online learning on $\boldsymbol{\ell_p}$-balls

Emmeran Johnson, David Martínez-Rubio et al.

TLDR: Adaptive regularization is necessary for optimal online learning on high-dimensional $\boldsymbol{\ell_p}$-balls, as fixed regularization cannot achieve optimality across all dimension regimes.

arXiv, Jun 24, 2025

Noise Consistency Training: A Native Approach for One-Step Generator in Learning Additional Controls

Yihong Luo, Shuchen Xue et al.

TLDR: Noise Consistency Training (NCT) adds control signals to pre-trained one-step generators without retraining them, efficiently achieving state-of-the-art results in controllable content generation.

arXiv, Jun 24, 2025

Duality and Policy Evaluation in Distributionally Robust Bayesian Diffusion Control

Jose Blanchet, Jiayi Cheng et al.

TLDR: The paper introduces a distributionally robust Bayesian control framework for diffusion processes, addressing model misspecification by using adversarial priors within a divergence neighborhood, and provides an efficient algorithm for optimal strategy computation.

arXiv, Jun 24, 2025

A large deviation view of \emph{stationarized} fully lifted blirp interpolation

Mihailo Stojnic

TLDR: This paper extends the large deviation framework for bilinearly indexed random processes to include atypical features and local entropies, enhancing the applicability of previous stationarized interpolation methods.

arXiv, Jun 24, 2025

Systematic Review of Pituitary Gland and Pituitary Adenoma Automatic Segmentation Techniques in Magnetic Resonance Imaging

Mubaraq Yakubu, Navodini Wijethilake et al.

TLDR: This review evaluates automatic segmentation methods for pituitary adenomas and glands in MRI, highlighting the promise of U-Net-based models but noting the need for further improvements and comprehensive reporting of metrics.

arXiv, Jun 24, 2025

Da Yu: Towards USV-Based Image Captioning for Waterway Surveillance and Scene Understanding

Runwei Guan, Ningwei Ouyang et al.

TLDR: The paper introduces WaterCaption, a dataset for waterway image captioning, and Da Yu, a model that excels in generating detailed captions for waterway environments.
