Accomplishments
Optimizing summary length and semantics: a reinforcement learning approach to abstractive summarization
- Abstract
In this paper, we introduce a novel length-controllable abstractive text summarization model that generates coherent, semantically rich summaries adhering to predefined length constraints. The framework is built on reinforcement learning, combining a length-adherence reward, a semantic similarity loss, and ROUGE-based rewards to optimize both semantic fidelity and overall summary quality. Length tokens and length-adherent decoding ensure that generated summaries conform to the specified length constraints across datasets. Experimental results on the CNN/DailyMail, XSum, and PubMed datasets show that the model performs competitively against strong baselines such as PEGASUS, BART, T5, and BERTSUM on ROUGE and semantic similarity, while achieving better length conformity. An ablation study further confirms the effectiveness of the proposed model, demonstrating improved structural control and stronger semantic alignment with the input text. The framework effectively balances abstraction, informativeness, and factual consistency under length constraints, contributing to controlled text generation and suggesting directions for future work in multi-document and extended-context summarization. Because sequence-level reinforcement learning and long-context transformer computations are computationally expensive, the study relies on GPU-based parallel computation.
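The composite reward described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the weights `alpha`/`beta`/`gamma` are hypothetical, the ROUGE term is approximated by unigram F1, and the semantic similarity term uses a bag-of-words cosine as a toy stand-in for an embedding-based score.

```python
# Minimal sketch of a composite RL reward for length-controlled
# summarization: ROUGE-style overlap + semantic similarity + length adherence.
# All weights and proxies are illustrative assumptions, not values from the paper.
import math
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 -- a simple stand-in for the ROUGE reward."""
    c, r = Counter(candidate.split()), Counter(reference.split())
    overlap = sum((c & r).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(c.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)

def semantic_similarity(candidate: str, reference: str) -> float:
    """Cosine similarity over bag-of-words counts -- a toy proxy for an
    embedding-based semantic similarity score."""
    c, r = Counter(candidate.split()), Counter(reference.split())
    dot = sum(c[w] * r[w] for w in c)
    norm = (math.sqrt(sum(v * v for v in c.values()))
            * math.sqrt(sum(v * v for v in r.values())))
    return dot / norm if norm else 0.0

def length_adherence(candidate: str, target_len: int) -> float:
    """Reward in (0, 1] that decays as the summary's token count
    deviates from the requested length."""
    n = len(candidate.split())
    return math.exp(-abs(n - target_len) / max(target_len, 1))

def composite_reward(candidate: str, reference: str, target_len: int,
                     alpha: float = 0.5, beta: float = 0.3,
                     gamma: float = 0.2) -> float:
    # alpha/beta/gamma are hypothetical mixing weights summing to 1,
    # so the composite reward stays in [0, 1].
    return (alpha * rouge1_f1(candidate, reference)
            + beta * semantic_similarity(candidate, reference)
            + gamma * length_adherence(candidate, target_len))
```

In a REINFORCE-style training loop, this scalar would weight the log-likelihood of each sampled summary, pushing the policy toward outputs that score well on all three terms simultaneously.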