A Comparative Analysis of Transformer and LSTM Architectures for Text Summarization: A Case Study on News and Scientific Article Corpora

Authors

  • Gaurav Tamrakar, Assistant Professor, Department of Mechanical, Kalinga University, Raipur, India

Keywords:

Text Summarization, Transformer Architecture, LSTM, Abstractive Summarization, Extractive Summarization, ROUGE Metrics, Deep Learning, NLP

Abstract

Text summarization is a core task in natural language processing (NLP): it condenses large volumes of unstructured text into concise, meaningful representations. With the rise of deep learning, Long Short-Term Memory (LSTM) networks and Transformer-based architectures have been widely adopted for both extractive and abstractive summarization. This study presents a comparative case analysis of the two architectures, focusing on their performance, scalability, and deployment feasibility across textual domains. Two datasets were selected to cover a range of linguistic diversity and structural complexity: CNN/DailyMail, representing news articles, and arXiv abstracts, representing scientific literature. The models were benchmarked with ROUGE-1, ROUGE-2, and ROUGE-L to assess content overlap and summary quality, and were also compared on computational cost, memory usage, and inference latency. Empirical results show that LSTM-based models generate grammatically coherent, human-readable summaries, particularly for shorter texts, but struggle with longer input sequences and scale poorly. Transformer-based models, in contrast, handle longer documents better, achieve higher ROUGE scores, and show significantly lower inference time, which the study attributes to their parallelizable architecture. These advantages come at a cost: Transformers demand more GPU memory and computational power, which can be a constraint in real-time or low-resource applications. Qualitative assessment further indicates that Transformers occasionally generate content that is fluent but not entirely faithful to the source material, which highlights the need for further work on factual consistency.
Beyond benchmarking the relative strengths and weaknesses of the two architectures, this case study examines practical deployment considerations, offering guidance for researchers and developers building efficient, scalable, and accurate text summarization systems. Future work may explore hybrid models that integrate the interpretability of LSTMs with the efficiency of Transformers.
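
For readers unfamiliar with the ROUGE metrics named in the abstract, the following is a minimal Python sketch of how ROUGE-N and ROUGE-L F1 scores can be computed for a single candidate and reference summary. It is an illustration under simplifying assumptions, not the evaluation code used in the study: tokenization is lowercased whitespace splitting, there is no stemming or stopword handling, ROUGE-L is computed over the whole summary rather than sentence by sentence, and the function names (`rouge_n`, `rouge_l`, `lcs_length`) are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) that occur in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision (overlap/candidate) and recall (overlap/reference)."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1: clipped n-gram overlap between candidate and reference."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipping).
    overlap = sum((cand & ref).values())
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, by row-wise dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence of tokens."""
    c = candidate.lower().split()
    r = reference.lower().split()
    return f1(lcs_length(c, r), len(c), len(r))


if __name__ == "__main__":
    candidate = "the cat sat on the mat"
    reference = "the cat lay on the mat"
    print("ROUGE-1:", round(rouge_n(candidate, reference, 1), 3))
    print("ROUGE-2:", round(rouge_n(candidate, reference, 2), 3))
    print("ROUGE-L:", round(rouge_l(candidate, reference), 3))
```

In this toy pair, five of six unigrams and three of five bigrams match, and the longest common subsequence has five tokens. ROUGE-2 is therefore the strictest of the three scores, which is why it is usually reported alongside ROUGE-1 and ROUGE-L rather than in place of them.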

Published

2024-12-06

Section

Articles

How to Cite

Gaurav Tamrakar. (2024). A Comparative Analysis of Transformer and LSTM Architectures for Text Summarization: A Case Study on News and Scientific Article Corpora. SECITS Journal of Scalable Distributed Computing and Pipeline Automation, 1(1), 24-31. https://www.secitsociety.org/index.php/SJSDCPA/article/view/158