A Comprehensive Survey on Text Summarization For Indian Languages: Opportunities, Challenges and Future Prospects
DOI:
https://doi.org/10.70135/seejph.vi.4957
Abstract
The growing volume of data available on the web, news websites, published articles in various fields of study, and electronic books has created a valuable resource for extracting and analyzing information. The main challenge for researchers has been accessing accurate and reliable data. This information must be summarized so that useful knowledge can be retrieved within a reasonable time. Text summarization is a crucial Natural Language Processing (NLP) task that aims to condense lengthy documents into shorter, coherent summaries while retaining the essential information. It is divided into extractive and abstractive summarization. An extractive summarizer selects the most important sentences or phrases from the original document. In contrast, an abstractive summarizer generates a summary by rephrasing the original text in new wording that is close to a human-written summary. With the increasing availability of textual data in multiple Indian languages, effective summarization techniques are required to facilitate content understanding, especially for non-English users. Indian languages such as Hindi, Tamil, Bengali, Gujarati, and Marathi have complex linguistic structures, which makes summarization challenging. Only a limited number of studies have applied extractive methods to Indian languages, and researchers are now moving toward abstractive summarization. This paper presents an overview of summarization techniques, datasets, and evaluation metrics for Indian languages. In addition, the survey presents deep learning-based text summarization and the adoption of conventional approaches to improve abstractive text summarization.
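To make the extractive idea in the abstract concrete, the following is a minimal sketch of a word-frequency extractive summarizer. It is an illustration only and is not taken from any of the surveyed systems: the function names (`split_sentences`, `tokenize`, `extractive_summary`) are hypothetical, frequency scoring is just one simple baseline, and the sentence splitter assumes that sentences end with `.`, `!`, `?`, or the danda (`।`) used in Hindi, Bengali, and related scripts. Tokens are split on whitespace rather than with a `\w+` pattern, since that pattern can break Indic words at combining vowel signs.

```python
import re
from collections import Counter


def split_sentences(text):
    # Assumption: a sentence ends with ., !, ?, or the danda (।),
    # followed by whitespace.
    parts = re.split(r"(?<=[.!?।])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    # Whitespace tokenization; punctuation is stripped from token edges.
    tokens = (tok.strip(".,!?;:।\"'()").lower() for tok in sentence.split())
    return [t for t in tokens if t]


def extractive_summary(text, num_sentences=2):
    sentences = split_sentences(text)
    # Document-level word frequencies.
    freq = Counter(tok for s in sentences for tok in tokenize(s))

    def score(sentence):
        # Average frequency of the sentence's words, so that long
        # sentences are not favoured merely for their length.
        toks = tokenize(sentence)
        return sum(freq[t] for t in toks) / len(toks) if toks else 0.0

    # Rank sentence indices by score, keep the top ones, then restore
    # the original document order.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in chosen)
```

The output consists only of sentences copied verbatim from the input, which is what distinguishes extractive from abstractive summarization. An abstractive system would instead generate new wording, typically with a trained sequence-to-sequence model.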
License

This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.