Automated text summarization of Sinhala online articles

Akmal Jahan, M. A. C.; Wijesekara, K. K. C.

Please use this identifier to cite or link to this item: http://ir.lib.seu.ac.lk/handle/123456789/6773

Full metadata record

DC Field	Value	Language
dc.contributor.author	Akmal Jahan, M. A. C.	-
dc.contributor.author	Wijesekara, K. K. C.	-
dc.date.accessioned	2023-08-16T12:37:58Z	-
dc.date.available	2023-08-16T12:37:58Z	-
dc.date.issued	2023-06	-
dc.identifier.citation	Journal of Science, Faculty of Applied Sciences, South Eastern University of Sri Lanka, Vol. 4, (No.1), June 2023, pp. 1-14.	en_US
dc.identifier.issn	2738-2184	-
dc.identifier.uri	https://www.seu.ac.lk/jsc/	-
dc.identifier.uri	http://ir.lib.seu.ac.lk/handle/123456789/6773	-
dc.description.abstract	Information retrieval is one of the major tasks in natural language processing (NLP) applications. In a digitalized world, retrieving information from online platforms is increasingly common, and an abundance of information on any given subject is available online. Amid the hustle and bustle of daily life, readers need to judge within a very short time whether a piece of information is relevant to their needs. Automated text summarization therefore plays a key role in natural language processing applications. Many studies have explored summarization for languages such as English, Bengali, Hausa, Chinese, and Hindi. However, work on a local language like Sinhala is still at an early stage. Sri Lanka is also a diverse country, with diversity in both community and language, so some people are less fluent in Sinhala because their mother tongue is another local language such as Tamil. Social media platforms such as Facebook provide translation of content into a different language, but other online platforms do not offer such translation. In such a scenario, a short summary of those articles would be an advantage for readers, who could easily understand the main idea of the content. Therefore, this work aims to build an online platform that can provide a good summary of Sinhala-language online articles. This research investigates extractive text summarization for Sinhala online articles using some state-of-the-art algorithms in NLP applications in order to select the most suitable method. This work comparatively analyses the performance of the TF-IDF (Term Frequency-Inverse Document Frequency) and Text-Rank algorithms for the Sinhala language. The performance of the algorithms is evaluated against human-generated summaries from online sources using ROUGE (Recall-Oriented Understudy for Gisting Evaluation), where higher ROUGE scores (which measure the rate of n-gram overlap between the reference summary and the automated summary) indicate a more accurate automated summary of the article. From the results, the TF-IDF algorithm performs comparatively better for summarizing Sinhala online articles of medium content size.	en_US
dc.language.iso	en_US	en_US
dc.publisher	Faculty of Applied Sciences, South Eastern University of Sri Lanka, Sammanthurai.	en_US
dc.subject	Text Summarization	en_US
dc.subject	Text-Rank	en_US
dc.subject	TF-IDF	en_US
dc.subject	Sinhala Article	en_US
dc.title	Automated text summarization of Sinhala online articles	en_US
dc.type	Article	en_US
Appears in Collections:	Volume 04 No.1
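
The abstract names TF-IDF sentence scoring and ROUGE evaluation without showing either, so the following is a minimal sketch of both ideas, not the authors' implementation. It assumes each sentence is treated as its own "document" for the IDF term, that sentences end in '.', '?' or '!', and that tokens are separated by whitespace (splitting on whitespace rather than on `\w+` keeps Sinhala vowel signs attached to their words). The function names `split_sentences`, `tokenize`, `tfidf_sentence_scores`, `summarize` and `rouge_n_recall` are hypothetical, and no stemming or stop-word removal is applied.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Assumption: a sentence ends with '.', '?' or '!' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.?!])\s+", text) if s.strip()]


def tokenize(sentence):
    # Assumption: whitespace tokenization after replacing basic punctuation.
    # No Sinhala-specific stemming or stop-word filtering is attempted here.
    return re.sub(r"[.?!,;:]", " ", sentence).lower().split()


def tfidf_sentence_scores(sentences):
    # Each sentence is treated as one document when computing IDF.
    docs = [tokenize(s) for s in sentences]
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        if not doc:
            scores.append(0.0)
            continue
        term_counts = Counter(doc)
        # Sentence score = sum over its terms of tf * idf,
        # with tf = count / sentence length and idf = log(N / df).
        score = sum(
            (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        )
        scores.append(score)
    return scores


def summarize(text, k=2):
    # Keep the k highest-scoring sentences, returned in original order.
    sentences = split_sentences(text)
    scores = tfidf_sentence_scores(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]


def rouge_n_recall(reference, candidate, n=1):
    # Fraction of the reference summary's n-grams that also occur in the
    # candidate summary, with clipped counts.
    def ngrams(text):
        tokens = tokenize(text)
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref, cand = ngrams(reference), ngrams(candidate)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total
```

In use, `summarize(article_text, k=3)` would return three sentences from the article, and `rouge_n_recall(human_summary, " ".join(selected))` would score them against a human-written reference. A Text-Rank variant would replace the scoring step with a graph ranking over sentence similarities. That variant is not shown here.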

Files in This Item:

File	Description	Size	Format
Automated text summarization of Sinhala 1-15.pdf		805.04 kB	Adobe PDF
