Article type: Research article
Authors
1 Department of Engineering, University of Tabriz, Tabriz, Iran
2 Professor, Department of Computer Engineering, Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz, Iran
Abstract
Keywords
Article title [English]
Authors [English]
Extractive text summarization is an essential technique in natural language processing that produces a compact version of a text by extracting its most important sentences. Because shortening and summarizing a document by hand is time-consuming and tiring, an automatic system for producing such short versions is needed. In extractive summarization, sentences that contain useful and relevant information are selected for the final summary. Different algorithms exist for identifying these sentences, and the performance of each, as well as the summary it produces, varies with the type and domain of the text and the required summary size. This article presents a method called Sa-TRB, which is derived from two algorithms, TextRank and BERT. In addition to using these two methods, it also uses the sentences that the summaries produced by other algorithms have in common, in order to select the final summary sentences with high accuracy. The most important criterion for evaluating such algorithms is the quality of the final summary: the more closely an algorithm's summary resembles a human-written summary, the higher its quality. ROUGE metrics are used to measure this similarity. Finally, experiments on the CNN/DailyMail dataset with different summary sizes show that, as the required summary size increases, the proposed method achieves higher precision and F-score, and therefore higher-quality final summaries, despite a decrease in recall. In the last two tests, the score of the proposed method reached 24.68% and 23.34%, which is almost one percent better than the best of the other tested methods.
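As a rough illustration of how a ROUGE-style score relates precision, recall and F-score, the following sketch computes a simplified ROUGE-1 (unigram overlap) between a candidate summary and a human reference. It is not the evaluation code used in the article: the function name `rouge_1`, the lowercase whitespace tokenization and the example sentences are assumptions made for illustration, and standard ROUGE implementations add steps such as stemming that are omitted here.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Simplified ROUGE-1: unigram overlap between candidate and reference.

    Assumption: tokens are lowercase, whitespace-separated words.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each word counts at most as often as it appears in both.
    overlap = sum((cand_counts & ref_counts).values())

    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    reference = "the cat sat on the mat"      # hypothetical human summary
    candidate = "the cat lay on the mat"      # hypothetical system summary
    print(rouge_1(candidate, reference))
```

Precision divides the overlap by the candidate length and recall divides it by the reference length, so the two can move in opposite directions when the summary size changes; the F-score balances them.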
Keywords [English]