Article type: Research article
Authors
1 Faculty of Computer Engineering, Department of Software Engineering, Yazd University
2 M.Sc. in Computer Engineering, Yazd University, Yazd, Iran
Abstract
News and advertisements play an important role in the growth and development of today's society. The main words of an advertisement convey its general meaning, but selecting these keywords manually takes time and requires specialized knowledge of the subject of the text. Ideakav is a system that collects Telegram messages and advertisements, and it required a way to extract keywords from the advertisements published on Telegram. The quality of the extracted keywords plays a significant role in improving SEO and advertising statistics. Embedding algorithms can capture colloquial language and the semantic structure of a text, which makes them useful for identifying keywords in Telegram advertisements, since these are often written in a colloquial style. In this research, a word embedding model was built using data from the Ideakav system. The novelty of the approach lies in combining word embedding with word frequency and word position. The embedding model is trained on bigrams (two-word units), because most keywords consist of two or more words. For evaluation, the proposed model (IK) was compared with statistical and graph-based methods, and the results show that the bigram IK model extracts keywords better than the other methods.
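The abstract names three signals (bigram embedding, frequency, and position) but does not state how they are combined. The following is only a minimal sketch of one plausible combination, not the authors' IK model: the weighted sum, the default weights, the normalizations, the centroid-based similarity, and every name in it (`score_bigrams`, `toy_embeddings`, and so on) are assumptions made for illustration, and the embeddings are hand-written toy vectors rather than a trained model.

```python
import math
from collections import Counter


def bigrams(tokens):
    """Return consecutive two-word candidates from a token list."""
    return [(tokens[i], tokens[i + 1]) for i in range(len(tokens) - 1)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def score_bigrams(tokens, embeddings, weights=(0.5, 0.3, 0.2)):
    """Rank bigrams by an assumed weighted sum of three signals.

    embeddings: dict mapping a bigram tuple to its vector (hypothetical
    stand-in for a bigram embedding model).
    weights: (semantic, frequency, position); assumed values, not from the paper.
    """
    candidates = bigrams(tokens)
    if not candidates:
        return []

    counts = Counter(candidates)
    max_count = max(counts.values())

    # Index of the first occurrence of each bigram in the text.
    first_pos = {}
    for i, bg in enumerate(candidates):
        first_pos.setdefault(bg, i)

    # Assumed document vector: mean of the vectors of all known bigrams.
    known = [embeddings[bg] for bg in candidates if bg in embeddings]
    doc_vec = [sum(col) / len(known) for col in zip(*known)] if known else None

    w_sem, w_freq, w_pos = weights
    scored = []
    for bg, count in counts.items():
        # Semantic signal: similarity to the document vector (0 if unknown).
        sem = cosine(embeddings[bg], doc_vec) if doc_vec and bg in embeddings else 0.0
        # Frequency signal: count normalized by the most frequent bigram.
        freq = count / max_count
        # Position signal: earlier first occurrence scores higher.
        pos = 1.0 - first_pos[bg] / len(candidates)
        scored.append((bg, w_sem * sem + w_freq * freq + w_pos * pos))

    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy advertisement and hand-written two-dimensional vectors (illustrative only).
toy_tokens = ["sell", "used", "car", "cheap", "used", "car", "call", "now"]
toy_embeddings = {
    ("used", "car"): [1.0, 0.0],
    ("sell", "used"): [0.8, 0.2],
    ("call", "now"): [0.0, 1.0],
}

if __name__ == "__main__":
    for bigram, score in score_bigrams(toy_tokens, toy_embeddings):
        print(" ".join(bigram), round(score, 3))
```

In this sketch a bigram that is repeated, appears early, and lies close to the average meaning of the text ranks highest. A real system would replace the toy dictionary with vectors from a bigram embedding model trained on the collected advertisements.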