[1] H. Kwak, C. Lee, H. Park, and S. Moon, “What is Twitter, a social network or a news media?,” in Proceedings of the 19th International Conference on World Wide Web, 2010, pp. 591–600.
[2] H.-T. Liao, K. Fu, and S. A. Hale, “How much is said in a microblog? A multilingual inquiry based on Weibo and Twitter,” in Proceedings of the ACM Web Science Conference, 2015, pp. 1–9.
[3] T. Lin, W. Tian, Q. Mei, and H. Cheng, “The dual-sparse topic model: mining focused topics and focused terms in short text,” in Proceedings of the 23rd International Conference on World Wide Web, 2014, pp. 539–550.
[4] J. Qiang, P. Chen, W. Ding, T. Wang, F. Xie, and X. Wu, “Topic discovery from heterogeneous texts,” in 2016 IEEE 28th International Conference on Tools with Artificial Intelligence (ICTAI), 2016, pp. 196–203.
[5] T. Shi, K. Kang, J. Choo, and C. K. Reddy, “Short-text topic modeling via non-negative matrix factorization enriched with local word-context correlations,” in Proceedings of the 2018 World Wide Web Conference, 2018, pp. 1105–1114.
[6] T. Ramamoorthy, V. Kulothungan, and B. Mappillairaju, “Topic modeling and social network analysis approach to explore diabetes discourse on Twitter in India,” Front. Artif. Intell., vol. 7, p. 1329185, 2024.
[7] F. Zhang, W. Gao, Y. Fang, and B. Zhang, “Enhancing short text topic modeling with FastText embeddings,” in 2020 International Conference on Big Data, Artificial Intelligence and Internet of Things Engineering (ICBAIE), 2020, pp. 255–259.
[8] M. Asgari-Chenaghlu, M.-R. Feizi-Derakhshi, L. Farzinvash, M.-A. Balafar, and C. Motamed, “Topic detection and tracking techniques on Twitter: a systematic review,” Complexity, vol. 2021, no. 1, p. 8833084, 2021.
[9] M. R. Khadivi, S. Akbarpour, M.-R. Feizi-Derakhshi, and B. Anari, “A Human Word Association based model for topic detection in social networks,” arXiv preprint arXiv:2301.13066, 2023.
[10] P. Kherwa and P. Bansal, “Topic modeling: a comprehensive review,” EAI Endorsed Trans. Scalable Inf. Syst., vol. 7, no. 24, 2019.
[11] I. Vayansky and S. A. P. Kumar, “A review of topic modeling methods,” Inf. Syst., vol. 94, p. 101582, 2020.
[12] A. Abdelrazek, Y. Eid, E. Gawish, W. Medhat, and A. Hassan, “Topic modeling algorithms and applications: A survey,” Inf. Syst., vol. 112, p. 102131, 2023.
[13] R. Churchill and L. Singh, “The evolution of topic modeling,” ACM Comput. Surv., vol. 54, no. 10s, pp. 1–35, 2022.
[14] J. Boyd-Graber, Y. Hu, and D. Mimno, “Applications of topic models,” Found. Trends® Inf. Retr., vol. 11, no. 2–3, pp. 143–296, 2017.
[15] X. Wu, T. Nguyen, and A. T. Luu, “A survey on neural topic models: methods, applications, and challenges,” Artif. Intell. Rev., vol. 57, no. 2, p. 18, 2024.
[16] C. Jacobi, W. Van Atteveldt, and K. Welbers, “Quantitative analysis of large amounts of journalistic texts using topic modelling,” in Rethinking Research Methods in an Age of Digital Journalism, Routledge, 2018, pp. 89–106.
[17] A. T. Han, L. Laurian, and J. Dewald, “Plans versus political priorities: Lessons from municipal election candidates’ social media communications,” J. Am. Plan. Assoc., vol. 87, no. 2, pp. 211–227, 2021.
[18] N. N. Haghighi, X. C. Liu, R. Wei, W. Li, and H. Shao, “Using Twitter data for transit performance assessment: a framework for evaluating transit riders’ opinions about quality of service,” Public Transp., vol. 10, pp. 363–377, 2018.
[19] C. A. Bail et al., “Exposure to opposing views on social media can increase political polarization,” Proc. Natl. Acad. Sci., vol. 115, no. 37, pp. 9216–9221, 2018.
[20] H. Pousti, C. Urquhart, and H. Linger, “Researching the virtual: A framework for reflexivity in qualitative social media research,” Inf. Syst. J., vol. 31, no. 3, pp. 356–383, 2021.
[21] E. Schubert, M. Weiler, and H.-P. Kriegel, “SigniTrend: scalable detection of emerging topics in textual streams by hashed significance thresholds,” in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014, pp. 871–880.
[22] C. K. Vaca, A. Mantrach, A. Jaimes, and M. Saerens, “A time-based collective factorization for topic discovery and monitoring in news,” in Proceedings of the 23rd International Conference on World Wide Web, 2014, pp. 527–538.
[23] C.-H. Lee, T.-F. Chien, and H.-C. Yang, “An automatic topic ranking approach for event detection on microblogging messages,” in 2011 IEEE International Conference on Systems, Man, and Cybernetics, 2011, pp. 1358–1363.
[24] Y. Du, Y. Yi, X. Li, X. Chen, Y. Fan, and F. Su, “Extracting and tracking hot topics of micro-blogs based on improved Latent Dirichlet Allocation,” Eng. Appl. Artif. Intell., vol. 87, p. 103279, 2020.
[25] X. Liu, Y. Gao, Z. Cao, and G. Sun, “LDA-based Topic Mining of Microblog Comments,” in Journal of Physics: Conference Series, 2021, vol. 1757, no. 1, p. 12118.
[26] M. Sadeghi and J. Vegas, “How well does Google work with Persian documents?,” J. Inf. Sci., vol. 43, no. 3, pp. 316–327, 2017.
[27] Z. Mottaghinia, M.-R. Feizi-Derakhshi, L. Farzinvash, and P. Salehpour, “A Review of Approaches for Topic Detection in Twitter,” J. Exp. Theor. Artif. Intell., 2021.
[28] H. Becker, M. Naaman, and L. Gravano, “Beyond trending topics: Real-world event identification on twitter,” in Proceedings of the international AAAI conference on web and social media, 2011, vol. 5, no. 1, pp. 438–441.
[29] X. Zhou and L. Chen, “Event detection over twitter social media streams,” VLDB J., vol. 23, no. 3, pp. 381–400, 2014.
[30] B. O’Connor, M. Krieger, and D. Ahn, “TweetMotif: Exploratory Search and Topic Summarization for Twitter,” in ICWSM, 2010, pp. 384–385.
[31] S. Phuvipadawat and T. Murata, “Breaking news detection and tracking in Twitter,” in Web Intelligence and Intelligent Agent Technology (WI-IAT), 2010 IEEE/WIC/ACM International Conference on, 2010, vol. 3, pp. 120–123.
[32] J. Sankaranarayanan, H. Samet, B. E. Teitler, M. D. Lieberman, and J. Sperling, “TwitterStand: news in tweets,” in Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, 2009, pp. 42–51.
[33] S. Petrović, M. Osborne, and V. Lavrenko, “Streaming first story detection with application to twitter,” in Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 2010, pp. 181–189.
[34] M.-R. Feizi-Derakhshi, Z. Mottaghinia, and M. Asgari-Chenaghlu, “Persian Text Classification Based on Deep Neural Networks,” Soft Comput. J., vol. 11, no. 1, 2022.
[35] L. M. Aiello et al., “Sensing trending topics in Twitter,” IEEE Trans. Multimed., vol. 15, no. 6, pp. 1268–1282, 2013.
[36] S. Gaglio, G. Lo Re, and M. Morana, “Real-time detection of twitter social events from the user’s perspective,” in 2015 IEEE International Conference on Communications (ICC), 2015, pp. 1207–1212.
[37] J. Huang, M. Peng, and H. Wang, “Topic detection from large scale of microblog stream with high utility pattern clustering,” in Proceedings of the 8th Workshop on Ph. D. Workshop in Information and Knowledge Management, 2015, pp. 3–10.
[38] C. Li, A. Sun, and A. Datta, “Twevent: segment-based event detection from tweets,” in Proceedings of the 21st ACM International Conference on Information and Knowledge Management, 2012.
[39] M. Mathioudakis and N. Koudas, “TwitterMonitor: trend detection over the twitter stream,” in Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, 2010.
[40] G. Petkos, S. Papadopoulos, L. Aiello, R. Skraba, and Y. Kompatsiaris, “A soft frequent pattern mining approach for textual topic detection,” in Proceedings of the 4th international conference on web intelligence, mining and semantics (WIMS14), 2014, pp. 1–10.
[41] J. Weng and B.-S. Lee, “Event detection in twitter,” in Proceedings of the International AAAI Conference on Web and Social Media, 2011, vol. 5, no. 1, pp. 401–408.
[42] W. Zhang, T. Yoshida, X. Tang, and Q. Wang, “Text clustering using frequent itemsets,” Knowledge-Based Syst., vol. 23, no. 5, pp. 379–388, 2010.
[43] D. M. Blei, A. Y. Ng, and M. I. Jordan, “Latent Dirichlet allocation,” J. Mach. Learn. Res., vol. 3, pp. 993–1022, 2003.
[44] T. Hofmann, “Probabilistic latent semantic indexing,” in Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, 1999.
[45] H. D. Kim, D. H. Park, Y. Lu, and C. Zhai, “Enriching text representation with frequent pattern mining for probabilistic topic modeling,” Proc. Am. Soc. Inf. Sci. Technol., vol. 49, no. 1, pp. 1–10, 2012.
[46] D. Quercia, H. Askham, and J. Crowcroft, “TweetLDA: supervised topic classification and link prediction in twitter,” in Proceedings of the 4th Annual ACM Web Science Conference, 2012, pp. 247–250.
[47] C. Li et al., “TwiNER: named entity recognition in targeted twitter stream,” in Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2012, pp. 721–730.
[48] A. Benny and M. Philip, “Keyword Based Tweet Extraction and Detection of Related Topics,” Procedia Comput. Sci., vol. 46, pp. 364–371, 2015.
[49] M. Ranjbar-Khadivi, S. Akbarpour, M.-R. Feizi-Derakhshi, and B. Anari, “Persian topic detection based on Human Word Association and graph embedding,” arXiv preprint arXiv:2302.09775, 2023.
[50] B. Perozzi, R. Al-Rfou, and S. Skiena, “DeepWalk: Online learning of social representations,” in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014, pp. 701–710.
[51] A. Grover and J. Leskovec, “node2vec: Scalable feature learning for networks,” in Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining, 2016, pp. 855–864.
[52] L. McInnes, J. Healy, and J. Melville, “UMAP: Uniform manifold approximation and projection for dimension reduction,” arXiv preprint arXiv:1802.03426, 2018.
[53] R. J. G. B. Campello, D. Moulavi, and J. Sander, “Density-based clustering based on hierarchical density estimates,” in Pacific-Asia conference on knowledge discovery and data mining, 2013, pp. 160–172.
[54] Y.-W. Seo and K. Sycara, Text clustering for topic detection. Carnegie Mellon University, the Robotics Institute, 2004.
[55] M. Liu and J. Qu, “Mining high utility itemsets without candidate generation,” in Proceedings of the 21st ACM international conference on Information and knowledge management, 2012, pp. 55–64.
[56] H.-J. Choi and C. H. Park, “Emerging topic detection in twitter stream based on high utility pattern mining,” Expert Syst. Appl., vol. 115, pp. 27–36, 2019.
[57] Z. Chen and B. Liu, “Topic modeling using topics from many domains, lifelong learning and big data,” in International conference on machine learning, 2014, pp. 703–711.
[58] Z. Chen and B. Liu, “Mining topics in documents: standing on the shoulders of big data,” in Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, 2014, pp. 1116–1125.
[59] D. Lee and H. S. Seung, “Algorithms for non-negative matrix factorization,” Adv. Neural Inf. Process. Syst., vol. 13, 2000.
[60] C. Févotte and J. Idier, “Algorithms for nonnegative matrix factorization with the β-divergence,” Neural Comput., vol. 23, no. 9, pp. 2421–2456, 2011.
[61] Z. Cao, S. Li, Y. Liu, W. Li, and H. Ji, “A novel neural topic model and its supervised extension,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2015, vol. 29, no. 1.
[62] S. Terragni, E. Fersini, B. G. Galuzzi, P. Tropeano, and A. Candelieri, “OCTIS: Comparing and optimizing topic models is simple!,” in Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations, 2021, pp. 263–270.
[63] H. Larochelle and S. Lauly, “A neural autoregressive topic model,” Adv. Neural Inf. Process. Syst., vol. 25, 2012.
[64] H. Zhao, D. Phung, V. Huynh, Y. Jin, L. Du, and W. Buntine, “Topic modelling meets deep neural networks: A survey,” arXiv preprint arXiv:2103.00498, 2021.
[65] M. Grootendorst, “BERTopic: Neural topic modeling with a class-based TF-IDF procedure,” arXiv preprint arXiv:2203.05794, 2022.
[66] D. Angelov, “Top2Vec: Distributed representations of topics,” arXiv preprint arXiv:2008.09470, 2020.
[67] H. Rahimi, H. Naacke, C. Constantin, and B. Amann, “ANTM: Aligned Neural Topic Models for Exploring Evolving Topics,” in Transactions on Large-Scale Data-and Knowledge-Centered Systems LVI: Special Issue on Data Management-Principles, Technologies, and Applications, Springer, 2024, pp. 76–97.
[68] D. Q. Nguyen, R. Billingsley, L. Du, and M. Johnson, “Improving topic models with latent feature word representations,” arXiv preprint arXiv:1810.06306, 2018.
[69] Y. Liu, Z. Liu, T.-S. Chua, and M. Sun, “Topical Word Embeddings,” in AAAI, 2015, pp. 2418–2424.
[70] J. Qiang, P. Chen, T. Wang, and X. Wu, “Topic modeling over short texts by incorporating word embeddings,” in Advances in Knowledge Discovery and Data Mining: 21st Pacific-Asia Conference, PAKDD 2017, Jeju, South Korea, May 23-26, 2017, Proceedings, Part II 21, 2017, pp. 363–374.
[71] M. Shi, J. Liu, D. Zhou, M. Tang, and B. Cao, “WE-LDA: a word embeddings augmented LDA model for web services clustering,” in 2017 ieee international conference on web services (icws), 2017, pp. 9–16.
[72] T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient estimation of word representations in vector space,” arXiv preprint arXiv:1301.3781, 2013.
[73] R. Das, M. Zaheer, and C. Dyer, “Gaussian LDA for topic models with word embeddings,” in Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 2015, pp. 795–804.
[74] W. Gao, M. Peng, H. Wang, Y. Zhang, Q. Xie, and G. Tian, “Incorporating word embeddings into topic modeling of short text,” Knowl. Inf. Syst., vol. 61, pp. 1123–1145, 2019.
[75] C. Li, H. Wang, Z. Zhang, A. Sun, and Z. Ma, “Topic Modeling for Short Texts with Auxiliary Word Embeddings,” in Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2016, pp. 165–174.
[76] L. McInnes and J. Healy, “Accelerated hierarchical density based clustering,” in 2017 IEEE international conference on data mining workshops (ICDMW), 2017, pp. 33–42.
[77] T. Joachims, “A probabilistic analysis of the Rocchio algorithm with TFIDF for text categorization,” in ICML, 1997, vol. 97, pp. 143–151.
[78] F. Bianchi, S. Terragni, and D. Hovy, “Pre-training is a hot topic: Contextualized document embeddings improve topic coherence,” arXiv preprint arXiv:2004.03974, 2020.
[79] N. Reimers and I. Gurevych, “Sentence-BERT: Sentence embeddings using Siamese BERT-networks,” arXiv preprint arXiv:1908.10084, 2019.
[80] A. Srivastava and C. Sutton, “Autoencoding variational inference for topic models,” arXiv preprint arXiv:1703.01488, 2017.
[81] N. Reimers and I. Gurevych, “Making monolingual sentence embeddings multilingual using knowledge distillation,” arXiv preprint arXiv:2004.09813, 2020.
[82] N. Thakur, N. Reimers, J. Daxenberger, and I. Gurevych, “Augmented SBERT: Data augmentation method for improving bi-encoders for pairwise sentence scoring tasks,” arXiv preprint arXiv:2010.08240, 2020.
[83] M. Farahani, M. Gharachorloo, M. Farahani, and M. Manthouri, “ParsBERT: Transformer-based model for Persian language understanding,” Neural Process. Lett., vol. 53, pp. 3831–3847, 2021.
[84] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding,” in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 2019, pp. 4171–4186.
[85] T. Pires, E. Schlinger, and D. Garrette, “How multilingual is multilingual BERT?,” arXiv preprint arXiv:1906.01502, 2019.
[86] D. P. Kingma and M. Welling, “Auto-encoding variational Bayes,” arXiv preprint arXiv:1312.6114, 2013.
[87] Y. Zhang, R. Jin, and Z.-H. Zhou, “Understanding bag-of-words model: a statistical framework,” Int. J. Mach. Learn. Cybern., vol. 1, pp. 43–52, 2010.
[88] Y. Miao, L. Yu, and P. Blunsom, “Neural variational inference for text processing,” in International conference on machine learning, 2016, pp. 1727–1736.
[89] G. E. Hinton, “Training products of experts by minimizing contrastive divergence,” Neural Comput., vol. 14, no. 8, pp. 1771–1800, 2002.
[90] F. Bianchi, S. Terragni, D. Hovy, D. Nozza, and E. Fersini, “Cross-lingual contextualized topic models with zero-shot learning,” arXiv preprint arXiv:2004.07737, 2020.
[91] M. Ranjbar-Khadivi, M.-R. Feizi-Derakhshi, A. Forouzandeh, P. Gholami, A.-R. Feizi-Derakhshi, and E. Zafarani-Moattar, “Sep TD Tel01,” 2022.
[92] F. Nan, R. Ding, R. Nallapati, and B. Xiang, “Topic modeling with Wasserstein autoencoders,” arXiv preprint arXiv:1907.12374, 2019.
[93] D. Newman, J. H. Lau, K. Grieser, and T. Baldwin, “Automatic evaluation of topic coherence,” in Human language technologies: The 2010 annual conference of the North American chapter of the association for computational linguistics, 2010, pp. 100–108.
[94] N. Aletras and M. Stevenson, “Evaluating topic coherence using distributional semantics,” in Proceedings of the 10th international conference on computational semantics (IWCS 2013)–Long Papers, 2013, pp. 13–22.
[95] G. Carbone and G. Sarti, “ETC-NLG: End-to-end topic-conditioned natural language generation,” IJCoL. Ital. J. Comput. Linguist., vol. 6, no. 6–2, pp. 61–77, 2020.
[96] W. Webber, A. Moffat, and J. Zobel, “A similarity measure for indefinite rankings,” ACM Trans. Inf. Syst., vol. 28, no. 4, pp. 1–38, 2010.
[97] S. Terragni, D. Nozza, E. Fersini, and E. Messina, “Which matters most? Comparing the impact of concept and document relationships in topic models,” in Proceedings of the First Workshop on Insights from Negative Results in NLP, 2020, pp. 32–40.
[98] K. Murakami, N. Itsubo, and K. Kuriyama, “Explaining the diverse values assigned to environmental benefits across countries,” Nat. Sustain., vol. 5, no. 9, pp. 753–761, 2022.
[99] J. H. Lau, D. Newman, and T. Baldwin, “Machine reading tea leaves: Automatically evaluating topic coherence and topic model quality,” in Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, 2014, pp. 530–539.