[1] 丁兆云,贾焰,周斌. 微博数据挖掘研究综述[J]. 计算机研究与发展, 2014,51(4):691-706.
[2] CHEN K Y, LUESUKPRASERT L, CHOU S C T. Hot topic extraction based on timeline analysis and multidimensional sentence modeling[J]. IEEE Transactions on Knowledge & Data Engineering, 2007,19(8):1016-1025.
[3] 路荣,项亮,刘明荣,等. 基于隐主题分析和文本聚类的微博客中新闻话题的发现[J]. 模式识别与人工智能, 2012,25(3):382-387.
[4] YE Y T , DU Y J, FU X. Hot topic extraction based on Chinese microblog’s Features topic model[C]// 2016 IEEE International Conference on Cloud Computing and Big Data Analysis (ICCCBDA). 2016:348-353.
[5] 陈珊珊. 基于LDA模型的文本聚类研究[D]. 苏州:苏州大学, 2017.
[6] LIU Z T, YU W C, CHEN W, et al. Short text feature selection for micro-blog mining[C]// 2010 International Conference on Computational Intelligence and Software Engineering. 2010. DOI:10.1109/CISE.2010.5677015.
[7] ZHUANG L, DAI H H. A maximal frequent itemset approach for Web document clustering[C]// 2004 International Conference on Computer and Information Technology. 2004:970-977.
[8] ZHANG W, YOSHIDA T, TANG X J, et al. Text clustering using frequent itemsets[J]. Knowledge-Based Systems, 2010,23(5):379-388.
[9] 徐雅斌,李卓,吕非非,等. 基于频繁词集聚类的微博新话题快速发现[J]. 系统工程理论与实践, 2014,34(S1):276-282.
[10]彭敏,黄佳佳,朱佳晖,等. 基于频繁项集的海量短文本聚类与主题抽取[J]. 计算机研究与发展, 2015,52(9):1941-1953.
[11]HAN J W, PEI J, YIN Y W, et al. Mining frequent patterns without candidate generation: A frequent-pattern tree approach[J]. Data Mining & Knowledge Discovery, 2004,8(1):53-87.
[12]DEVLIN J, CHANG M W, LEE K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[J]. arXiv preprint arXiv:1810.04805,2018.
[13]VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// 2017 Advances in Neural Information Processing Systems. 2017:6000-6010.
[14]GURURANGAN S, MARASOVIC A, SWAYAMDIPTA S, et al. Don’t stop pretraining: Adapt language models to domains and tasks[J]. arXiv preprint arXiv:2004.10964,2020.
[15]FIEDLER M. Algebraic connectivity of graphs[J]. Czechoslovak Mathematical Journal, 1973,23(2):298-305.
[16]吴云,许抗震,黄瑞章. 一种基于Hadoop的文本相似度仿真检测模型[J]. 新疆大学学报(自然科学版), 2017,34(3):308-315.
[17]CHIDEAN M I, MORGADO E, SANROMN-JUNQUERA M, et al. Energy efficiency and quality of data reconstruction through data-coupled clustering for self-organized large-scale WSNs[J]. IEEE Sensors Journal, 2016,16(12):5010-5020.
[18]杨波. 新浪微博热点话题发现研究[D]. 乌鲁木齐:新疆大学, 2019.
[19]HIRSCH J E. An index to quantify an individual’s scientific research output[J]. Proceedings of the National Academy of Sciences of the United States of America, 2005,102(46):16569-16572.
[20]肖可. H指数在学科研究热点分析中的应用——以图情学为例[J]. 情报杂志, 2011,30(3):69-73.
[21]陈远,丛振江. 利用H指数评测微博影响力——以新浪校园微博为例[J]. 情报科学, 2015,33(5):85-90.
[22]王杨,王非凡,张舒宜,等. 基于TF-IDF和改进BP神经网络的社交平台垃圾文本过滤[J]. 计算机系统应用, 2019,28(3):126-132.
[23]叶雪梅,毛雪岷,夏锦春,等. 文本分类TF-IDF算法的改进研究[J]. 计算机工程与应用, 2019,55(2):104-109.
[24]BRIN S, PAGE L. Reprint of: The anatomy of a large-scale hypertextual Web search engine[J]. Computer Networks, 2012,56(18):3825-3833.
|