[1] China Internet Network Information Center (CNNIC). The 33rd Statistical Report on Internet Development in China[EB/OL]. http://www.cnnic.net.cn/hlwfzyj/hlwxzbg/hlwtjbg/201403/t20140305_46240.htm, 2014-03-05. (in Chinese)
[2] Fiol-Roig G, Miró-Julià M, Herraiz E. Data mining techniques for Web page classification[J]. Highlights in Practical Applications of Agents and Multiagent Systems, 2011,89:61-68.
[3] Baykan E, Henzinger M, Marian L, et al. A comprehensive study of features and algorithms for URL-based topic classification[J]. ACM Transactions on the Web (TWEB), 2011,5(3):No 15.
[4] Sriurai W, Meesad P, Haruechaiyasak C. Improving Web page classification by integrating neighboring pages via a topic model[C]// Proceedings of IICS 2010. 2010:238-246.
[5] Qi X, Davison B D. Classifiers without borders: Incorporating fielded text from neighboring Web pages[C]// Proceedings of the 31st Annual International ACM SIGIR Conference on Research & Development on Information Retrieval. 2008:643-650.
[6] Croft W B, Metzler D, Strohman T. Search Engines: Information Retrieval in Practice[M]. Addison-Wesley, 2010:351-358.
[7] Yao Xu, Wang Xiaodan, Zhang Yuxi, et al. A survey of feature selection methods[J]. Control and Decision, 2012, 27(2):161-166,192. (in Chinese)
[8] Issac B, Jap W J. Implementing spam detection using Bayesian and Porter Stemmer keyword stripping approaches[C]// TENCON 2009 - 2009 IEEE Region 10 Conference. 2009:1-5.
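[9] Shi Congying, Xu Chaojun, Yang Xiaojiang. A survey of research on the TFIDF algorithm[J]. Journal of Computer Applications, 2009,29(S1):167-170,180. (in Chinese)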
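[10] Ren Yonggong, Yang Rongjie, Yin Mingfei, et al. A text feature selection method based on information gain[J]. Computer Science, 2012,39(11):127-130. (in Chinese)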
[11] AOL Inc. The Open Directory Project (ODP)[EB/OL]. http://www.dmoz.org/, 2013-03-01.
[12] Jagarlamudi J, Bennett P N, Svore K M. Leveraging interlingual classification to improve Web search[C]// Proceedings of the 21st International Conference Companion on World Wide Web. 2012:535-536.
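[13] He Zhongxiu, Wang Shuang, An Licheng. Research on a vector-space-based method for computing Web page content similarity[J]. Computer and Modernization, 2010(9):53-55,58. (in Chinese)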
[14] Guo Qinglin, Li Yanmei, Tang Qi. Research on VSM-based text similarity computation[J]. Application Research of Computers, 2008, 25(11):3256-3258. (in Chinese)
[15] Menon A K. Large-Scale Support Vector Machines: Algorithms and Theory[R]. Research Exam, University of California, San Diego, 2009.
[16] Chang C C, Lin C J. LIBSVM: A library for support vector machines[J]. ACM Transactions on Intelligent Systems and Technology (TIST), 2011,2(3):27-53.
[17] Feng Guohe. Research on performance evaluation of text classification[J]. Journal of Intelligence, 2011, 30(8):66-70. (in Chinese)