Computer and Modernization

• Algorithm Design and Analysis

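Vertical Search Engine for MOOC

  1. College of Educational Information and Technology, Hubei Normal University, Huangshi 435002, China
  • Received: 2016-07-22 Online: 2017-04-20 Published: 2017-05-08
  • About the authors: Li Quan (b. 1982), male, from Huangpi, Hubei; lecturer at the College of Educational Information and Technology, Hubei Normal University; master's degree; research interests: information retrieval, data mining. Lin Song (b. 1978), male, from Huangshi, Hubei; lecturer; master's degree; research interests: big data, data mining. Tian Jun (b. 1981), female, from Gong'an, Hubei; associate professor; master's degree; research interests: big data, educational informatization. Liu Xinghong (b. 1969), female, from Qichun, Hubei; professor; master's degree; research interests: big data, educational informatization.
  • Funding:
    Hubei Provincial Education Science "Twelfth Five-Year Plan" Project (2011B130); Hubei Provincial Program for Outstanding Young and Middle-aged Science and Technology Innovation Teams in Higher Education Institutions (T201515); Hubei Provincial Teaching Research Project (2015382)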

Abstract: With the rapid growth of massive open online course (MOOC) platforms in recent years, learners spend considerable time searching across different platforms for courses that suit them. To improve the utilization of MOOC educational resources, this paper designs and implements a vertical search engine for the MOOC domain. It proposes an optimization scheme in which multithreaded, parallel crawling is tightly coupled with indexing; downloads course information according to the three ways in which course lists are loaded; customizes information extraction rules based on the features of the course Web pages being extracted; and proposes an optimized similarity scoring method for ranking search results. Experimental results show that the search engine achieves some improvement in average crawling and indexing time, ranking quality, and mean average precision. The system integrates, stores, and retrieves MOOC educational resources, meeting the requirements of educational informatization.

Key words: massive open online course (MOOC), educational resources, search engine, similarity score
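
The abstract does not spell out how crawling and indexing are coupled. One plausible reading of "tightly coupled" is that each worker thread indexes a page as soon as it has fetched it, rather than running a separate crawl phase followed by an index phase. The Python sketch below illustrates that reading only; it is not the authors' implementation. The page store `FAKE_PAGES`, the `InvertedIndex` class, and the functions `fetch`, `crawl_and_index`, and `build_index` are all hypothetical names, and the network is replaced by an in-memory dictionary so the example runs offline.

```python
import re
import threading
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the network: URL -> course page text.
FAKE_PAGES = {
    "https://example.org/course/1": "Introduction to Data Mining",
    "https://example.org/course/2": "Big Data Analysis with Python",
    "https://example.org/course/3": "Data Structures and Algorithms",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    """A minimal thread-safe inverted index: term -> set of URLs."""

    def __init__(self):
        self._postings = defaultdict(set)
        self._lock = threading.Lock()

    def add(self, url, text):
        terms = set(tokenize(text))
        with self._lock:  # serialize writers so postings stay consistent
            for term in terms:
                self._postings[term].add(url)

    def search(self, query):
        """Return the URLs that contain every term of the query."""
        terms = tokenize(query)
        if not terms:
            return set()
        with self._lock:
            postings = [self._postings.get(t, set()) for t in terms]
            return set.intersection(*postings)


def fetch(url):
    """Assumed fetcher; a real crawler would issue an HTTP request here."""
    return FAKE_PAGES[url]


def crawl_and_index(url, index):
    """Fetch one page and index it immediately in the same worker."""
    index.add(url, fetch(url))
    return url


def build_index(urls, workers=4):
    """Crawl and index all URLs in parallel with a thread pool."""
    index = InvertedIndex()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Consume the iterator so worker exceptions surface here.
        list(pool.map(lambda u: crawl_and_index(u, index), urls))
    return index


if __name__ == "__main__":
    idx = build_index(list(FAKE_PAGES))
    print(sorted(idx.search("data mining")))
```

Because each page is indexed inside the worker that fetched it, no intermediate page store is needed between the two stages; the cost is that index writes must be synchronized, which the sketch does with a single lock.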

CLC number: