Computer and Modernization ›› 2021, Vol. 0 ›› Issue (10): 35-40.

• Software Engineering •

Intelligent Retrieval System Based on Learning-to-Rank Algorithms

  

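  1. (Department 8 of System, North China Institute of Computing Technology, Beijing 100083, China)
  • Publication date: 2021-10-14 Release date: 2021-10-14
  • About the authors: WANG Zhenyu (born 1995), male, from Nantong, Jiangsu, master's student; research interests: information retrieval, search engines; E-mail: shanewongms@outlook.com. ZHENG Yangfei (born 1976), male, from Lishui, Zhejiang, professor-level senior engineer, Ph.D.; research interests: informatization engineering, information system architecture; E-mail: zhengyangfei@163.com.
  • Funding:
    Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2020AAA0105100)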

An Intelligent Information Retrieval System Based on Ranking Learning Algorithm

  1. (Department 8 of System, North China Institute of Computing Technology, Beijing 100083, China)
  • Online:2021-10-14 Published:2021-10-14

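Abstract (translated from the Chinese): This paper addresses the low retrieval efficiency and low result accuracy of data asset management systems by building an intelligent retrieval system on learning-to-rank algorithms, improving the relevance between retrieval results and user requests. It studies the theory of learning-to-rank algorithms, optimizes the commonly used ones, extends the classification problem to text ranking, defines the corresponding objective and loss functions, and applies machine learning to raise the accuracy of retrieval results. The system is built on vertical distributed search engine technology together with learning-to-rank algorithms, and uses relevance engineering to improve the efficiency of retrieval request conversion. Experiments show that, while optimizing retrieval speed, the system improves both the relevance between queries and returned results and the accuracy of retrieval.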

Keywords (translated from the Chinese): computer applications, information retrieval, relevance search, learning to rank, Elasticsearch

Abstract: This paper addresses two pain points in data asset management systems, low information retrieval efficiency and low accuracy of retrieval results, by building an intelligent retrieval system based on a ranking learning (learning to rank) algorithm that improves the relevance of retrieval results to user requests. The theory of ranking learning algorithms is studied, the commonly used algorithms are optimized, the classification problem is extended to the text ranking problem, the corresponding objective function and loss function are defined, and machine learning is used to improve the accuracy of the retrieval results. The intelligent retrieval system, built on vertical distributed search engine technology and the ranking learning algorithm, improves the efficiency of retrieval request conversion through relevance engineering. Experiments show that, while optimizing retrieval speed, the system improves both the relevance between queries and returned results and the accuracy of retrieval.

Key words: computer application, information retrieval, relevance search, learning to rank, Elasticsearch
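
Illustrative note: the abstract describes extending a classification problem to text ranking and defining a loss function, but this page does not give the formulation. The sketch below shows one common way such a reduction can look, a pairwise logistic loss in which each pair of documents with different relevance labels is treated as a binary classification example. The function name `pairwise_logistic_loss`, the inputs, and the choice of the pairwise logistic form are assumptions made for illustration, not the authors' method.

```python
import math
from itertools import combinations


def pairwise_logistic_loss(scores, labels):
    """Mean logistic loss over document pairs with different relevance labels.

    Hypothetical helper for illustration only; not taken from the paper.
    scores: model scores, one per document retrieved for a single query.
    labels: graded relevance judgments, aligned with `scores`.
    """
    total, n_pairs = 0.0, 0
    for i, j in combinations(range(len(scores)), 2):
        if labels[i] == labels[j]:
            continue  # tied labels carry no ordering information
        # Orient the pair so that `hi` is the more relevant document.
        hi, lo = (i, j) if labels[i] > labels[j] else (j, i)
        margin = scores[hi] - scores[lo]
        # Binary classification on the pair: log(1 + exp(-margin)).
        total += math.log1p(math.exp(-margin))
        n_pairs += 1
    return total / n_pairs if n_pairs else 0.0


labels = [2, 1, 0]  # assumed relevance grades for three documents
good = pairwise_logistic_loss([3.0, 2.0, 1.0], labels)  # scores agree with labels
bad = pairwise_logistic_loss([1.0, 2.0, 3.0], labels)   # scores reverse the labels
print(good < bad)
```

Under this formulation, scores that order the documents consistently with their relevance labels yield a lower loss than scores that reverse the order, which is the property a trained ranker would be optimized toward.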