Computer and Modernization

• Artificial Intelligence

An XML Keyword Query Algorithm Based on Interval Reserved Coding

  

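  1. (College of Computer and Communication Engineering, China University of Petroleum (East China), Qingdao 266000, Shandong, China)
  • Received: 2019-03-21 Online: 2019-10-28 Published: 2019-10-29
  • About the authors: WEI Dongping (b. 1965), male, from Weifang, Shandong; associate professor, M.S.; research interests: databases and information systems, XML data querying; E-mail: weidp@upc.edu.cn. LUO Dan (b. 1993), female; master's student; research interests: databases and information systems; E-mail: 1337140974@qq.com.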



Abstract: With the explosive growth of XML data in recent years, XML keyword query techniques have received increasing research attention. Data encoding is the foundation of keyword query, and two main schemes are currently in use: path-based encoding and interval encoding. Interval encoding adapts better to dynamic updates of the XML data being queried and therefore has an advantage. This paper studies keyword query over interval-encoded XML and proposes a new query algorithm. The algorithm first builds an index from the reserved interval values and then selectively traverses the index according to the minimum range value, which avoids unnecessary comparisons and improves query efficiency. The study also finds that the choice of reserved space affects query efficiency. The paper therefore designs an encoding scheme, Interval Reservation Based on Node (IRBN), that reserves intervals according to the node itself: each node is assigned a weight, and its interval values are set from that weight, yielding a relatively balanced encoding in which intervals are allocated per node. Experiments show that IRBN encoding is reasonable and achieves high query efficiency.

Key words: XML, keyword query, interval reservation, IRBN
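
To make the idea of interval encoding with reserved space concrete, the following is a minimal sketch and not the paper's algorithm. The abstract does not specify the IRBN weight formula or the index traversal, so every name here (`encode`, `reserve`, `is_ancestor`, `keyword_lca`, `reserve_between`) and the constant gap of 10 are assumptions made for illustration. The sketch shows three things: each node receives a `(start, end)` interval with unused numbers left around it, ancestry reduces to interval containment, and a new node can be coded inside a reserved gap without renumbering existing nodes.

```python
import xml.etree.ElementTree as ET


def encode(root, reserve=lambda node: 10):
    """Assign every element a (start, end) interval.

    `reserve(node)` is a hypothetical per-node gap size: that many numbers
    are skipped before the node's start and before its end, leaving room
    for later insertions.
    """
    codes = {}
    counter = 0

    def visit(node):
        nonlocal counter
        gap = reserve(node)
        counter += gap
        start = counter
        for child in node:
            visit(child)
        counter += gap
        codes[node] = (start, counter)

    visit(root)
    return codes


def is_ancestor(a, b):
    """Interval a belongs to an ancestor of b iff a strictly contains b."""
    return a[0] < b[0] and b[1] < a[1]


def keyword_lca(root, codes, keywords):
    """Naive baseline: the narrowest node whose interval covers every element
    whose text contains one of the keywords. Returns None if a keyword has
    no match. This scans all nodes; it is not an indexed traversal."""
    hits = []
    for kw in keywords:
        matched = [n for n in root.iter() if n.text and kw in n.text]
        if not matched:
            return None
        hits.extend(matched)
    lo = min(codes[n][0] for n in hits)
    hi = max(codes[n][1] for n in hits)
    containers = [n for n, (s, e) in codes.items() if s <= lo and hi <= e]
    return min(containers, key=lambda n: codes[n][1] - codes[n][0])


def reserve_between(prev_end, parent_end):
    """Pick an interval for a new last child inside the unused gap
    (prev_end, parent_end), so no existing code has to change."""
    span = parent_end - prev_end
    if span < 3:
        raise ValueError("reserved gap exhausted; re-encoding needed")
    return prev_end + span // 3, prev_end + 2 * span // 3


doc = ET.fromstring(
    "<lib><book><title>XML</title><author>Wei</author></book>"
    "<book><title>SQL</title></book></lib>"
)
codes = encode(doc)
```

A weight-driven variant in the spirit of the abstract could pass a different `reserve` function, for example `encode(doc, reserve=lambda n: 10 * (1 + len(n)))`, so that nodes with more children reserve more space; that particular weighting is a guess, not the published IRBN definition.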

CLC number: