Computer and Modernization (计算机与现代化)

• Applications and Development •

 Research on Cache Optimization Strategies for Online Ticketing Systems Based on User Behavior Analysis

  

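  1. Beijing Key Laboratory of Traffic Data Analysis and Mining, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China;
  2. TravelSky Technology Limited, Beijing 100010, China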
  • Received: 2016-10-11 Online: 2017-05-26 Published: 2017-05-31
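  • About the authors: Qiu Peng (邱鹏, born 1991), male, from Weifang, Shandong, master's student at the Beijing Key Laboratory of Traffic Data Analysis and Mining, School of Computer and Information Technology, Beijing Jiaotong University, research interest: data mining; Wan Huaiyu (万怀宇, born 1981), male, from Xuan'en, Hubei, lecturer, Ph.D., research interest: data mining; Lin Youfang (林友芳, born 1971), male, from Wuping, Fujian, professor, Ph.D., research interest: data and knowledge engineering.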
  • Funding:
     National Natural Science Foundation of China (61603028); Ministry of Education-China Mobile Research Fund (MCM20150513); China Postdoctoral Science Foundation (2015M580040)

 Developing Cache Strategies for Online Ticketing Systems by Analyzing User Behaviors

  1. Beijing Key Laboratory of Traffic Data Analysis and Mining, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China;
  2. TravelSky Technology Limited, Beijing 100010, China
  • Received: 2016-10-11 Online: 2017-05-26 Published: 2017-05-31

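Abstract: With the rapid development of the Internet and mobile devices, a growing number of users search for and book air tickets online. To relieve the heavy back-end query load that this traffic places on online booking platforms, caching seat availability (i.e., ticket inventory) and fare information has become common practice among online ticketing systems. A key problem in such caching is how to set the time-to-live (TTL) of each query key. This paper proposes a cache optimization strategy based on the analysis of user query behavior: it mines the patterns of ticket inventory changes from large volumes of user query records, predicts the time interval between inventory changes, and sets the TTL dynamically. Experiments on a real query-behavior dataset from an online ticketing website show that the proposed method substantially improves the cache hit ratio while maintaining the accuracy of query results.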

Keywords: online ticketing system, user behavior analysis, inventory change, cache time to live

Abstract: With the rapid development of Internet and mobile terminal technology, more and more users query and book airline tickets online. To relieve the heavy back-end query load that large numbers of user queries place on online ticketing systems, caching the number of remaining tickets (i.e., ticket inventory) and fare information has become a widely adopted mechanism. A key issue in such cache systems is how to set the time-to-live (TTL) of each query keyword. This paper proposes a cache optimization strategy based on the analysis of users' query behaviors, which mines inventory change patterns from massive user query logs and sets the TTL dynamically by forecasting the time interval between inventory changes. We carry out experiments on a real user query dataset collected from an online ticketing site, and the results demonstrate that the proposed method can greatly improve the cache hit ratio while preserving the accuracy of the query results.

Key words:  online ticketing system, user behavior analysis, inventory change, time to live
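
The idea summarized in the abstract, deriving a per-key TTL from inventory changes observed in query logs, can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which the abstract does not specify. The log format `(key, timestamp, inventory)`, the function names `change_intervals` and `dynamic_ttl`, the lower-quartile heuristic, and the default and clamping values are all assumptions made for illustration.

```python
from collections import defaultdict


def change_intervals(log):
    """Group query records by key and measure the time between observed
    inventory changes.

    `log` is an iterable of (key, timestamp_seconds, inventory) tuples
    (a hypothetical format). A change is only seen when a query happens,
    so each interval is an approximation of the true one.
    """
    by_key = defaultdict(list)
    for key, ts, inv in sorted(log, key=lambda r: (r[0], r[1])):
        by_key[key].append((ts, inv))

    intervals = defaultdict(list)
    for key, observations in by_key.items():
        last_change_ts, prev_inv = observations[0]
        for ts, inv in observations[1:]:
            if inv != prev_inv:
                intervals[key].append(ts - last_change_ts)
                last_change_ts, prev_inv = ts, inv
    return intervals


def dynamic_ttl(intervals, default_ttl=60.0, min_ttl=5.0, max_ttl=3600.0):
    """Pick a conservative TTL from past change intervals.

    Assumed heuristic: take roughly the lower quartile of the observed
    intervals, so most cached entries expire before the inventory is
    expected to change, then clamp to [min_ttl, max_ttl].
    """
    if not intervals:
        return default_ttl
    ordered = sorted(intervals)
    guess = ordered[len(ordered) // 4]
    return max(min_ttl, min(max_ttl, guess))


# Example with made-up records for one query key.
log = [
    ("PEK-SHA|2017-01-01", 0, 9),
    ("PEK-SHA|2017-01-01", 100, 9),
    ("PEK-SHA|2017-01-01", 300, 8),
    ("PEK-SHA|2017-01-01", 400, 8),
    ("PEK-SHA|2017-01-01", 900, 7),
    ("PEK-SHA|2017-01-01", 1000, 6),
]
per_key = change_intervals(log)
ttl = dynamic_ttl(per_key["PEK-SHA|2017-01-01"])
```

A shorter TTL favors accuracy of the cached inventory, while a longer one favors hit ratio; the quantile used in `dynamic_ttl` is one possible knob for that trade-off.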