Computer and Modernization ›› 2011, Vol. 1 ›› Issue (8): 39-41. doi: 10.3969/j.issn.1006-2475.2011.08.011

• Software Engineering •

Research on Transforming HTML into XML

QIAN Cheng, YANG Xiao-lan

  1. College of Information Engineering, Zhongnan Branch, Wuhan University of Science and Technology, Wuhan 430223, Hubei, China
  • Received: 2011-04-07 Online: 2011-08-10 Published: 2011-08-10

Abstract: Much of the information on the Web is written in HTML, but HTML has inherent limitations that leave it unable to meet many of the Web's requirements. XML can make up for many of these shortcomings, so converting the traditional data of Web applications into XML-marked-up data is becoming increasingly important. This paper studies techniques for converting HTML to XML and implements a conversion system in Java.

Key words: HTML, XML, parser, information extraction, JAXB
