Computer and Modernization ›› 2020, Vol. 0 ›› Issue (10): 69-75.

• Image Processing •

Product Image Similarity Algorithm Based on SIFT and Nearest Neighbor Matching

  1. (School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China)
  • Online: 2020-10-14 Published: 2020-10-14
  • About the author: WU Ying (born 1994), male, from Shangrao, Jiangxi Province, master's student; research interest: computer vision. E-mail: 529194793@qq.com.

Abstract: In e-commerce, the rapid growth in the number of users has brought a range of problems. One of them is that sellers frequently plagiarize information from other stores in the same industry. Plagiarized images are harder to detect than plagiarized text, because plagiarists often crop, rotate, or apply filters to an image, or edit it with tools such as Photoshop (PS), so that the processed image is not easily recognized as similar to the original. Manual comparison is inefficient and costly, so a system built on an algorithm that can quickly compute the similarity of product images is needed. Scale-invariant feature transform (SIFT) descriptors are scale invariant, overcome the limitation of traditional algorithms that yield low similarity scores for rotated images, and carry a large amount of feature information. After introducing traditional image hash algorithms, this paper proposes an approximate nearest neighbor matching algorithm based on SIFT descriptors for comparing the similarity of electric drill product images. New images are generated from the original electric drill product images by cropping, adding filters, increasing contrast, rotating, and adding watermarks, and each new image is compared with its original for similarity. The experimental results show that the approximate nearest neighbor matching algorithm based on SIFT descriptors achieves better accuracy than the hash algorithms and the original SIFT algorithm, and that it identifies plagiarized images fairly accurately.

Key words: image similarity, image hash algorithm, SIFT algorithm, nearest neighbor matching
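
The two families of methods contrasted in the abstract can be illustrated with a small sketch. It is not the authors' implementation, and it rests on several assumptions: the hash baseline is taken to be a simple average hash; an exact brute-force search stands in for the approximate nearest neighbor index; the ratio threshold of 0.75 and the similarity score (share of descriptors with a confident match) are hypothetical choices; and the descriptors are short toy vectors rather than 128-dimensional SIFT descriptors. All function names are illustrative.

```python
import math


def average_hash(gray):
    """Average hash of a small grayscale grid: 1 where a pixel is >= the mean."""
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    return [1 if p >= mean else 0 for p in pixels]


def hash_similarity(h1, h2):
    """Fraction of hash bits that agree (1 - normalized Hamming distance)."""
    same = sum(a == b for a, b in zip(h1, h2))
    return same / len(h1)


def ratio_test_matches(query, train, ratio=0.75):
    """Return (query_idx, train_idx) pairs that pass a nearest-neighbor ratio test.

    Assumption: exact brute-force search replaces an approximate index here.
    """
    matches = []
    for qi, q in enumerate(query):
        # Euclidean distance from this query descriptor to every train descriptor.
        dists = sorted((math.dist(q, t), ti) for ti, t in enumerate(train))
        if len(dists) < 2:
            continue  # the ratio test needs two neighbors
        (d1, best), (d2, _) = dists[0], dists[1]
        # Keep the match only if the best neighbor is clearly closer than the second.
        if d1 < ratio * d2:
            matches.append((qi, best))
    return matches


def match_similarity(query, train, ratio=0.75):
    """Hypothetical score: share of query descriptors with a confident match."""
    if not query:
        return 0.0
    return len(ratio_test_matches(query, train, ratio)) / len(query)


# Toy 2x2 "image": brightening keeps the hash, a 90-degree rotation flips every bit.
original = [[10, 200], [220, 30]]
brighter = [[60, 250], [255, 80]]
rotated = [[220, 10], [30, 200]]
hash_same = hash_similarity(average_hash(original), average_hash(brighter))
hash_rot = hash_similarity(average_hash(original), average_hash(rotated))

# Toy descriptors: the "copied" set is a slightly perturbed version of the original.
desc_original = [[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]]
desc_copied = [[0.1, 0.0], [10.0, 0.2], [0.0, 9.9]]
desc_score = match_similarity(desc_original, desc_copied)
```

In this toy setting the hash is unchanged by brightening but breaks under rotation, while the descriptor score depends only on which descriptors are present, not on where they sit in the image. With real images the descriptors would come from a SIFT implementation, and an approximate index would replace the brute-force loop for speed.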