Table of Contents

    28 February 2025, Volume 0 Issue 02
    An Empirical Study on the Drift Diffusion of Task Interruption in User Identification of Phishing
    WANG Le, WANG Zhiying
    2025, 0(02):  1-12.  doi:10.3969/j.issn.1006-2475.2025.02.001
    Users’ ability to correctly identify phishing is the last line of defense against phishing attacks, yet frequent and unavoidable interruptions make it hard for users to process large volumes of email quickly and spot phishing. Task interruption has been shown to have both positive and negative effects on the main task, and research conclusions are inconsistent, so its role in phishing identification needs further exploration. Based on the drift diffusion model, this article constructs a research model relating interruption state to behavior selection. Bayesian estimation of the model parameters is used to analyze the drift rate, boundary height, starting point bias, and non-decision time of users identifying phishing with and without interruptions. Analysis of online experimental data shows that task interruption is a double-edged sword for correctly identifying phishing emails: under interruption, users’ reaction time becomes shorter and accuracy does not differ significantly, but the drift rate decreases, leading to higher boundary heights. In addition, analysis of differences in accuracy, gender, susceptibility, and knowledge and experience shows that, without interruption, the drift rate is faster and the non-decision time for encoding is shorter for users who are male, have higher accuracy, lower susceptibility, and less knowledge and experience. With interruption, the drift rate is faster for users who are female, have lower accuracy, and less knowledge and experience. This study extends the perspective to third-party task interruptions beyond the subject and object, explores the environmental factors that affect users’ identification of phishing, and provides practical guidance for improving users’ ability to recognize phishing attacks.
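    For readers unfamiliar with the four parameters named in this abstract, the following is a minimal sketch of a single drift diffusion trial. It is not the authors’ model or their Bayesian fitting procedure; the function name `simulate_ddm_trial`, the step size, the noise level, and the mapping of the upper boundary to choice 1 are all illustrative assumptions.

```python
import random


def simulate_ddm_trial(drift, boundary, start_frac, non_decision,
                       noise_sd=1.0, dt=0.001, max_time=10.0, rng=None):
    """Simulate one drift diffusion trial with simple Euler steps.

    drift        -- drift rate (average evidence gained per second)
    boundary     -- boundary height (distance between the two thresholds)
    start_frac   -- starting point as a fraction of the boundary (0.5 = unbiased)
    non_decision -- non-decision time in seconds (encoding and motor response)

    Returns (choice, reaction_time). choice is 1 for the upper boundary,
    0 for the lower boundary, and None if no boundary is hit by max_time.
    All parameter names here are hypothetical, chosen for illustration.
    """
    rng = rng or random.Random()
    evidence = start_frac * boundary
    elapsed = 0.0
    while elapsed < max_time:
        # Deterministic drift plus Gaussian noise scaled by sqrt(dt).
        evidence += drift * dt + noise_sd * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        elapsed += dt
        if evidence >= boundary:
            return 1, non_decision + elapsed
        if evidence <= 0.0:
            return 0, non_decision + elapsed
    return None, non_decision + max_time
```

    In this sketch a lower drift rate slows evidence accumulation, while a higher boundary demands more evidence before a response; the reaction time is always at least the non-decision time.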
    Blockchain-based Distributed Contract Electricity Transfer 
    ZHANG Zihao1, 3, YE Meng2, PAN Shixian1, MA Li2, BAO Tao1, YU Qi3
    2025, 0(02):  13-18.  doi:10.3969/j.issn.1006-2475.2025.02.002
    This paper introduces a distributed contract-based electricity transfer system that leverages blockchain technology to adapt to the diversity of the electricity market and the volatility of electricity demand. The contracts required for electricity transfer are divided into three types: original contracts, electricity mutual assurance agreements, and transfer transaction contracts. A specific protocol is designed for generating each type of contract. Contract authenticity is verified using an ordered multi-signature scheme, ensuring the validity of the contracts. Furthermore, broadcast encryption is employed to safeguard the privacy and accuracy of contract content. All contracts are stored on the blockchain, guaranteeing their immutability. Experimental results demonstrate that the system significantly improves the efficiency of contract generation and verification, facilitating swift and secure electricity transfers.
    Zero-shot Learning Based on Semantic Extension and Embedding
    GUO Chenguang, MAO Jian, WANG Yunyun
    2025, 0(02):  19-27.  doi:10.3969/j.issn.1006-2475.2025.02.003
    In zero-shot image classification, semantic embedding (i.e., using semantic attributes to describe class labels) makes it possible to generate visual features for unknown objects by transferring knowledge from known objects. Current research often uses class semantic attributes as auxiliary information for describing class visual features. However, class semantic attributes are typically obtained through external means such as manual annotation, resulting in weak consistency with visual features. Moreover, a single class semantic attribute is insufficient to capture the diversity of visual features. To enhance the diversity of class semantic attributes and their capacity to describe visual features, this paper introduces SeeZSL, a zero-shot learning method based on semantic extension and embedding. SeeZSL expands semantic information by constructing a latent semantic space for each class and uses this space to generate visual features for unknown classes. Additionally, to address the weak consistency and the lack of discriminative ability between the original feature space and class semantic attributes, the semantic extension-based generation model is integrated with a contrastive embedding model. The effectiveness of SeeZSL was experimentally validated on four benchmark datasets.
    Resampling of Imbalanced Data for Optimizing Downstream Tasks 
    GUO Hua
    2025, 0(02):  28-32.  doi:10.3969/j.issn.1006-2475.2025.02.004
    Data resampling is a key method for correcting imbalanced datasets. Traditional methods construct balanced samples by minimizing geometric errors in the sample space, but they perform poorly in high-dimensional spaces with complex distribution patterns. Moreover, because they rely on statistical features, they are not tailored to downstream tasks. To address these issues, this paper presents Sampling for Optimizing Downstream Neural Network (SOD-NN), a neural network for data sampling. The approach uses the nonlinear processing ability of neural networks to identify the distribution characteristics of high-dimensional samples, and it is combined with the downstream task into a two-stage network that is optimized as a whole, so that the model better meets the requirements of the downstream task. Specifically, the dataset is first divided spatially during sampling. Residual processing is then applied to the sample subsets to prevent data degradation. Subsequently, a self-attention mechanism is used to construct global features, ensuring consistency with the original sample distribution. Experimental results indicate that the proposed model significantly improves the recognition of minority class samples in downstream classification tasks and enhances the robustness of these tasks.
    Survey on Intelligent Optimization Algorithm for Feature Selection
    QI Haochun
    2025, 0(02):  33-43.  doi:10.3969/j.issn.1006-2475.2025.02.05
    Feature selection, one of the main techniques in data preprocessing, can effectively identify key features, thereby reducing dimensionality and mitigating the “curse of dimensionality”. Feature selection is a typical NP-hard problem, and intelligent optimization algorithms have been widely employed for it because of their strong global search ability. First, this paper summarizes methods for evaluating feature importance and for updating parameters. The former evaluate the relevance and redundancy of features, while the latter update algorithm parameters; both apply to several crucial steps of intelligent optimization algorithms for feature selection. Then, the strategy design of three core steps, namely algorithm initialization, population search, and objective function design, is introduced. Initialization strategies are summarized from the perspectives of decision space initialization and population initialization, with an analysis of the advantages and limitations of different strategies. Search strategies are classified in detail by the number of populations into single-population and multi-population strategies. Objective function designs are categorized according to the metrics they apply. Finally, the paper discusses future work on intelligent optimization algorithms for feature selection.
    Adaptive Low-power Localization Scheme for Pedestrians Based on LSTM Scene Classification
    YU Qing1, JIANG Jinguang1, 2, 3, XIE Dongpeng1, LIU Jianghua4
    2025, 0(02):  44-51.  doi:10.3969/j.issn.1006-2475.2025.02.006
    To address the challenges of pedestrian localization accuracy and high power consumption in complex outdoor environments, this paper proposes a low-power localization scheme based on scene classification for foot-mounted pedestrian navigation systems using GNSS/INS technology. The scheme collects GNSS, temperature, and humidity sensor data, uses an LSTM to classify typical outdoor scenes, and adjusts the MCU clock frequency according to the scene. In addition, the scheme includes an improved Sage-Husa method to mitigate the impact of GNSS outliers on localization results. Experimental results demonstrate that the scheme achieves a scene classification accuracy of 97.64% with a system power consumption of only 193.074 mW. Compared with the traditional ZUPT, GNSS, GNSS/INS integration, and Sage-Husa methods, the proposed scheme reduces the root mean square localization error by 83.15%, 42.88%, 21.91%, and 11.49%, respectively. The scheme can therefore improve pedestrian localization accuracy in outdoor environments at low system power consumption.
    A Triple Joint Extraction Model for Talent Resume Information
    SHEN Xinke1, 2, LI Yong1, 2, WEN Ming2, REN Yuanyuan2
    2025, 0(02):  52-57.  doi:10.3969/j.issn.1006-2475.2025.02.007
    The field of talent title evaluation involves a large amount of talent resume information, but this information usually exists as natural language text, which experts find difficult to use as a basis for evaluation. To address this issue, this article jointly models entity extraction and relation extraction and constructs a triple joint extraction model (RLAC) for talent resume information. First, the Chinese pre-trained language model RoBERTa-wwm is used to encode the underlying talent resume information. Second, an LSTM network and an attention mechanism are introduced to ease the difficulty of recognizing head entities in talent resume information and to strengthen the extraction of semantic features from the encoded context. Third, the encoded information is fed into the head entity annotator to obtain the head entity. Finally, the head entity and the talent resume information are concatenated and fed into the tail entity relationship annotator, which alleviates the problem of overlapping relations and yields a triple. Compared with the baseline model, the proposed model achieves higher accuracy, recall, and F1 value on the talent resume dataset, indicating good triple extraction ability.
    Anomaly Detection Algorithm Based on Bidirectional Multi-scale Knowledge Distillation
    LIU Chongyi, LI Hua, REN Dejun, LIU Yaokai, WANG Yulong
    2025, 0(02):  58-63.  doi:10.3969/j.issn.1006-2475.2025.02.008
    Current knowledge distillation-based anomaly detection algorithms suffer from low detection and localization accuracy because the teacher and student models differ too little in their representations of abnormal features. To address this problem, an anomaly detection algorithm based on bidirectional multi-scale knowledge distillation is proposed. An asymmetric teacher-student network composed of a teacher model, a student model, and a reverse distillation student model is employed to suppress the student’s generalization to abnormal features. A feature fusion residual module is introduced between the bidirectional distillation student models to integrate multi-scale features and reduce abnormal disturbances. An attention module is introduced into the forward distillation student model to strengthen its learning of important features. During testing, anomalies are assessed by fusing multi-scale anomaly maps. Experimental results on the public MVTec AD dataset show that the proposed algorithm, with ResNet18 as the backbone, achieves an area under the receiver operating characteristic curve (AUROC) of 97.7% at the pixel level and 98.8% at the image level, effectively improving on current knowledge distillation algorithms.
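    The evaluation metric reported above, the area under the receiver operating characteristic curve, can be sketched as follows. This is a generic pairwise formulation and not the authors’ evaluation code; the function name `auroc` and the convention that label 1 marks an anomaly are assumptions made for illustration.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores -- anomaly scores (higher means more anomalous)
    labels -- 1 for anomalous samples, 0 for normal samples (assumed convention)

    Equals the probability that a random anomalous sample scores higher
    than a random normal sample, with ties counted as one half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one anomalous and one normal sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

    Applied to per-image scores this gives an image-level value; applied to per-pixel scores from an anomaly map it gives a pixel-level value. The quadratic loop is only suitable for small inputs.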

    A Novel Communication Data Fusion Method for Power Systems Based on Multimodal Graph Convolutional Networks
    LI Ang1, DU Mengjun1, QIAN Jin1, TONG Jun1, YANG Tao1, CHEN Guotao1, JIN Wenxing2
    2025, 0(02):  64-69.  doi:10.3969/j.issn.1006-2475.2025.02.009
    The integration of a large number of devices into the new power system has made the communication data exchanged between devices chaotic and difficult to handle. This article adopts a multimodal graph convolutional network to fuse the communication data of the new power system. First, the data source devices are classified and a node equation for the communication data flow is constructed. Second, based on the data transmission process, multimodal methods are used to construct fully connected data edges. Finally, graph convolution is applied to convolve and fuse the resulting communication data streams, reducing the data transmission process to data vectors, completing feature-level data fusion, and guiding decision-making. Simulation tests on the communication dataset of the Zhejiang Power Grid verify that the proposed communication data fusion method based on a multimodal graph convolutional network performs well in application.
    Improved LEACH Algorithm Based on Dynamic Energy Threshold
    GU Yi, NI Xiaojun
    2025, 0(02):  70-76.  doi:10.3969/j.issn.1006-2475.2025.02.010
    To address the unreasonable cluster head distribution and rapid energy depletion of some nodes in the LEACH protocol, an improved algorithm, LEACH-ECP, is proposed on the basis of the LEACH-IMPROVED protocol. In the clustering stage, a predicted energy consumption is set for each node, taking into account three influencing factors: energy, density, and distance. Based on this prediction, the remaining energy of the node after being selected as cluster head is calculated. Dynamic energy thresholds are then derived from the predicted remaining energy of all nodes. The cluster head selection mechanism and the member node joining mechanism are modified to extend the network lifecycle and reduce energy consumption. This article compares the LEACH-ECP protocol with the LEACH and LEACH-IMPROVED algorithms. Simulation results show that the first dead node appears 71% later in LEACH-ECP than in LEACH and 13% later than in LEACH-IMPROVED. The LEACH-ECP algorithm selects cluster heads more reasonably and prolongs the network lifecycle.
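    For context, the baseline that this work modifies can be sketched as below. The code shows the cluster head election threshold of the classic LEACH protocol only; it does not reproduce the dynamic energy threshold of LEACH-ECP, whose formula is not given in the abstract. The function and parameter names are illustrative assumptions.

```python
def leach_threshold(p, round_index, was_cluster_head_recently):
    """Classic LEACH election threshold T(n) for one node.

    p                         -- desired fraction of cluster heads per round
    round_index               -- current round number (0-based)
    was_cluster_head_recently -- True if the node already served as cluster
                                 head in the current epoch of about 1/p rounds

    A node draws a uniform random number in [0, 1) and becomes cluster head
    when the draw is below this threshold.
    """
    if was_cluster_head_recently:
        return 0.0
    epoch_length = round(1 / p)  # assumes 1/p is close to an integer
    return p / (1 - p * (round_index % epoch_length))
```

    The threshold depends only on the round number, not on residual energy, which is one reason classic LEACH can pick low-energy nodes as cluster heads and why energy-aware variants replace it.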
    Optimization for Camera Self-calibration Based on Horizon Detection in Road Scenes
    HE Guotao1, ZHAO Chunhui2, LIU Zhenyu1, WANG Long1
    2025, 0(02):  77-85.  doi:10.3969/j.issn.1006-2475.2025.02.011
    Current camera calibration in traffic scenes relies mainly on key information in the road scene and uses redundant information such as dashed road lines and parallel lines to optimize the calibration parameters. However, because scene information is limited, the range of the vanishing points cannot be fixed, and the camera spin angle introduces a certain degree of error into the calibration results. Starting from horizon detection, a horizon detection algorithm based on deep learning key point detection is presented that improves accuracy to 82.46%. The camera self-calibration parameters are then optimized by correcting the camera spin angle based on the detected horizon and by using the horizon to impose stricter constraints. Experimental results show that, after these two steps, the camera self-calibration parameters converge faster and reach a minimum error of 1.79%.
    Twin Feature Fusion Network for Scene Text Image Super Resolution
    FENG Xinjie, WANG Wei
    2025, 0(02):  86-93.  doi:10.3969/j.issn.1006-2475.2025.02.012
    The aim of the scene text image super-resolution (STISR) method is to enhance the resolution and legibility of text images, thereby improving the performance of downstream text recognition tasks. Previous studies have shown that the introduction of text-prior information can better guide the super-resolution. However, these methods have not effectively utilized text-prior information and have not fully integrated it with image features, limiting super-resolution task performance. In this paper, we propose a Twin Feature Fusion Network (TFFN) to address this problem. The method aims to maximize the utilization of text-prior information from pre-trained text recognizers, with a focus on the recovery of text area content. Firstly, text-prior information is extracted using a text recognition network. Next, a twin feature fusion module is constructed, which employs a twin attention mechanism to facilitate bidirectional interaction between image features and text-prior information. The fusion module further integrates context-enhanced image features and text-prior information. Finally, sequence features are extracted to reconstruct the text image. Experiments on the benchmark TextZoom dataset show that the proposed TFFN improves the recognition accuracy of the ASTER, MORAN, and CRNN text recognition networks by 0.22~0.5, 0.6~1.1 and 0.33~1.1 percentage points, respectively.
    Improved Traffic Sign Detection Algorithm of YOLOv7
    ZHAO Yin, YIN Siqing, ZHANG Yonglai
    2025, 0(02):  94-99.  doi:10.3969/j.issn.1006-2475.2025.02.013
    To address false detections and missed detections caused by the small pixel proportion of traffic signs, a traffic sign detection algorithm based on an improved YOLOv7 is proposed. A small target detection layer is added to YOLOv7 and the large target detection layer is removed to better meet the needs of small target detection. In the backbone network, the EMA attention mechanism is introduced to improve the model’s feature extraction for multi-scale targets at reduced computational overhead. An ELAN-RPC module is also constructed to replace the original ELAN, reducing network computation and increasing inference speed. In addition, an RFE module is introduced in the feature fusion layer to make better use of the details in shallow feature maps and improve subsequent top-down feature fusion. Experimental results show that the improved YOLOv7 reaches an mAP of 89.6% on the TT100K dataset, 5.8 percentage points higher than the original algorithm, while the number of parameters is reduced by 37%, achieving higher precision with fewer parameters.
    Video Rain Removal Algorithm Based on Deep Learning
    YAN Qiang1, SHEN Shouting2, BAI Junqing2, CHENG Guojian2
    2025, 0(02):  100-107.  doi:10.3969/j.issn.1006-2475.2025.02.014
    Most traditional video rain removal algorithms focus only on removing rain streaks and are trained only on synthetic data, ignoring more complex degradation factors such as rain accumulation and occlusion and the prior knowledge in real data. To address this, this paper proposes a two-stage video deraining algorithm that combines synthetic and real videos. The first stage performs a reverse recovery process guided by the proposed rain removal model Initial-DerainNet: consecutive rainy frames containing degradation factors are fed into the network and physical prior knowledge is integrated to obtain an initial estimate of the rain-free frame. The second stage refines the result with adversarial learning, restoring the overall color, illumination distribution, and other properties of the initial estimate to obtain a more accurate rain-free frame. Experimental results show that the algorithm reaches a PSNR of 35.22 dB and an SSIM of 0.9596 on the synthetic rain removal dataset RainSyntheticDataset100, outperforming benchmark rain removal algorithms such as JORDER, DetailNet, SpacNN, SE, J4Rnet, and FastDeRain. On the real rain video test set, the algorithm achieves PSNR values above 30 dB on rain videos of different dimensions, outperforms other rain removal algorithms in both subjective visual quality and quantitative metrics, and can effectively improve the quality of rainy-day videos.
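    As a reminder of what the PSNR figures above measure, here is a minimal sketch of the standard peak signal-to-noise ratio for 8-bit images. It is a generic definition, not the authors’ evaluation script; the function name `psnr` and the flat-list image representation are assumptions made to keep the example self-contained.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    reference, restored -- flat sequences of pixel intensities
    max_value           -- largest possible pixel value (255 for 8-bit images)
    """
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("images must be non-empty and the same size")
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

    Because the scale is logarithmic, halving the mean squared error raises PSNR by about 3 dB.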
    A New Method of Pavement Disease Detection Based on Improved YOLOv8
    HE Feixiong1, XIE Haiwei1, PU Chao2, ZOU Chuanming2, JIA Yixuan1
    2025, 0(02):  108-113.  doi:10.3969/j.issn.1006-2475.2025.02.015
    As the service time of a road increases, the repeated effects of traffic loads and natural factors degrade the road condition and affect its service life and quality. This paper therefore proposes an improved YOLOv8 network for pavement disease detection. First, targeted data augmentation techniques such as image flipping, changes in lighting conditions, and motion blur are applied, taking the characteristics of road disease images into account. Second, the Wise-IoU loss function is employed, which adopts a dynamic nonlinear focusing mechanism to evaluate anchor box quality using outliers instead of IoU, and its wise gradient gain allocation strategy balances the differences in the number of samples among disease categories and improves the overall performance of the detector. Additionally, some of the C2F modules are replaced with DCNv3, and convolutional neuron weights are shared to reduce computational complexity and better learn features in pavement disease images. At the same time, multiple mechanisms are introduced, and Softmax normalization along the sampling points enhances the model’s ability to understand road disease images. Experimental results show that the improved YOLOv8 road disease detection algorithm achieves an accuracy of 77.3% when testing the network model, 3.9 percentage points higher than YOLOv8, and an mAP@50 of 76.9%, 3.4 percentage points higher than YOLOv8. The model detects road diseases accurately, outperforms existing road disease detection algorithms, and can be applied in engineering practice.
    Identification of Typical Defects in Key Components of Overhead Lines Based on Improved YOLOv5
    WANG Peng1, NI Bin1, GUO Zhuangzhuang1, ZHANG Shusheng1, WANG Zhi1, CAI Runkai2
    2025, 0(02):  114-120.  doi:10.3969/j.issn.1006-2475.2025.02.016
    Key components of overhead lines may suffer damage, detachment, and other defects due to long-term exposure to the natural environment, and these defects are difficult to detect and repair manually. To address this issue, this paper proposes a lightweight detection method based on an improved YOLOv5 that is suited to edge computing devices. First, an EMA module is added at the end of the backbone network to enhance the network’s ability to capture features. Second, the CBS module in the neck is replaced with GhostConv, and the C3 module in the neck is combined with SENetV2, making the network more lightweight while enhancing its representational ability. Experimental results demonstrate that the proposed method achieves a significant improvement in class-average accuracy over YOLOv5 while maintaining real-time detection with only a marginal reduction in frame rate. Compared with the SSD and Faster R-CNN algorithms, it has certain advantages in detection accuracy and speed.
    Real-time Semantic Segmentation Based on Gate-controlled Fusion
    FENG Zuyan, WEI Yan, CHEN Jiakun, YU Xin
    2025, 0(02):  121-126.  doi:10.3969/j.issn.1006-2475.2025.02.017
    Feature fusion in real-time semantic segmentation must attend to both shallow and deep information, but current feature fusion methods require a large amount of computation and many parameters, making it difficult to meet the accuracy and speed requirements of real-time semantic segmentation. To address this problem, a real-time semantic segmentation method based on gated fusion is proposed that takes both the speed and the performance of the network into account. The method comprises an encoder, a gated feature fusion module, a pixel-level feature extraction module, and a gated aggregation segmentation head. First, the encoder extracts features from the image to be segmented. Second, the pixel-level feature extraction module accurately extracts the important feature information, and the gated feature fusion module fuses the deep semantic information with the shallow location information. Finally, the gated aggregation segmentation head completes the semantic segmentation. On the CamVid dataset, the model achieves a mean intersection over union of 87.31% at a segmentation frame rate of 75.3 fps; on the Cityscapes dataset, it achieves 79.19% at 44.1 fps. Experimental results show that the proposed method performs well in both accuracy and real-time speed and can be effectively applied to real-time semantic segmentation tasks.
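    The mean intersection over union reported above can be illustrated with a short sketch. This is the generic per-class definition, not the authors’ evaluation code; the function name `mean_iou`, the flat label lists, and the choice to skip classes absent from both prediction and ground truth are assumptions.

```python
def mean_iou(predicted, target, num_classes):
    """Mean intersection over union across classes.

    predicted, target -- flat sequences of integer class labels per pixel
    num_classes       -- number of classes, labelled 0 .. num_classes - 1

    Classes that appear in neither sequence are skipped (an assumed
    convention; evaluation toolkits differ on this point).
    """
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    ious = []
    for c in range(num_classes):
        intersection = sum(1 for p, t in zip(predicted, target) if p == c and t == c)
        union = sum(1 for p, t in zip(predicted, target) if p == c or t == c)
        if union > 0:
            ious.append(intersection / union)
    if not ious:
        raise ValueError("no class is present in the inputs")
    return sum(ious) / len(ious)
```

    For example, with predictions `[0, 0, 1, 1]` and ground truth `[0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.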