Most Downloaded Articles


    In last 2 years
    A Fast Registration Method for Massive Point Clouds Based on 3D-SIFT and 4PCS
    LI Jia-le1, LI Zhe-run1, ZHAO Yong2, ZHANG Yang1
    Computer and Modernization    2024, 0 (02): 1-6.   DOI: 10.3969/j.issn.1006-2475.2024.02.001
    Abstract (292)    PDF(pc) (1952KB) (541)    Save
    Abstract: The registration of measurement point clouds against model point clouds is key to visual positioning. To address the poor positioning accuracy and low algorithmic efficiency caused by the large volume of measurement point cloud data and its low overlap rate with the CAD model point cloud, a registration method is proposed that fuses the 3D scale-invariant feature transform (3D-SIFT) with the four-points congruent sets (4PCS) algorithm. Firstly, a depth camera is used to capture the point cloud of the part, and the captured measurement point cloud is denoised and filtered. Then, the 3D-SIFT feature point extraction algorithm is used to extract feature points from the measurement point cloud and the CAD model point cloud. Finally, the extracted feature points are used as the initial values of the 4PCS algorithm to register the two point clouds. Simulation and experimental comparisons with the commonly used 4PCS and Super-4PCS algorithms show that the proposed algorithm improves registration speed by more than 30% while maintaining registration accuracy.
    Reference | Related Articles | Metrics | Comments0
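The closing step of a feature-based pipeline like the one above is estimating the rigid transform that aligns the matched feature points. The abstract gives no code, so purely as an illustration, here is a minimal NumPy sketch of the standard Kabsch/SVD solution on synthetic points (all names and values are hypothetical, not the paper's):

```python
import numpy as np

def rigid_transform(src, dst):
    """Estimate rotation R and translation t minimizing ||R @ src_i + t - dst_i||
    over matched point pairs (Kabsch/SVD). src, dst: (N, 3) arrays."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c_dst - R @ c_src
    return R, t

# Hypothetical demo: recover a known rotation about the z-axis plus a shift.
rng = np.random.default_rng(0)
src = rng.normal(size=(100, 3))
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
dst = src @ R_true.T + np.array([1.0, -2.0, 0.5])
R, t = rigid_transform(src, dst)
```

4PCS itself searches for approximately congruent four-point bases; once correspondences are fixed, the alignment reduces to exactly this least-squares problem.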
    Overview of Data Processing Techniques for MIoT Based on Fog Computing
    HAN Kun, WANG Zheng, DUAN Jun-yong, YANG Hua-lin
    Computer and Modernization    2024, 0 (01): 13-20.   DOI: 10.3969/j.issn.1006-2475.2024.01.003
    Abstract (106)    PDF(pc) (1208KB) (418)    Save
    Abstract: Manufacturing Internet of Things (MIoT) is a technology that connects manufacturing production systems to the Internet. Data processing plays a crucial role in MIoT. As manufacturing scale continues to expand, traditional cloud computing has gradually become unable to meet data processing needs, while fog computing can effectively reduce decision latency and improve system efficiency. This paper surveys MIoT data processing technology based on fog computing. Firstly, the generation and characteristics of MIoT data are introduced, along with the challenges faced during data processing. Secondly, a fog-computing-based MIoT data processing architecture is introduced. Then, the key data processing techniques in fog computing are described. Finally, the challenges in deploying the architecture and future directions for fog computing in MIoT are discussed.
    Reference | Related Articles | Metrics | Comments0
    Named Entity Recognition in Electronic Medical Record Based on BERT
    ZHENG Li-rui, XIAO Xiao-xia, ZOU Bei-ji, LIU Bin, ZHOU Zhan
    Computer and Modernization    2024, 0 (01): 87-91.   DOI: 10.3969/j.issn.1006-2475.2024.01.014
    Abstract (239)    PDF(pc) (992KB) (396)    Save
    Abstract: Electronic medical records are an important resource for preserving, managing and transmitting patients’ medical histories, and an important textual record of doctors’ diagnosis and treatment of disease. Named entity recognition (NER) technology can efficiently and intelligently extract diagnosis and treatment information such as symptoms, diseases and drug names from electronic medical records, helping structured records support machine learning for mining diagnosis and treatment regularities. To efficiently identify named entities in electronic medical records, a named entity recognition method based on BERT and a bidirectional long short-term memory network (BILSTM) with adversarial training (FGM) is proposed, referred to as BERT-BILSTM-CRF-FGM (BBCF). After preprocessing and correcting the Chinese electronic medical record corpus provided by the 2017 China Conference on Knowledge Graph and Semantic Computing (CCKS2017), the BERT-BILSTM-CRF-FGM model is used to recognize five types of entities in the corpus, achieving an average F1 score of 92.84%. Compared with the BERT model based on the iterated dilated convolutional neural network (BERT-IDCNN-CRF) and the conditional random field model based on BILSTM (BILSTM-CRF), the proposed method achieves a higher F1 score and faster convergence, and can structure electronic medical record text more efficiently.
    Reference | Related Articles | Metrics | Comments0
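The FGM component mentioned above perturbs the embeddings along the gradient direction during training. As a rough sketch (not the paper's code; the gradient array below is made up):

```python
import numpy as np

def fgm_perturbation(grad, eps=1.0):
    """Fast Gradient Method step used in adversarial training: a perturbation
    along the gradient direction, scaled to have L2 norm eps."""
    norm = np.linalg.norm(grad)
    if norm == 0:                      # no gradient signal -> no perturbation
        return np.zeros_like(grad)
    return eps * grad / norm

# Hypothetical embedding gradient from one backward pass.
g = np.array([[0.5, -1.0], [2.0, 0.0]])
r_adv = fgm_perturbation(g, eps=0.1)
# The adversarial forward pass uses (embeddings + r_adv); afterwards the
# original embeddings are restored before the optimizer step.
```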
    Intelligent Identification Method of Debris Flow Scene Based on Camera Video Surveillance
    HU Mei-chen1, 2, LIU Dun-long1, 2, SANG Xue-jia1, 2, ZHANG Shao-jie3, CHEN Qiao4
    Computer and Modernization    2024, 0 (03): 41-46.   DOI: 10.3969/j.issn.1006-2475.2024.03.007
    Abstract (190)    PDF(pc) (2573KB) (395)    Save
    Abstract: Camera video surveillance is widely used in debris flow disaster prevention and mitigation, but existing video detection technology has limited functionality and cannot automatically judge the occurrence of debris flow events. To solve this problem, this paper uses a transfer learning strategy to improve a video classification method based on convolutional neural networks. Firstly, within the TSN model framework, the backbone network is changed to ResNet-50, which is used for motion feature extraction and debris flow scene identification. Then, the model is pre-trained on the ImageNet and Kinetics-400 datasets to give it strong generalization ability. Finally, the model is trained and fine-tuned on a pre-processed geological disaster video dataset so that it can accurately identify debris flow events. Tests on a large number of moving-scene videos show that the method identifies debris flow movement videos with an accuracy of 87.73%. The results of this paper can therefore help video surveillance play its full role in debris flow monitoring and early warning.

    Reference | Related Articles | Metrics | Comments0
    Improved Algorithm for Keypoints Detection of Hip Based on U-Net
    CHEN Zhen1, YAO Jing-hui2, SU Cheng-yue1
    Computer and Modernization    2024, 0 (02): 15-19.   DOI: 10.3969/j.issn.1006-2475.2024.02.003
    Abstract (177)    PDF(pc) (1367KB) (378)    Save
    Abstract: The diagnosis of developmental dysplasia of the hip (DDH) from pelvic X-rays requires accurate localization of hip keypoints, and deep learning methods can serve as reliable auxiliary tools. To cope with the diverse shooting postures and distances of pelvic radiographs, this paper proposes RKD-UNet, based on U-Net, to detect hip keypoints. The model uses residual blocks to improve U-Net’s convolution layers and skip-connection paths, and introduces a coordinate attention module into the encoder to enhance feature extraction around the keypoints. Convolution layers and an ASPP module with atrous rates of [3, 6, 9] are stacked on top of the encoder to form a bridge block that fuses feature information at different scales and enlarges the model’s receptive field. The model was trained and tested on radiographs covering pelvic orthostasis, frog position, full-length lower extremity, and postoperative pelvis types. RKD-UNet achieves an average keypoint detection error of 3.19 ± 2.19 px and an average acetabular angle measurement error of 2.83° ± 2.59°. The F1 scores for normal, mild, moderate, and severe dislocation cases were 89.6, 77.1, 57.9, and 94.1, respectively, higher than the doctors’ diagnostic results. Experiments show that RKD-UNet can accurately detect hip keypoints and assist doctors in diagnosing DDH.
    Reference | Related Articles | Metrics | Comments0
    Feature-level Multimodal Fusion for Depression Recognition
    GU Ming-xuan, FAN Bing-bing
    Computer and Modernization    2023, 0 (10): 17-22.   DOI: 10.3969/j.issn.1006-2475.2023.10.003
    Abstract (626)    PDF(pc) (1213KB) (335)    Save
    Abstract: Depression is a common psychiatric disorder. However, existing diagnostic methods for depression mainly rely on scales and interviews with psychiatrists, which are highly subjective. In recent years, researchers have worked on identifying depressed patients from EEG or audio features, but no study has effectively combined EEG and audio information, ignoring the correlation between them. Therefore, this study proposes a feature-level multimodal fusion model to improve the accuracy of depression recognition. We combine audio and EEG modality information in a fully connected neural network. Our experiments show that the feature-level multimodal fusion model reaches an accuracy of 81.58% for depression recognition on the MODMA dataset, higher than with either single modality. The results indicate that feature-level multimodal fusion can improve depression recognition accuracy compared with single-modality approaches. Our research provides a new perspective and method for depression recognition.

    Reference | Related Articles | Metrics | Comments0
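Feature-level fusion as described above, concatenating per-modality feature vectors before a fully connected classifier, can be sketched as follows; the feature dimensions, weights and data here are invented for illustration only:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical per-subject feature vectors (dimensions are made up).
eeg_feat = rng.normal(size=(8, 16))    # 8 subjects, 16 EEG features
audio_feat = rng.normal(size=(8, 10))  # 8 subjects, 10 audio features

# Feature-level fusion: concatenate along the feature axis...
fused = np.concatenate([eeg_feat, audio_feat], axis=1)   # shape (8, 26)

# ...then feed the fused vector to a fully connected layer with a sigmoid
# output for the binary depressed / not-depressed decision.
W = rng.normal(size=(26, 1)) * 0.1
b = np.zeros(1)
prob = 1.0 / (1.0 + np.exp(-(fused @ W + b)))
```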
    Automatic Arrangement Method of Cloud Network Security Service Chain Based on SRv6 Technology
    WANG Hong-jie, XU Sheng-chao, YANG Bo, MAO Ming-yang, JIANG Jin-ling
    Computer and Modernization    2024, 0 (01): 1-5.   DOI: 10.3969/j.issn.1006-2475.2024.01.001
    Abstract (164)    PDF(pc) (1156KB) (326)    Save
    Abstract: To improve the resource utilization of cloud network data centers and save communication costs, an automatic orchestration method for cloud network security service chains is designed based on SRv6 (Segment Routing over IPv6) technology. The method guides network packets through the cloud network along specified paths, determines the concrete forwarding path of each message, and reduces dependence on service nodes; it establishes an objective function that minimizes total bandwidth, combined with various constraints to meet the security requirements of automatic orchestration; it defines local-behavior messages, constructs an automatic orchestration framework for the security service chain, establishes security service policies, and resolves policy conflicts and flow scheduling problems, achieving secure orchestration of the service chain. Experimental results show that the proposed method can effectively implement automatic scheduling of the cloud service chain, reduce average CPU and total bandwidth consumption, improve the success rate of user requests, reduce the load on edge devices in the cloud, and save communication costs.
    Reference | Related Articles | Metrics | Comments0
    View Frustum Culling Algorithm for Scene Based on Optimized Octree
    LI Ying-ying, HUANG Wen-pei
    Computer and Modernization    2024, 0 (01): 103-108.   DOI: 10.3969/j.issn.1006-2475.2024.01.017
    Abstract (144)    PDF(pc) (1038KB) (323)    Save
    Abstract: Large 3D models are prone to low rendering frame rates, slow display and heavy resource consumption in the browser, because such models usually contain hundreds of millions of triangles that cannot be loaded and rendered within a limited time. To address this, a scene view frustum culling algorithm based on an optimized octree is proposed. The algorithm adopts Morton-coded addressing, a node viewing-distance criterion and on-demand incremental subdivision, making the octree adaptive with good compression efficiency; it adopts double bounding volumes and base intersection tests to improve the accuracy of view frustum culling, achieving the overall goal of a higher rendering frame rate and smooth display. A study on a high-speed train model shows that the proposed algorithm improves the average rendering frame rate by about 14 frames and the spatial compression rate by about 37.8 percentage points compared with the traditional octree view frustum culling algorithm.
    Reference | Related Articles | Metrics | Comments0
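The Morton (Z-order) addressing the abstract relies on interleaves the bits of a node's integer coordinates, so spatially nearby octree nodes get nearby codes. A minimal sketch using the standard bit-spreading trick (not taken from the paper):

```python
def spread_bits(v: int) -> int:
    """Spread the low 10 bits of v so two zero bits separate each data bit."""
    v &= 0x3FF
    v = (v | (v << 16)) & 0x030000FF
    v = (v | (v << 8)) & 0x0300F00F
    v = (v | (v << 4)) & 0x030C30C3
    v = (v | (v << 2)) & 0x09249249
    return v

def morton3(x: int, y: int, z: int) -> int:
    """Interleave 10-bit x, y, z octree coordinates into one 30-bit Morton code."""
    return spread_bits(x) | (spread_bits(y) << 1) | (spread_bits(z) << 2)

# morton3(1, 0, 0) -> 1, morton3(0, 1, 0) -> 2, morton3(3, 3, 3) -> 63
```

Sorting nodes by this code gives the cache-friendly Z-order traversal that makes octree addressing and compression efficient.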
    Review of Infrared Small Target Detection
    HU Rui-jie, CHE Dou
    Computer and Modernization    2023, 0 (08): 79-86.   DOI: 10.3969/j.issn.1006-2475.2023.08.013
    Abstract (510)    PDF(pc) (5630KB) (314)    Save
    Abstract: This article reviews three classes of infrared small target detection methods: those based on traditional feature extraction, on local contrast, and on the deep learning approaches widely used today. By comparing cutting-edge applications of these three classes, their advantages and disadvantages in detection performance, robustness, and real-time capability are analyzed. We find that feature-extraction-based methods exhibit good real-time performance and robustness in simple scenarios but may be limited under complex conditions. Local-contrast-based methods are relatively robust to changes in target size and shape but sensitive to background interference. Deep-learning-based methods perform well in detection but require large-scale data and more computing resources. In practical applications, the advantages and disadvantages of these methods should therefore be weighed against specific scenario requirements, and the appropriate method applied to infrared small target detection.
    Related Articles | Metrics | Comments0
    Design Scheme of User Clustering and Power Distribution for Millimeter-Wave Massive MIMO-NOMA Systems
    LI Wang-wang, HUANG Xue-jun
    Computer and Modernization    2024, 0 (02): 29-35.   DOI: 10.3969/j.issn.1006-2475.2024.02.005
    Abstract (97)    PDF(pc) (2532KB) (312)    Save
    Abstract: To address the high computational complexity of millimeter-wave massive multiple-input multiple-output non-orthogonal multiple access (MIMO-NOMA) systems, a user clustering and power allocation scheme is proposed to improve spectral efficiency. Firstly, a user clustering scheme based on cluster-head selection is improved, in which the threshold and the number of clusters are determined dynamically according to the real channel; the clustering result better matches the actual situation, and users obtain greater gain from the beams. Then power allocation is formulated to maximize the weighted sum of the system’s spectral efficiency and energy efficiency and solved with an improved meta-heuristic algorithm: introducing new vector components and a cosine perturbation into the Particle Swarm Optimization (PSO) algorithm lets it converge to the global optimum more quickly, and integrating the Sand Cat Swarm Optimization (SCSO) algorithm makes it more accurate. Simulation results show that the spectral efficiency and energy efficiency of the proposed scheme are better than those of the traditional schemes, and that it is more suitable for multi-user cases.
    Reference | Related Articles | Metrics | Comments0
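As a rough illustration of the kind of improved meta-heuristic described above, the sketch below runs a plain PSO on a sphere test function and adds a decaying cosine perturbation to the velocity update. The exact perturbation form and all constants are guesses for illustration, not the paper's formulation:

```python
import numpy as np

rng = np.random.default_rng(1)

def sphere(x):
    """Test objective: global minimum 0 at the origin."""
    return float((x ** 2).sum())

n, dim, iters = 20, 5, 200
pos = rng.uniform(-5, 5, size=(n, dim))
vel = np.zeros((n, dim))
pbest = pos.copy()                                # per-particle best positions
pbest_val = np.array([sphere(p) for p in pos])
g = pbest[pbest_val.argmin()].copy()              # global best
init_best = float(pbest_val.min())

w, c1, c2 = 0.7, 1.5, 1.5
for t in range(iters):
    r1, r2 = rng.random((n, dim)), rng.random((n, dim))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (g - pos)
    # Hypothetical cosine perturbation: a small oscillating, decaying nudge
    # intended to help particles escape local optima early in the search.
    vel += 0.1 * np.cos(2 * np.pi * t / iters) * (1 - t / iters) * rng.standard_normal((n, dim))
    pos = pos + vel
    vals = np.array([sphere(p) for p in pos])
    improved = vals < pbest_val
    pbest[improved] = pos[improved]
    pbest_val[improved] = vals[improved]
    g = pbest[pbest_val.argmin()].copy()

best = float(sphere(g))
```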
    Breast Cancer Immunohistochemical Image Generation Based on Generative Adversarial Network
    LU Zi-han1, ZHANG Dong1, YANG Yan1, YANG Shuang2
    Computer and Modernization    2024, 0 (03): 92-96.   DOI: 10.3969/j.issn.1006-2475.2024.03.015
    Abstract (128)    PDF(pc) (2044KB) (287)    Save
    Abstract: Breast cancer is a dangerous malignant tumor. In medicine, human epidermal growth factor receptor 2 (HER2) levels are needed to determine the aggressiveness of breast cancer and develop a treatment plan, which requires immunohistochemical (IHC) staining of tissue sections. To address the expense and time cost of IHC staining, firstly, a HER2 prediction network based on a mixed-attention residual module is proposed: a CBAM module is added to the residual module so that the network focuses its learning at the spatial and channel levels. The prediction network can predict HER2 levels directly from HE-stained sections, with an accuracy above 97.5%, more than 2.5 percentage points higher than other networks. Subsequently, a multi-scale generative adversarial network is proposed, which uses ResNet-9blocks with the mixed-attention residual module as the generator, PatchGAN as the discriminator, and a self-defined multi-scale loss function. This network can generate simulated IHC slices directly from HE-stained slices. At low HER2 levels, the SSIM and PSNR between generated and real images are 0.498 and 24.49 dB.

    Reference | Related Articles | Metrics | Comments0
    A Moving Object Detection Algorithm Aiming at Jittery Drone Videos
    LIU Yaoxin1, CHEN Renxi2, YANG Weihong1
    Computer and Modernization    2024, 0 (05): 99-103.   DOI: 10.3969/j.issn.1006-2475.2024.05.017
    Abstract136)      PDF(pc) (2681KB)(281)       Save
    Abstract: To solve the problem that moving object detection is susceptible to jitter in hovering drones, leading to the generation of a significant amount of background noise and lower accuracy, a multiscale EA-KDE (MEA-KDE) background difference algorithm is proposed. This algorithm initially achieves a multiscale decomposition of image sequences to obtain a multiscale image sequence. Subsequently, before performing detection, the segmentation threshold for detection is calculated and updated by considering the area threshold and the current image frame, thereby incorporating information from the current frame. Background difference operations using high and low dual segmentation thresholds are performed on images at different scales to enhance detection robustness. Finally, a top-down fusion strategy is employed to merge the detection results from various scales, preserving the clear contours of the targets while eliminating noise. Furthermore, a proposed boundary expansion fusion post-processing algorithm helps alleviate the fragmented targets caused by detection breaks. Experimental results demonstrate that the proposed algorithm effectively suppresses background noise caused by jitter. On two real drone datasets, average F1 scores of 0.951 and 0.952 were obtained, representing improvements of 0.144 and 0.276, respectively, compared to the original algorithm.

    Reference | Related Articles | Metrics | Comments0
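The high/low dual-threshold idea above behaves like hysteresis thresholding: confident pixels seed the foreground mask, and weaker pixels are kept only when connected to a seed. A toy NumPy sketch with invented values (not the paper's MEA-KDE implementation):

```python
import numpy as np

# Toy "difference image"; a real input would be |frame - background model|.
diff = np.array([
    [0, 0, 9, 8, 0],
    [0, 0, 4, 0, 0],
    [0, 0, 0, 0, 0],
    [3, 0, 0, 0, 0],
], dtype=float)

high, low = 6.0, 2.0
strong = diff >= high          # confident foreground seeds
weak = diff >= low             # candidate foreground

# Hysteresis: grow the strong seeds by repeated 4-neighbour dilation,
# restricted to the weak mask, until nothing changes.
mask = strong.copy()
while True:
    grown = mask.copy()
    grown[1:, :] |= mask[:-1, :]
    grown[:-1, :] |= mask[1:, :]
    grown[:, 1:] |= mask[:, :-1]
    grown[:, :-1] |= mask[:, 1:]
    grown &= weak
    if (grown == mask).all():
        break
    mask = grown
# The weak-but-isolated pixel at (3, 0) is rejected as jitter noise.
```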
    Debris Flow Infrasound Signal Recognition Approach Based on Improved AlexNet
    YUAN Li1, 2, LIU Dun-long1, 2, SANG Xue-jia1, 2, ZHANG Shao-jie3, CHEN Qiao4
    Computer and Modernization    2024, 0 (03): 1-6.   DOI: 10.3969/j.issn.1006-2475.2024.03.001
    Abstract (142)    PDF(pc) (3108KB) (272)    Save
    Abstract: Environmental interference noise is the main challenge in on-site monitoring of debris flow infrasound, and greatly limits the accuracy of debris flow infrasound signal identification. Given the performance of deep learning in acoustic signal recognition, this paper proposes a debris flow infrasound signal recognition method based on an improved AlexNet network, which effectively improves recognition accuracy and convergence speed. Firstly, the original infrasound dataset is preprocessed with data expansion, filtering and noise reduction, and the wavelet transform is used to generate time-frequency spectrum images. These images are then used as input to an improved AlexNet model built by shrinking the convolution kernels, introducing batch normalization layers and selecting the Adam optimizer. Experimental results show that the improved AlexNet model reaches a recognition accuracy of 91.48%, achieving intelligent identification of debris flow infrasound signals and providing efficient, reliable technical support for debris flow infrasound monitoring and early warning.
    Related Articles | Metrics | Comments0
    3D Visualization Monitoring System of Aluminum Reduction Cell Based on Digital Twin
    ZHANG Gaoyi1, XU Yang1, 2, CAO Bin1, 3, LI Yifei3
    Computer and Modernization    2024, 0 (05): 104-109.   DOI: 10.3969/j.issn.1006-2475.2024.05.018
    Abstract (93)    PDF(pc) (3159KB) (271)    Save
    Abstract: Traditional management of aluminum reduction cells suffers from a single management mode, low transparency and weak presentation of parameter data. To solve these problems, digital twin technology is introduced and applied to the aluminum reduction cell. Based on the theoretical model and framework of digital twins, a six-dimensional model for a 3D visualization monitoring system of a digital twin aluminum reduction cell is proposed. On this basis, the virtual model, scene optimization, data acquisition and data mapping of the cell are constructed. Data interfaces are provided by a Java backend, and the model and data are rendered using the three.js 3D library and JavaScript. The system gives field personnel a more intuitive view of the cell’s operation and provides useful ideas for the intelligent development of the aluminum industry.

    Reference | Related Articles | Metrics | Comments0
    Improved DOA Based on PWLCM and Bald Eagle’s Swooping Mechanism
    OU Ji-fa, CAI Mao-guo, HONG Guang-jie, ZHAN Kai-jie
    Computer and Modernization    2024, 0 (01): 109-116.   DOI: 10.3969/j.issn.1006-2475.2024.01.018
    Abstract (112)    PDF(pc) (1163KB) (263)    Save
    Abstract: Aiming at the slow convergence and low optimization accuracy of the dingo optimization algorithm (DOA), an improved dingo optimization algorithm (IDOA) based on PWLCM and the bald eagle’s swooping mechanism is proposed. Firstly, a piecewise linear chaotic map (PWLCM) with ergodicity is used to initialize the dingo population, effectively increasing its diversity. Secondly, the bald eagle’s swooping mechanism is introduced into the persecution strategy to accelerate prey capture and strengthen the algorithm’s ability to explore local areas. Finally, a spiral search factor is introduced into the scavenger strategy to enhance local exploitation and exploration, further improving the algorithm’s speed and accuracy. Simulation data, ablation experiments and the Wilcoxon rank-sum test all show that the proposed IDOA has better optimization speed and accuracy than the comparison algorithms, and better overall performance than other improved dingo optimization algorithms.
    Reference | Related Articles | Metrics | Comments0
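For reference, the PWLCM mentioned above is the piecewise linear chaotic map; successive iterates spread well over (0, 1) and can seed a population. A minimal sketch of chaotic initialization (the parameter values are illustrative, not the paper's):

```python
def pwlcm(x, p=0.3):
    """Piecewise linear chaotic map on [0, 1] with control parameter 0 < p < 0.5."""
    if x >= 0.5:
        x = 1.0 - x            # the map is symmetric about x = 0.5
    if x < p:
        return x / p
    return (x - p) / (0.5 - p)

def init_population(n, dim, x0=0.123, p=0.3):
    """Fill an n x dim population with successive iterates of the map,
    giving well-spread initial genes in [0, 1]."""
    pop, x = [], x0
    for _ in range(n):
        row = []
        for _ in range(dim):
            x = pwlcm(x, p)
            row.append(x)
        pop.append(row)
    return pop

pop = init_population(n=5, dim=4)
# Genes would then be scaled to each decision variable's search range.
```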
    Unsupervised Domain Adaptation for Outdoor Point Cloud Semantic Segmentation
    HU Chong-jia, LIU Jin-zhou, FANG Li
    Computer and Modernization    2024, 0 (01): 74-79.   DOI: 10.3969/j.issn.1006-2475.2024.01.012
    Abstract (191)    PDF(pc) (2419KB) (261)    Save
    Abstract: An unsupervised domain adaptation method for LiDAR semantic segmentation is proposed to deal with the excessive data required to train semantic segmentation networks in large-scale outdoor scenes. The method uses a modified RandLA-Net for semantic segmentation, with a small number of point clouds from SPTLS3D’s real-world data as the target. The model is pre-trained on SensatUrban and completes the transfer task by minimizing the domain gap between source and target domains. Because RandLA-Net loses the global features of the original point cloud during encoding, an additional path is proposed to feed global information into the network’s decoding stage. In addition, to capture differentiated information, the weights of RandLA-Net’s local attention module are changed to use the difference between each point’s features and the average features of its neighbors. Experiments show that the network’s mean intersection over union (mIoU) is 54.3% on SemanticKITTI and 71.91% on Semantic3D. The mIoU of the pre-trained network after fine-tuning is 80.05%, 8.83 percentage points better than training from scratch.

    Reference | Related Articles | Metrics | Comments0
    Path Planning of Parking Robot Based on Improved D3QN Algorithm
    WANG Jian-ming1, WANG Xin1, LI Yang-hui2, WANG Dian-long1
    Computer and Modernization    2024, 0 (03): 7-14.   DOI: 10.3969/j.issn.1006-2475.2024.03.002
    Abstract (168)    PDF(pc) (2440KB) (252)    Save
    Abstract: The parking robot has emerged as a solution to the urban parking problem, and its path planning is an important research direction. Given the limitations of the A* algorithm, this article introduces deep reinforcement learning and improves the D3QN algorithm. By replacing the convolutional network with a residual network and introducing attention mechanisms, the SE-RD3QN algorithm is proposed to mitigate network degradation, speed up convergence, and improve model accuracy. During training, the reward and punishment mechanism is improved to achieve rapid convergence to the optimal solution. Comparison with the D3QN algorithm and the RD3QN algorithm with added residual layers shows that SE-RD3QN converges faster during model training. Compared with the currently used A*+TEB algorithm, SE-RD3QN obtains shorter path lengths and planning times. Finally, the algorithm’s effectiveness is further verified through physical experiments with a model car, providing a new solution for parking path planning.
    Reference | Related Articles | Metrics | Comments0
    An Improved YOLOv5-based Method for Dense Pedestrian Detection Under Complex Road Conditions
    SUN Ruiqi1, DOU Xiuchao2, LI Zhihua1, JIANG Xuemei2, SUN Yuhao1
    Computer and Modernization    2024, 0 (05): 85-91.   DOI: 10.3969/j.issn.1006-2475.2024.05.015
    Abstract (116)    PDF(pc) (2884KB) (242)    Save
    Abstract: Aiming at the low accuracy of pedestrian detection in complex street scenes, a new network, YOLO-BEN, is proposed based on improvements to the YOLOv5 network. The network integrates Res2Net, a residual connection module with a hierarchical structure, into the C3 module, enhancing fine-grained multi-scale feature representation. It adopts a bi-level routing attention module that constructs and prunes a region-level directed graph and applies fine-grained attention in the union of routing regions, giving the network dynamic query-aware sparsity and improving feature extraction from blurry images. We incorporate the EVC module to preserve local corner-area information and compensate for the information loss caused by occluded pedestrians. The NWD metric and the original IoU metric form a joint loss function, and a small-target detection head is added to improve long-distance pedestrian detection. In experiments, the method achieved good results on a self-made dataset and part of the WiderPerson dataset. Compared with the original network, the precision, recall and average precision of the improved network increased by 2.8, 4.3 and 3.9 percentage points respectively.

    Reference | Related Articles | Metrics | Comments0
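The NWD term in the joint loss above is the Normalized Wasserstein Distance between boxes modelled as 2-D Gaussians, which is well suited to small targets because it degrades smoothly where IoU drops to zero. A small sketch under that standard formulation (the normalizing constant C is dataset-dependent; 12.8 here is only a placeholder):

```python
import math

def nwd(box1, box2, C=12.8):
    """Normalized Wasserstein Distance between boxes given as (cx, cy, w, h).
    Each box is modelled as a 2-D Gaussian N([cx, cy], diag((w/2)^2, (h/2)^2));
    the 2-Wasserstein distance between two such Gaussians is the Euclidean
    distance between their (cx, cy, w/2, h/2) vectors."""
    cx1, cy1, w1, h1 = box1
    cx2, cy2, w2, h2 = box2
    dist_sq = ((cx1 - cx2) ** 2 + (cy1 - cy2) ** 2
               + ((w1 - w2) / 2.0) ** 2 + ((h1 - h2) / 2.0) ** 2)
    return math.exp(-math.sqrt(dist_sq) / C)

same = nwd((10, 10, 4, 4), (10, 10, 4, 4))   # identical boxes -> 1.0
far = nwd((10, 10, 4, 4), (60, 60, 4, 4))    # distant boxes -> near 0
```

A joint loss would then weight (1 - NWD) against the usual (1 - IoU) term; the weighting is a design choice of the paper, not shown here.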
    Multi-view Knowledge-aware Recommender System
    WANG Xiao-xia, MENG Jia-na, JIANG Feng, DING Zi-qing
    Computer and Modernization    2024, 0 (02): 100-107.   DOI: 10.3969/j.issn.1006-2475.2024.02.016
    Abstract (125)    PDF(pc) (2064KB) (240)    Save
    Abstract: At present, most knowledge-graph-based recommendation methods use a single user or item representation, which suffers from user-interest interference, incomplete use of information, and data sparsity. This paper proposes a multi-view knowledge-aware recommendation model (MVKA). Firstly, the model captures the user’s interest representation in the user-item graph with a fused attention mechanism. An item-entity graph is then introduced, and a graph attention network extracts features to obtain the item’s embedded representation. Next, a graph-view contrastive learning method is constructed between the two views. Finally, summation and concatenation operations produce the final user and item representations, and the user’s matching score for an item is predicted by their inner product. To verify accuracy and computational efficiency, extensive experiments were carried out on three public datasets, MovieLens-1M, Book-Crossing and Last.FM; compared with traditional methods and graph neural network models, the AUC and F1 evaluation metrics improved significantly, indicating that MVKA can exploit multiple kinds of relational data to improve knowledge-aware recommendation.
    Reference | Related Articles | Metrics | Comments0
    Review of Fall Detection Technologies for Elderly
    WANG Mengxi, LI Jun
    Computer and Modernization    2024, 0 (08): 30-36.   DOI: 10.3969/j.issn.1006-2475.2024.08.006
    Abstract (400)    PDF(pc) (2530KB) (231)    Save
    Abstract: With China’s rapidly aging population, the proportion of elderly people living alone has increased significantly, and aging-oriented facilities have received growing attention. In a domestic environment, the elderly are likely to fall for reasons such as lack of care, aging, and sudden illness, and falls have become one of the main threats to their health. Monitoring, detecting and predicting falls in real time can therefore ensure their safety to some extent and further reduce the health risks caused by accidental falls. Based on a comprehensive overview of research on human fall detection, we categorize fall detection methods into two classes, vision-free technologies and computer-vision-based methods, depending on the sensors used for data acquisition. We summarize the system composition of different methods, explore the latest relevant research, and discuss their characteristics and practical applications. In particular, we review the rapidly developing deep-learning-based schemes, analyzing their principles and research results in detail. We also introduce public benchmark datasets for human fall detection, including dataset size and storage format. Finally, we discuss prospects for future research and offer suggestions in several respects.
    Reference | Related Articles | Metrics | Comments0
    Review of Research on Human Behavior Detection Methods Based on Deep Learning
    SHEN Jia-wei, LU Yi-ming, CHEN Xiao-yi, QIAN Mei-ling, LU Wei-zhong,
    Computer and Modernization    2023, 0 (09): 1-9.   DOI: 10.3969/j.issn.1006-2475.2023.09.001
    Abstract (689)    PDF(pc) (2112KB) (229)    Save
    Abstract: Human behavior recognition has always been a hot research topic in computer vision and video understanding, and is widely used in areas such as intelligent video surveillance and human-computer interaction in smart homes. While traditional human behavior detection algorithms rely on large numbers of data samples and are susceptible to environmental noise, evolving deep learning techniques are gradually showing their advantages and can address these problems well. On this basis, this paper first introduces commonly used behavior recognition datasets and analyzes the current state of deep-learning-based human behavior recognition, then describes the basic behavior recognition pipeline and commonly used methods, and finally summarizes the performance and open problems of existing methods and looks ahead to future development directions.
    A Temperature Field Reconstruction Method of Furnace Tube Based on Bidirectional Multistep Prediction
    LIN Qi-zhao, PENG Zhi-ping, GUO Mian, CUI De-long
    Computer and Modernization    2024, 0 (01): 53-58.   DOI: 10.3969/j.issn.1006-2475.2024.01.009
    Abstract126)      PDF(pc) (1682KB)(225)       Save
    Abstract: Aiming at the difficulty of sensing tube temperature in a cracking furnace under the high-temperature, closed ethylene cracking environment, a method for reconstructing the surface temperature field of cracking furnace tubes is proposed that fuses a mechanism model with Long Short-Term Memory (LSTM) networks. Firstly, a mechanism model of the ethylene cracking reaction is constructed on Fluent, a computational fluid dynamics simulation platform, to describe the mathematical relationship between the cracking reaction and furnace tube temperature. The mechanism model is then numerically corrected and its process parameters are solved using industrial field data, and the major process parameters with strong applicability are selected based on the Pearson correlation coefficient. On this basis, a convolutional block attention module (CBAM) is designed to extract the features of the main process parameters that reflect the relationship between the cracking reaction and the furnace tube temperature. Finally, a bidirectional multistep prediction model (GA-BMLSTM) combining a genetic algorithm with long short-term memory networks is designed to predict the temperature distribution of the furnace tubes. Experimental results show that the method reconstructs the furnace tube temperature field with high accuracy and applicability.
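The Pearson screening step described above amounts to keeping only the process parameters whose correlation with the measured tube temperature is strong. A minimal sketch of that step (the toy data, parameter names, and the 0.5 threshold are illustrative assumptions, not values from the paper):

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D series."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / (np.linalg.norm(xc) * np.linalg.norm(yc)))

def select_params(params, tube_temp, threshold=0.5):
    """Keep process parameters whose |r| with tube temperature exceeds threshold."""
    return [name for name, series in params.items()
            if abs(pearson_r(series, tube_temp)) > threshold]

# Toy data: one parameter tracks the temperature, one is nearly unrelated.
temp = [800, 810, 820, 830, 840]
params = {"feed_rate": [1.0, 1.1, 1.2, 1.3, 1.4],   # tracks temperature
          "noise":     [3.0, 1.0, 2.5, 1.5, 2.0]}   # does not
print(select_params(params, temp))  # -> ['feed_rate']
```

In the paper's pipeline, the surviving parameters would then feed the CBAM feature extractor and the GA-BMLSTM predictor.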
    Key words: ethylene cracking furnace; temperature field reconstruction; computational fluid dynamics; attention mechanism; genetic algorithm
    Visual Servo Based on Model-free Adaptive Control
    PENG Zong-yu, HUANG Kai-qi, SU Jian-hua, WANG Li-li
    Computer and Modernization    2024, 0 (01): 29-34.   DOI: 10.3969/j.issn.1006-2475.2024.01.005
    Abstract144)      PDF(pc) (1958KB)(214)       Save
    Abstract: Traditional robot visual servo control requires accurate dynamic and kinematic models of the robot as well as camera calibration. However, errors in robot modeling and camera calibration make it difficult to build an accurate error model, which degrades the positioning accuracy and convergence speed of the visual servo system. To solve this problem, this paper proposes a robot visual servo technique based on Model-Free Adaptive Control (MFAC), which achieves adaptive visual servo control using only the input and output data of the system. Specifically, MFAC estimates the Jacobian matrix of the servo controller online and, combined with a sliding mode controller, accomplishes precise tracking of targets. Simulation results show that the proposed method ensures smooth convergence of the servo controller under unknown disturbances caused by changes in system parameters and reduces the positioning error of the system.
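The core of compact-form MFAC is estimating a pseudo-partial derivative (the scalar analogue of the image Jacobian) online from input/output data and feeding it into a model-free control law. A one-dimensional sketch under an assumed toy plant y = 0.5u; the plant, gains, and variable names are illustrative, not the paper's:

```python
def mfac_siso(setpoint=1.0, steps=100, eta=1.0, mu=1.0, rho=0.8, lam=0.5):
    """Compact-form MFAC on an unknown static plant y = 0.5*u, a toy stand-in
    for the camera/robot mapping. phi is the online-estimated pseudo-partial
    derivative, playing the role of the image Jacobian estimate."""
    phi = 1.0                      # initial pseudo-partial-derivative estimate
    u_prev2 = u_prev = 0.0
    y_prev = y = 0.0
    for _ in range(steps):
        y = 0.5 * u_prev           # unknown plant, seen only through I/O data
        du = u_prev - u_prev2
        if abs(du) > 1e-12:        # update the estimate from measured I/O
            phi += eta * du / (mu + du * du) * ((y - y_prev) - phi * du)
        # model-free control law driven by the tracking error
        u = u_prev + rho * phi / (lam + phi * phi) * (setpoint - y)
        u_prev2, u_prev, y_prev = u_prev, u, y
    return y

print(mfac_siso())
```

Even with the plant gain unknown and the initial estimate wrong, the output converges to the setpoint, which is the data-driven behaviour the abstract relies on.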
    An Autonomous Navigation Method for Intelligent Vehicles in Urban Battlefield
    LI Peng, XU Luo
    Computer and Modernization    2024, 0 (01): 92-98.   DOI: 10.3969/j.issn.1006-2475.2024.01.015
    Abstract114)      PDF(pc) (2793KB)(211)       Save
    Abstract: The urban battlefield is the main arena of conventional warfare and daily security operations, and strong urban penetration capabilities help fighters complete reconnaissance, strike, rescue and other tasks better and faster. However, complex urban streets and the possibility of interception by enemy targets make the urban battlefield environment complex and changeable, greatly increasing the difficulty of completing such missions. Traditional path planning methods rely on accurate static maps and rule constraints, and lack flexibility and adaptability. Therefore, this paper proposes an autonomous navigation method for intelligent vehicles in the urban battlefield, designing discrete action spaces and reward functions oriented to task completion. Firstly, taking the urban battlefield penetration task as an example, the state space and action space are designed and a suitable deep reinforcement learning algorithm is selected. Then, based on the Gazebo simulation platform and ROS, the algorithm framework and experimental scheme are designed. The experimental results show that an intelligent vehicle using this method in an urban battlefield environment can effectively pass through obstacles and avoid enemy units to reach the designated place, improving the success rate of penetration.
    Survey on Gesture Recognition and Interaction
    WEI Jiakun, WANG Jiarun
    Computer and Modernization    2024, 0 (08): 67-76.   DOI: 10.3969/j.issn.1006-2475.2024.08.012
    Abstract300)      PDF(pc) (1322KB)(207)       Save
    Gesture recognition and interaction is a cornerstone task of frontier research in human-computer interaction and artificial intelligence. Its main goal is to have computers and devices work together to recognize and process gesture information and to issue the machine operations corresponding to each gesture, integrating technologies such as motion capture, image processing, image classification, and multi-terminal collaborative interaction. It is a powerful foundation for today's command and control systems, robot interaction, medical operations, and other cutting-edge intelligent interaction and human-computer interaction work. Research on gesture recognition and interaction has matured considerably, with a wide range of application fields and rich application scenarios. This paper reviews the development of gesture recognition and the related interaction technologies and hardware. Firstly, it comprehensively surveys the research progress of gesture recognition and interaction technology and categorizes the key steps of gesture recognition. Secondly, it classifies and elaborates the mainstream depth sensors currently used for 3D gesture interaction. It then analyzes and discusses recognition technologies for 3D gestures in real scenes. Finally, it analyzes the deficiencies and urgent problems in gesture recognition and interaction, proposes feasible research ideas that integrate cutting-edge techniques such as deep learning and pattern recognition, and offers predictions and prospects for future research directions, technology development, and application areas in this field.
    Image Dehazing Algorithm with Improved Generative Adversarial Network
    LIU Yan-hong, YANG Qiu-xiang
    Computer and Modernization    2024, 0 (02): 56-63.   DOI: 10.3969/j.issn.1006-2475.2024.02.009
    Abstract163)      PDF(pc) (4948KB)(205)       Save
    Abstract: In hazy weather, visible light is scattered and absorbed as it passes through the atmosphere, resulting in poor image quality and blocked or lost information. We therefore propose an image dehazing algorithm based on an improved generative adversarial network (GAN), which learns to generate dehazed images through adversarial training between the generator and the discriminator. In the generator, a three-row, multi-column multi-scale fusion attention network (Grid-G) is proposed, introducing channel attention and pixel attention to handle the thick-haze regions and high-frequency regions of the image from different perspectives. In the discriminator, the high- and low-frequency information of the image is introduced to construct a fusion discriminator (FD-F), which serves as an additional prior for discriminating images. Experiments on synthetic and real data from the RESIDE dataset show that the algorithm outperforms the comparison algorithms in peak signal-to-noise ratio and structural similarity, achieves better dehazing effects, and effectively alleviates problems such as color distortion.
    Review and Discussion of Personalised News Recommendation Systems
    ZHAI Mei
    Computer and Modernization    2024, 0 (04): 12-20.   DOI: 10.3969/j.issn.1006-2475.2024.04.003
    Abstract274)      PDF(pc) (1534KB)(199)       Save

    Abstract: With the rapid development of news media technology and the exponential growth of online news, personalised news recommendation plays a crucial role in alleviating online information overload. It learns users' browsing behaviour, interests and other information, and actively provides users with news of interest, thus improving their reading experience. Personalised news recommendation has become a hot research and practical problem in journalism and computer science, and experts in the field have proposed various recommendation algorithms to improve the performance of recommendation systems. In this paper, we systematically describe the latest research status and progress of personalised news recommendation. Firstly, we briefly introduce the architecture of news recommendation systems, and then we examine the key recommendation algorithms and common evaluation metrics. Although personalised news recommendation brings a good experience to users, it can also affect them in unforeseen ways. Unlike other news recommendation reviews, this paper also examines the impact of current news recommendation systems on user behaviour and the problems they face. Finally, the paper proposes research directions and future work on personalised news recommendation based on the problems currently encountered.

    Stance Detection with LoRA-based Fine-tuning General Language Model
    HAN Xiaolong, ZENG Xi, LIU Kun, SHANG Yu
    Computer and Modernization    2025, 0 (01): 1-6.   DOI: 10.3969/j.issn.1006-2475.2025.01.001
    Abstract245)      PDF(pc) (2429KB)(196)       Save
     Stance detection is a key task in natural language processing that determines the stance of an author from text analysis. Text stance detection methods have transitioned from early machine learning methods to BERT models, and then evolved to the latest large language models such as ChatGPT. In contrast to the closed-source ChatGPT, this paper proposes a text stance detection model, ChatGLM3-LoRA-Stance, built on the domestic open-source ChatGLM3 model. To apply large models in professional vertical fields, this paper adopts the LoRA parameter-efficient fine-tuning method, which, compared with P-Tuning v2, is better suited to zero-shot and few-shot text stance detection tasks. The publicly available VAST dataset is used to fine-tune the ChatGLM3 model and to evaluate the performance of existing models in zero-shot and few-shot scenarios. Experimental results indicate that the ChatGLM3-LoRA-Stance model achieves significantly higher F1 scores than the other models on zero-shot and few-shot detection tasks. The results therefore verify the potential of large language models for text stance detection, and suggest that LoRA fine-tuning can significantly improve the performance of the ChatGLM3 large language model on this task.
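The LoRA method used above adds a trainable low-rank correction to each frozen weight matrix and merges it back after fine-tuning. A minimal numpy sketch of the merge step (the shapes, the alpha/r scaling, and the zero initialization of B follow the standard LoRA recipe; this is not the paper's code):

```python
import numpy as np

def lora_merge(W, A, B, alpha=16):
    """Merge a LoRA update into a frozen weight matrix:
    W' = W + (alpha / r) * B @ A, where r is the LoRA rank."""
    r = A.shape[0]
    return W + (alpha / r) * (B @ A)

rng = np.random.default_rng(0)
d_out, d_in, r = 6, 8, 2
W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in))          # trainable, small random init
B = np.zeros((d_out, r))                # trainable, zero init
# Zero-initialized B means the model starts exactly at the pretrained weights.
assert np.allclose(lora_merge(W, A, B), W)
```

Only A and B (2 x 8 + 6 x 2 = 28 values here, versus 48 in W) are trained, which is why LoRA suits few-shot adaptation of large models.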
    Anomalous Behavior Detection Network Based on Dilated Convolution and Fused Temporal Features
    MA Cai-sha, JIAO Li-nan, LIU You-quan, LI Xin
    Computer and Modernization    2024, 0 (02): 75-80.   DOI: 10.3969/j.issn.1006-2475.2024.02.012
    Abstract59)      PDF(pc) (2029KB)(195)       Save
    Abstract: In this paper, we propose a multi-scale deep autoencoder network based on dilated convolution that incorporates pedestrian prototypes and spatio-temporal features. To better exploit the temporal features of pedestrians in videos, a dual-branch structure is added to the latent space between the encoder and decoder: an ST-RNN recurrent branch that predicts spatio-temporal features, and a memory module that stores the normal patterns of pedestrians. To enhance pedestrian feature extraction, suppress the influence of background information, and improve the generalization ability of the model, an improved atrous spatial pyramid pooling (ASPP) module is added to the encoder, and the hybrid dilated convolution (HDC) principle is applied in its convolution blocks to handle variation in pedestrian size, while a multi-level residual channel attention mechanism is introduced in the decoder to capture more contextual information. The model achieves areas under the ROC curve (AUC) of 0.982 and 0.928 on the UCSD Ped2 and CUHK Avenue datasets, respectively.
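The hybrid dilated convolution principle cited above enlarges the receptive field without extra parameters by stacking convolutions with growing dilation rates. A small sketch of the receptive-field arithmetic (the 3x3 kernel and the rates (1, 2, 5) are the common HDC example, assumed here rather than taken from the paper):

```python
def receptive_field(kernel=3, dilations=(1, 2, 5)):
    """Receptive field of stacked stride-1 dilated convolutions:
    each layer adds (kernel - 1) * dilation to the field."""
    rf = 1
    for d in dilations:
        rf += (kernel - 1) * d
    return rf

print(receptive_field())                      # rates 1, 2, 5 -> wide field
print(receptive_field(dilations=(1, 1, 1)))   # plain 3x3 stack for comparison
```

Three dilated layers cover a 17-pixel span where three plain 3x3 layers cover only 7, which is how the network accommodates pedestrians of different sizes.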
    Multi-layer Bank-enterprise Converged Network Based on Graph Neural Network
    LI Shan, WANG Linna, GAO Dingjia, XUAN Haibo
    Computer and Modernization    2024, 0 (05): 27-32.   DOI: 10.3969/j.issn.1006-2475.2024.05.006
    Abstract104)      PDF(pc) (1346KB)(194)       Save
    Abstract: Potential systemic risk in the financial industry is difficult to identify accurately. Based on loan data for the direct systemic-risk contagion channel and internet text information for the indirect channel, a multi-layer bank-enterprise network is constructed, and a multi-layer bank-enterprise network fusion model is designed using graph convolutional networks (GCN). Based on the fused network, this paper quantitatively evaluates the systemic risk contagion process of 29 banks and 75 real estate institutions. The analysis shows that the systemic risk transmitted under the joint impact of the multi-layer bank-enterprise network is significantly greater than that of a single- or two-layer network, and the systemic risk of the inter-enterprise network based on the indirect channel is more pronounced. Financial prudential supervision should therefore pay more attention to the ability of data analysis, deep learning, and other technologies to integrate big-data financial resources and effectively improve risk monitoring and early warning.
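The GCN building block used in such fusion models propagates node features over the contagion network. A minimal sketch of one propagation layer on a hypothetical bank-enterprise graph (the adjacency matrix, features, and weights below are toy values, not the paper's data):

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN propagation step: H' = ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])              # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))      # symmetric normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

# Toy graph: nodes 0-1 are banks, nodes 2-3 are firms; edges = loan exposures.
A = np.array([[0, 0, 1, 1],
              [0, 0, 1, 0],
              [1, 1, 0, 0],
              [1, 0, 0, 0]], float)
H = np.eye(4)                                   # one-hot node features
W = np.full((4, 2), 0.5)                        # hypothetical layer weights
print(gcn_layer(A, H, W).shape)                 # one 2-d embedding per node
```

Stacking such layers lets each bank's risk embedding absorb information from the firms it lends to, which is the mechanism the converged-network analysis exploits.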
    Regformer: Hydraulic Prediction Model of Oil Pipeline Based on GS-XGBoost
    LI Ya-ping, WANG Jun-fang, YU Hong-mei, DOU Yi-min, XIAO Yuan, TIAN Ji-lin
    Computer and Modernization    2024, 0 (01): 59-66.   DOI: 10.3969/j.issn.1006-2475.2024.01.010
    Abstract111)      PDF(pc) (5692KB)(190)       Save
    Abstract: Hydraulic pressure drop prediction is very important for regulating production on oil pipelines. Current machine learning methods treat pressure drop prediction as a regression problem; however, pipeline hydraulic behaviour is affected by many factors, and the fixed weights that traditional machine learning methods learn from the training set generalize poorly to further test samples and real engineering scenarios. This paper proposes a hydraulic pressure drop regression method, Regformer, which introduces a sparse attention mechanism into the regression task, designs a smoothed probability method based on multi-head attention, and incorporates a feature projection mechanism. In a comparative analysis against seven mainstream methods on ten public datasets, qualitative experiments show that Regformer fits local mutations well. Experiments on hydraulic pressure drop prediction show that the self-attention approach has significant advantages for regression tasks with multivariate uncertainty, especially in extreme cases, reflecting the importance of adaptive regression parameters; Regformer achieves better performance than Transformer with less computation, verifying the superiority of the proposed sparse attention and adaptive feature projection for the hydraulic pressure drop prediction task.
    Chinese Named Entity Recognition with Fusion of Lexicon Information and Sentence Semantics
    WANG Tan, CHEN Jin-guang, MA Li-li
    Computer and Modernization    2024, 0 (03): 24-28.   DOI: 10.3969/j.issn.1006-2475.2024.03.004
    Abstract103)      PDF(pc) (1147KB)(185)       Save
    Abstract: The performance of named entity recognition has improved significantly with the rapid advancement of deep learning. However, the strong results achieved by deep networks often rely on large amounts of labeled data, making it challenging to fully exploit deep information in small datasets. In this paper, we propose a Chinese named entity recognition model (LS-NER) that combines lexicon information and sentence semantics. Firstly, potential words matched against characters in a dictionary serve as prior lexical information for the model, addressing the Chinese word segmentation issue. Then, sentence embeddings containing semantic information, typically used for calculating text similarity, are applied to the named entity recognition task, enabling the model to identify similar entities from analogous sentences. Finally, a feature fusion strategy is devised so that the model can effectively learn the semantic information provided by the sentence embeddings. The experimental results demonstrate that our approach achieves commendable performance on the small datasets Resume and Weibo. Incorporating sentence semantics helps the model learn deeper features without requiring additional external information, yielding F1 scores 0.15 and 2.26 percentage points higher, respectively, than the model without sentence information.
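The sentence-similarity signal mentioned above is usually the cosine similarity between sentence embeddings. A minimal sketch (the 4-d vectors are hypothetical; in the paper they would come from a Sentence-BERT style encoder):

```python
import numpy as np

def cosine_sim(u, v):
    """Cosine similarity between two sentence embeddings."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical embeddings of three sentences.
s1 = np.array([0.9, 0.1, 0.0, 0.2])
s2 = np.array([0.8, 0.2, 0.1, 0.1])   # sentence similar to s1
s3 = np.array([0.0, 0.1, 0.9, 0.0])   # unrelated sentence
assert cosine_sim(s1, s2) > cosine_sim(s1, s3)
```

Sentences scoring high against the current one are the "analogous sentences" from which LS-NER transfers entity evidence.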
    Key words: named entity recognition; BERT; SoftLexicon; Sentence-Bert; CRF
    A mmWave Massive MIMO Channel Estimation Based on Joint Weighted and Truncated Nuclear Norm
    ZHANG Zhineng, HUANG Xuejun
    Computer and Modernization    2024, 0 (04): 1-4.   DOI: 10.3969/j.issn.1006-2475.2024.04.001
    Abstract157)      PDF(pc) (1413KB)(184)       Save
    Abstract: In this paper, a millimeter-wave massive multiple-input multiple-output (MIMO) channel estimation algorithm based on a joint weighted and truncated nuclear norm is proposed. Aiming at the high training and feedback overhead of millimeter-wave massive MIMO channel estimation, the channel estimation problem is first transformed into a low-rank matrix recovery problem by exploiting the sparsity of the millimeter-wave channel in the antenna angle domain. An effective and flexible rank surrogate, the joint weighted and truncated nuclear norm, is adopted as a relaxation of the rank function, and a new matrix recovery model is constructed for channel estimation. The optimization objective is to minimize the weighted and truncated nuclear norm, and it is solved within an alternating optimization framework. The simulation results show that the method effectively improves the accuracy of channel estimation and converges reliably.
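The building block of such nuclear-norm relaxations is a singular-value-thresholding step in which the largest singular values are kept unpenalized (truncation) and the rest are shrunk. A minimal sketch (a uniform threshold tau stands in for the per-value weights; the matrix sizes and noise level are illustrative):

```python
import numpy as np

def truncated_svt(M, trunc=1, tau=1.0):
    """One truncated singular-value-thresholding step: the largest `trunc`
    singular values are kept as-is, the rest are soft-thresholded by tau."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_new = s.copy()
    s_new[trunc:] = np.maximum(s[trunc:] - tau, 0.0)
    return U @ np.diag(s_new) @ Vt

# A rank-1 "channel" plus small noise: thresholding suppresses the noise modes.
rng = np.random.default_rng(1)
L = np.outer(rng.normal(size=8), rng.normal(size=8))   # low-rank component
X = L + 0.01 * rng.normal(size=(8, 8))                 # noisy observation
Xh = truncated_svt(X, trunc=1, tau=0.5)
print(np.linalg.matrix_rank(Xh))
```

Iterating such steps inside an alternating optimization loop is the standard way these low-rank recovery models are solved.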
    Temporal Knowledge Graph Question Answering Method Based on Semantic and Structural Enhancement
    HUANG Zheng-lin, DONG Bao-liang
    Computer and Modernization    2024, 0 (03): 15-23.   DOI: 10.3969/j.issn.1006-2475.2024.03.003
    Abstract151)      PDF(pc) (1330KB)(183)       Save
    Abstract: Knowledge graphs, one of the popular research topics in natural language processing, have consistently received widespread attention from the academic community. In practice, knowledge question answering often involves temporal information, so in recent years the application of temporal knowledge graphs to question answering has gained popularity among scholars. Traditional methods for temporal knowledge graph question answering mainly encode the question information to drive the inference process, but they cannot handle the more complex entities and temporal relationships contained in questions. To address this, a semantically and structurally enhanced method for temporal knowledge graph question answering is proposed, which considers both semantic and structural information during inference to increase the probability of producing correct answers. Firstly, implicit temporal expressions in the questions are parsed, and the questions are rewritten as direct expressions based on the information in the temporal knowledge graph; the temporal information in the graph is also aggregated at different time granularities according to the question set. Secondly, the semantic information of the questions is represented and fused with entity and time information to enhance the learning of entity and time semantics. Subgraphs are then extracted around the extracted entities, and the structural information of the subgraphs is captured using graph convolutional networks. Finally, the fused semantic and structural information of the questions is concatenated, candidate answers are scored, and the entity with the highest score is selected as the answer. Comparative tests on the MultiTQ dataset show that the proposed model outperforms the other baseline models.

    Image Classification of COVID-19 Based on Contrast Learning MocoV2
    XU Yue-wen1, LI Ming1, LI Li2
    Computer and Modernization    2024, 0 (02): 81-87.   DOI: 10.3969/j.issn.1006-2475.2024.02.013
    Abstract91)      PDF(pc) (3940KB)(179)       Save
    Abstract: Pneumonia is a common infectious disease to which the elderly and people with weakened immune systems are particularly susceptible, and early detection aids later treatment. Factors such as the location, density, and clarity of lung lesions affect the accuracy of pneumonia image classification. With the development of deep learning, convolutional neural networks are widely used in medical image classification; however, the learning ability of such networks depends on the number of training samples and labels. For the classification of pneumonia in computed tomography (CT) images, a network model based on self-supervised contrastive learning (MCLSE) is proposed, which can learn features from unlabeled data and improve the accuracy of the model. Firstly, auxiliary tasks are designed to mine representations from unlabeled images during pre-training, improving the model's ability to learn data mappings in the vector space. Secondly, a convolutional neural network is used to extract features; to capture higher-level feature information effectively, a squeeze-and-excitation network is selected to improve the classification model by modeling the correlations between feature channels. Finally, the trained weights are loaded into the improved classification model, and the network is trained again with labeled data on the downstream task. Experiments were carried out on the public datasets SARS-CoV-2 CT and CT Scan for COVID-19 Classification. The results show that the MCLSE model achieves overall classification accuracies of 99.19% and 99.75%, respectively, a great improvement over mainstream models.
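MoCo-style contrastive pre-training, on which MoCoV2 rests, optimizes the InfoNCE loss: pull a query toward its positive key and away from a queue of negatives. A minimal numpy sketch (2-d toy embeddings and the 0.07 temperature are illustrative; real features would come from the CNN encoder):

```python
import numpy as np

def info_nce(q, k_pos, k_negs, tau=0.07):
    """InfoNCE loss for one query: one positive key, a list of negative keys."""
    q = q / np.linalg.norm(q)
    keys = np.vstack([k_pos] + list(k_negs))
    keys = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    logits = keys @ q / tau                 # similarity of q to each key
    logits -= logits.max()                  # numerical stability
    p = np.exp(logits) / np.exp(logits).sum()
    return -np.log(p[0])                    # positive key sits at index 0

q = np.array([1.0, 0.0])
# Aligned positive -> low loss; misaligned positive -> high loss.
loss_easy = info_nce(q, k_pos=np.array([1.0, 0.1]), k_negs=[np.array([0.0, 1.0])])
loss_hard = info_nce(q, k_pos=np.array([0.0, 1.0]), k_negs=[np.array([1.0, 0.1])])
assert loss_easy < loss_hard
```

Minimizing this loss over unlabeled CT crops is what lets the encoder learn useful features before the labeled downstream fine-tuning stage.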
    Information Extraction for Aircraft Fault Text
    QIAO Lu, SUN You-chao, WU Hong-lan
    Computer and Modernization    2024, 0 (03): 61-66.   DOI: 10.3969/j.issn.1006-2475.2024.03.010
    Abstract115)      PDF(pc) (1248KB)(176)       Save
    Abstract: In view of the heavy workload, low efficiency, and high cost of manually extracting aircraft fault information, an information extraction method based on a domain dictionary, rules, and a BiGRU-CRF model is proposed. Combining the characteristics of aircraft domain knowledge, the domain dictionary and template rules are constructed from aircraft fault text, and the fault information is semantically labeled. The BiGRU-CRF deep learning model is used for named entity recognition: BiGRU captures the semantic relationships of the context, and CRF decodes and generates the entity label sequence. The experimental results show that the method achieves an accuracy of 95.2%, verifying its effectiveness. It can accurately identify key information in aircraft fault text, such as time, aircraft type, faulty part name, and faulty part manufacturer. At the same time, correcting the recognition results with the domain dictionary and rules effectively improves the efficiency and accuracy of information extraction and frees the approach from traditional entity extraction models' long-standing dependence on handcrafted features.

    Scenes Text Modification Network for Uyghur Based on Generative Adversarial Network
    FU Hong-lin, ZHANG Tai-hong, YANG Ya-ting, Aizimaiti Aiwanier, MA Bo
    Computer and Modernization    2024, 0 (01): 41-46.   DOI: 10.3969/j.issn.1006-2475.2024.01.007
    Abstract126)      PDF(pc) (2063KB)(175)       Save
    Abstract: Research on scene text detection and recognition for Uyghur shows that manually acquiring labeled natural scene text images is time-consuming and labor-intensive, so synthetic data is used as the main source of training data. To obtain more realistic data, a scene text modification network for Uyghur based on a generative adversarial network is proposed. An efficient Transformer module is used to build the network, fully extracting the global and local features of the image to modify the Uyghur text, and a fine-tuning module is added to refine the final results. The model is trained with the WGAN strategy, which effectively mitigates mode collapse and gradient explosion. The generalization ability and robustness of the model are verified by English-to-English and English-to-Uyghur text modification experiments. Good results are achieved in both objective metrics (SSIM, PSNR) and visual quality, and the method is validated on the real scene datasets SVT and ICDAR 2013.
    Cost-sensitive Convolutional Neural Network for Encrypted Traffic Classification
    ZHONG Hailong1, 2, HE Yueshun1, HE Linlin1, CHEN Jie1, TIAN Ming3, ZHENG Ruiyin4
    Computer and Modernization    2024, 0 (05): 55-60.   DOI: 10.3969/j.issn.1006-2475.2024.05.010
    Abstract87)      PDF(pc) (1046KB)(174)       Save
    Abstract: This paper addresses the classification bias and low recognition rates for minority classes that arise from imbalanced data in encrypted traffic classification. Traditional convolutional neural networks tend to favor the majority class in such scenarios, prompting a dynamic weight adjustment strategy: during each training iteration, sample weights are adaptively adjusted based on feedback from a cost-sensitive layer. If a minority-class sample is misclassified, its weight increases, urging the model to focus on such samples in subsequent training. This strategy continually refines the model's predictions, enhancing minority-class recognition and effectively tackling class imbalance. To prevent overfitting, an early stopping strategy is employed, halting training when validation performance deteriorates over consecutive epochs. Experiments reveal that the proposed model excels at handling class imbalance in encrypted traffic classification, achieving accuracy and F1 scores above 0.97. This study presents a practical solution for encrypted traffic classification under class imbalance and contributes valuable insights to network security.
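The dynamic reweighting loop described above can be sketched in a few lines: after each epoch, boost the weights of misclassified minority samples and renormalize. The boost factor, class names, and normalization below are illustrative assumptions, not the paper's exact rule:

```python
def update_weights(weights, labels, preds, minority, boost=1.5, normalize=True):
    """Cost-sensitive reweighting: raise the weight of misclassified
    minority-class samples so the next epoch focuses on them."""
    new = []
    for w, y, p in zip(weights, labels, preds):
        if y == minority and p != y:
            w *= boost                      # penalize the miss more next time
        new.append(w)
    if normalize:                           # keep the mean weight at 1.0
        s = sum(new)
        new = [w * len(new) / s for w in new]
    return new

w = [1.0, 1.0, 1.0, 1.0]
labels = ["vpn", "web", "web", "vpn"]   # 'vpn' is the minority class here
preds  = ["web", "web", "web", "vpn"]   # first minority sample misclassified
print(update_weights(w, labels, preds, minority="vpn"))
```

Feeding these weights into the loss of the next epoch is what steers the network away from always predicting the majority class.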

    A Remote Sensing Image Change Detection Model Based on CNN-Transformer Hybrid Structure
    XU Ye-tong, GENG Xin-zhe, ZHAO Wei-qiang, ZHANG Yue, NING Hai-long, LEI Tao
    Computer and Modernization    2023, 0 (07): 79-85.   DOI: 10.3969/j.issn.1006-2475.2023.07.014
    Abstract605)      PDF(pc) (2633KB)(173)       Save
    The emergence of convolutional neural networks and the Transformer has driven continuous progress in remote sensing image change detection, but both approaches still have shortcomings. On the one hand, a convolutional neural network cannot model the global information of remote sensing images because of the local receptive field of the convolution kernel. On the other hand, although the Transformer can capture global information, it does not model the details of image changes well, and its computational complexity grows quadratically with image resolution. To solve these problems and obtain more robust change detection results, this paper proposes CTCD-Net, a change detection network with a hybrid CNN-Transformer structure. Firstly, CTCD-Net uses convolutional neural network and Transformer stages in series within an encoder-decoder structure to effectively encode the local and global features of remote sensing images, improving the feature learning ability of the network. Secondly, a cross-channel Transformer self-attention module (CSA) and an attention feed-forward network (A-FFN) are proposed to effectively reduce the computational complexity of the Transformer. Extensive experiments on the LEVIR-CD and CDD datasets show that the detection accuracy of CTCD-Net is significantly better than that of other mainstream methods.
    An Attention Mechanism-based U-Net Fundus Image Segmentation Algorithm
    ZHANG Zixu, LI Jiaying, LUAN Pengpeng, PENG Yuanyuan
    Computer and Modernization    2024, 0 (05): 110-114.   DOI: 10.3969/j.issn.1006-2475.2024.05.019
    Abstract83)      PDF(pc) (3307KB)(171)       Save
    Abstract: The radius and width of retinal fundus vessels are important indicators for assessing eye diseases, so accurate segmentation of fundus images is increasingly meaningful. To effectively assist doctors in diagnosing eye diseases, this paper proposes a new neural network for segmenting fundus vessel images. The basic idea is to reduce information loss by improving the traditional U-Net model with an attention fusion mechanism: a Transformer is used to construct a channel attention mechanism and a spatial attention mechanism, and the information obtained by the two is fused. In addition, because retinal fundus datasets are relatively small while the network has many parameters, training is prone to overfitting, so a DropBlock layer is introduced to alleviate this problem. The proposed method was validated on the publicly available DRIVE dataset and compared with several state-of-the-art methods. The results show that our method achieved the highest ACC value of 0.967 and the highest F1 value of 0.787, demonstrating that it is effective for segmenting retinal fundus images.
