Computer and Modernization

OFDM Channel Estimation Based on Matrix Recovery

ZHANG Jingjing, HUANG Xuejun

2024, 0(05): 1-4. doi:10.3969/j.issn.1006-2475.2024.05.001

Abstract: Channel estimation is a key technology in orthogonal frequency division multiplexing (OFDM) systems. This paper proposes an OFDM channel estimation method based on matrix recovery, in which the frequency-domain channels of multiple consecutive OFDM signals are assembled into a channel matrix. Because this channel matrix is low-rank, the channel estimation problem can be converted into a weighted truncated nuclear norm minimization problem on the channel matrix, which is solved with an improved singular value thresholding algorithm. Simulation results show that, compared with traditional channel estimation algorithms, the proposed method needs fewer pilot signals to reach the same estimation precision. Compared with channel estimation based on compressed sensing, the proposed method consumes the same number of pilots but directly yields a high-precision frequency-domain estimate of the OFDM channel.

Transmission Line Faults Detection Algorithm Based on YOLOX

WU Hengfeng, HOU Xingsong, WANG Huake

2024, 0(05): 5-10. doi:10.3969/j.issn.1006-2475.2024.05.002

Abstract: The power system is an important foundation of national life, and intelligent detection of transmission line faults has great social and economic value. To address the lack of public datasets for transmission line fault detection, poor performance when targets of multiple scales appear simultaneously, and the difficulty of detecting high-IoU bounding boxes, a transmission line fault detection method based on an improved YOLOX is proposed. First, a transmission line fault detection dataset is built through acquisition and simulation. Then, an adaptive multi-scale feature fusion method is proposed to make full use of multi-scale features. Finally, a new loss is proposed to improve the network's ability to optimize high-IoU bounding boxes and to address sample imbalance, which effectively improves detection accuracy. Experimental results show that, on the dataset collected in this paper, the proposed algorithm achieves 67.48% mAP50:95 while maintaining real-time performance, outperforming classical algorithms such as EfficientDet and YOLOv5.

Parking Positioning Method for Automatic Guided Vehicle Based on MA-LM Algorithm

ZHANG Yuanchao1, 2, 3, YANG Guizhi1, 2, XUE Guang1, 3, YAO Hanchen3, PENG Jianwei3, DAI Houde3

2024, 0(05): 11-15. doi:10.3969/j.issn.1006-2475.2024.05.003

Abstract: Autonomous navigation parking and charging solutions have poor positioning accuracy at long distances, so AGVs cannot align with the charging pile when returning automatically to charge. To address this challenge, a parking positioning method based on an improved mayfly optimization algorithm (MA-LM) is proposed. The method fuses magnetic nail positioning data from multiple magnetic sensor arrays, thereby improving the position and attitude accuracy of parking positioning. To quantify the improvement in magnetic nail localization, the method is tested in a charging pile scenario using a sensor array of nine magnetic sensors and a two-wheeled differential-drive mobile robot. In a comparison with the genetic optimization algorithm (GA-LM) and the particle swarm optimization algorithm (PSO-LM), the experimental results show that the MA-LM algorithm achieves a localization accuracy of ±1.65 mm and an orientation accuracy of 0.9° in parking localization.

Layout Analysis Method of Multi-scale Feature Fusion

QIAO Jia, XU Kun, HU Peirong

2024, 0(05): 16-21. doi:10.3969/j.issn.1006-2475.2024.05.004

Abstract: Current document layout analysis suffers from misclassification of lists and text, difficulty in recognizing small-scale text in tables, and poor preservation of spatial features. Following a bottom-up approach, this paper proposes a multi-feature fusion layout analysis method based on the SegNet network. The MSCAN-SE module is introduced into SegNet to address the low recognition rate of small-scale elements in tables. The strip features in the MSCAN-SE attention mechanism improve the model's ability to extract multi-scale features, so that the network retains feature information at more scales. To address the excessive similarity between the features of list elements and text elements, the receptive field of the network during feature extraction is expanded through the dilated convolution and channel attention branch of MSCAN-SE. The proposed method is compared experimentally with classical semantic segmentation networks. The results show that its pixel accuracy on the layout analysis test set is 97.9% and its mean intersection over union is 91.7%. Compared with the U-Net, FCN, DeepLabV3+, and SegNet semantic segmentation models, the mean intersection over union increases by 7.6%, 2.4%, 2.6%, and 1.5%, respectively.

EEG Classification of Epilepsy Based on Edge-center Network Feature Extraction

LIU Lipei, YANG Xiaoli, LI Zhenwei

2024, 0(05): 22-26. doi:10.3969/j.issn.1006-2475.2024.05.005

Abstract: Epilepsy is one of the most common neurological diseases, and accurate seizure detection is crucial for treatment. To improve the accuracy of automatic identification and diagnosis of epileptic EEG signals, we design an edge-centered method for constructing complex networks. First, the Z-score of each series is calculated, and the edge time series are constructed by a dot product operation. Second, the Pearson correlation coefficient is calculated to construct the edge matrix. Finally, feature parameters are obtained through network analysis, and three classifiers (SVM, K-NN, and LR) are selected for comparative classification. The experimental results show that classification based on edge-centered network feature extraction performs well. Among the classifiers, LR gives the best classification of non-ictal versus ictal epilepsy, with an accuracy of 99.30%. The results indicate that the proposed method can effectively extract feature information and provides new ideas for clinical early warning of epilepsy.
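
The first two construction steps described in this abstract can be sketched in plain Python. This is a minimal illustration rather than the authors' implementation: it assumes that the "dot product operation" is the element-wise product of two z-scored channels, and the function names and toy signals are hypothetical.

```python
import math
from itertools import combinations


def zscore(series):
    """Standardize a series to zero mean and unit (population) variance."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    return [(x - mean) / std for x in series]


def edge_time_series(channels):
    """Assumed construction: element-wise product of each pair of z-scored channels."""
    z = [zscore(c) for c in channels]
    return {
        (i, j): [a * b for a, b in zip(z[i], z[j])]
        for i, j in combinations(range(len(z)), 2)
    }


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def edge_matrix(edges):
    """Correlate every pair of edge time series to obtain an edge-by-edge matrix."""
    keys = sorted(edges)
    return keys, [[pearson(edges[a], edges[b]) for b in keys] for a in keys]


# Toy three-channel signal (hypothetical values, not EEG data).
channels = [
    [1.0, 2.0, 4.0, 3.0, 5.0, 2.0],
    [2.0, 1.0, 3.0, 5.0, 4.0, 1.0],
    [5.0, 3.0, 1.0, 2.0, 2.0, 4.0],
]
edges = edge_time_series(channels)
keys, matrix = edge_matrix(edges)
```

Under this assumption, the time average of an edge time series equals the Pearson correlation of the two underlying channels, which is why the edge series can be read as a time-resolved view of ordinary pairwise correlation.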

Multi-layer Bank-enterprise Converged Network Based on Graph Neural Network

LI Shan, WANG Linna, GAO Dingjia, XUAN Haibo

2024, 0(05): 27-32. doi:10.3969/j.issn.1006-2475.2024.05.006

Abstract: Potential systemic risk in the financial industry is difficult to identify accurately. Based on loan data from the direct systemic risk contagion channel and internet text information from the indirect channel, a multi-layer bank-enterprise network is constructed, and a multi-layer bank-enterprise network convergence model is designed using graph convolutional neural networks (GCN). Based on the converged network, this paper quantitatively evaluates the systemic risk contagion process among 29 banks and 75 real estate institutions. The converged network analysis shows that the systemic risk transmission capacity under the joint impact of the multi-layer bank-enterprise network is significantly greater than that of a single-layer or two-layer network, and that the systemic risk of the inter-enterprise network based on the indirect channel is more pronounced. Prudential financial supervision should pay more attention to the ability of data analysis, deep learning, and other technologies to integrate big-data financial resources and effectively improve the level of risk monitoring and early warning.

Prediction Method of Mountain Flood Disaster Based on AFSPSO-ν-SVM

CAO Ning1, XU Genqi2, ZHANG Wen3, XU Youwen1, HE Panqing1

2024, 0(05): 33-37. doi:10.3969/j.issn.1006-2475.2024.05.007

Abstract: With the development of science and technology, human engineering activities in mountainous areas are becoming increasingly frequent, which increases the frequency of flash floods. Accurate and timely prediction of the possibility of mountain flood disasters is of great significance for ensuring engineering safety, reducing economic losses, and improving personnel safety and prevention capabilities. The application of artificial intelligence algorithms to mountain flood prediction has become a focus of current research. This paper addresses three problems: insufficient prediction accuracy caused by differences in sensitivity among the triggering factors of mountain floods, suboptimal model fitting caused by small sample sizes, and the difficulty of determining nonlinear model parameters. Principal component analysis and the ν support vector machine are combined to predict flash floods, the artificial fish swarm algorithm is used to expand the search range and speed of particles in the particle swarm algorithm, and the improved particle swarm algorithm is used to optimize the support vector machine parameters. On this basis, the AFSPSO-ν-SVM probability prediction model for mountain flood disasters is established. In experiments, the proposed model is compared with BL models, the ν-SVM model, and the PSO-ν-SVM model. The results show that the proposed model has the smallest error and the fastest speed. The paper provides a new approach for research on flash flood forecasting and warning.

Knowledge Concept Recommendation Based on Meta-path and Attentional Feature Fusion

LIU Yumeng, ZI Lingling, CONG Xin

2024, 0(05): 38-45. doi:10.3969/j.issn.1006-2475.2024.05.008

Abstract: In course recommendation research, most effort has focused on recommending courses or video resources, and only a few studies have paid attention to users' interest in or need for specific knowledge concepts. Existing research focuses primarily on homogeneous graphs and is vulnerable to the sparsity of user-item relationships. To cope with the sparsity problem and fully exploit the characteristics of MOOC datasets, which contain multiple entities and rich semantic information in their context relationships, a knowledge concept recommendation algorithm based on meta-paths and attentional feature fusion is proposed. First, the content features of each entity and the context features between entities are extracted. The adjacency matrices based on selected meta-paths are then fed into a graph convolutional network, and the representations of users and concepts are learned under the guidance of an attention mechanism with a two-layer network structure that integrates the meta-path feature vectors with the latent feature vectors of users and concepts. Finally, the learned user and concept representations are incorporated into an extended matrix factorization framework to predict each user's preference for concepts. Experimental results on the MOOCCube dataset demonstrate that the algorithm attains a better hit rate, normalized discounted cumulative gain, and mean reciprocal rank than the BPR, FISM, NAIS, Metapath2vec, and MOOCIR algorithms. The algorithm improves the interpretability and prediction accuracy of the recommendation process to a certain extent and alleviates the sparsity of user-item relationships.

Trajectory Interest Points Mining Based on Label Propagation and Privacy Protection

YUAN Hongwei1, CHANG Lijun1, HAO Jianhuan2, FAN Na2, WANG Chao2, LUO Chuang2, ZHANG Zehui2

2024, 0(05): 46-54. doi:10.3969/j.issn.1006-2475.2024.05.009

Abstract: With the popularization of global positioning systems and mobile data collection devices, a large amount of trajectory data has been generated. Mining the latent information in trajectory data has important practical significance, but the mining process carries a risk of leaking private information. We therefore propose a trajectory interest point mining and data privacy protection mechanism based on label propagation. The mechanism preprocesses the original trajectory dataset, performs density-based initial clustering, and then uses an improved label propagation algorithm for clustering. The algorithm incorporates multi-dimensional information from the trajectory data into the mining process, improving data utilization and the accuracy of interest points. In addition, a differential privacy protection algorithm based on an improved exponential mechanism is proposed, which effectively protects users' private information from being leaked. Comparative experiments show that the proposed method performs better than existing methods and effectively addresses the leakage of user privacy information.
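
For readers unfamiliar with the exponential mechanism mentioned in this abstract, the sketch below shows the standard, unimproved form: a candidate r is selected with probability proportional to exp(ε·u(r) / (2Δu)), where u is a utility score and Δu its sensitivity. The abstract does not describe the authors' improvement, so none is attempted here; the candidate names, scores, and function names are hypothetical.

```python
import math
import random


def exponential_mechanism_probs(scores, epsilon, sensitivity=1.0):
    """Selection probabilities proportional to exp(epsilon * score / (2 * sensitivity))."""
    top = max(scores)  # subtracting the maximum avoids overflow and leaves the ratios unchanged
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def exponential_mechanism(candidates, scores, epsilon, sensitivity=1.0, rng=random):
    """Draw one candidate at random according to the exponential mechanism."""
    probs = exponential_mechanism_probs(scores, epsilon, sensitivity)
    return rng.choices(candidates, weights=probs, k=1)[0]


# Hypothetical interest-point candidates with hypothetical utility scores.
candidates = ["cluster_a", "cluster_b", "cluster_c"]
scores = [10.0, 8.0, 2.0]
probs = exponential_mechanism_probs(scores, epsilon=1.0)
choice = exponential_mechanism(candidates, scores, epsilon=1.0, rng=random.Random(0))
```

A smaller ε flattens the probabilities (more privacy, less utility), while a larger ε concentrates them on the highest-scoring candidate.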

Cost-sensitive Convolutional Neural Network for Encrypted Traffic Classification

ZHONG Hailong1, 2, HE Yueshun1, HE Linlin1, CHEN Jie1, TIAN Ming3, ZHENG Ruiyin4

2024, 0(05): 55-60. doi:10.3969/j.issn.1006-2475.2024.05.010

Abstract: This paper addresses classification bias and low recognition rates for minority classes in encrypted traffic classification arising from imbalanced data. Traditional convolutional neural networks tend to favor the majority class in such scenarios, which motivates a dynamic weight adjustment strategy. In this approach, sample weights are adaptively adjusted during each training iteration based on feedback from the cost-sensitive layer. If a minority class sample is misclassified, its weight increases, pushing the model to focus on such samples in later training. This strategy continually refines the model's predictions, improving minority class recognition and mitigating class imbalance. To prevent overfitting, an early stopping strategy halts training when validation performance deteriorates over consecutive evaluations. Experiments show that the proposed model handles class imbalance in encrypted traffic classification well, achieving accuracy and F1 scores above 0.97. The study offers a potential solution for encrypted traffic classification under class imbalance and contributes useful insights to network security.
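
The weight adjustment idea in this abstract can be illustrated with a small sketch. The abstract does not give the update rule, so the multiplicative `boost` factor, the renormalization step, and all names below are assumptions chosen for illustration only.

```python
def update_sample_weights(weights, labels, predictions, minority_classes, boost=1.5):
    """Scale up misclassified minority-class samples, then renormalize to a mean weight of 1.

    The multiplicative rule and the boost value are assumptions, not the paper's rule.
    """
    updated = [
        w * boost if (y in minority_classes and y != p) else w
        for w, y, p in zip(weights, labels, predictions)
    ]
    scale = len(updated) / sum(updated)
    return [w * scale for w in updated]


def weighted_error(weights, labels, predictions):
    """Share of the total sample weight that falls on misclassified samples."""
    wrong = sum(w for w, y, p in zip(weights, labels, predictions) if y != p)
    return wrong / sum(weights)


# Toy batch: class 1 is the minority class; sample 4 is a misclassified minority sample.
labels = [0, 0, 0, 0, 1, 1]
predictions = [0, 0, 0, 1, 0, 1]
weights = [1.0] * 6
new_weights = update_sample_weights(weights, labels, predictions, minority_classes={1})
```

After the update, the misclassified minority sample carries more weight than any other sample, so a weighted loss in the next iteration penalizes that mistake more heavily.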

A General Platform for Energy Saving and Consumption Reduction of Server-class Environmental Resources

WANG Jia1, ZHANG Yunlong1, JU Weigang1, ZHOU Zhipeng2, MI Chuanmin2

2024, 0(05): 61-68. doi:10.3969/j.issn.1006-2475.2024.05.011

Abstract: Energy consumption in the communication field comes mainly from the power consumption of servers. With the implementation of the national "carbon peaking" and "carbon neutrality" development concept at the strategic level, research on energy-saving and consumption-reducing technologies for server-class environmental resources has important industry-leading value. To address problems in the continuous energy-saving process such as large manpower input, high costs, low efficiency, and the inability to shut down resources during idle time, a new idea of hierarchical management of server-class environmental resources is proposed from the perspective of environmental drive and scenario triggering. On this basis, an energy-saving and consumption-reducing platform framework is designed and developed. It controls the power consumption of server-class environmental resources through technical means, configures energy-saving strategies with one click, provides a strong energy-saving engine, improves energy-saving efficiency, and saves costs. The platform has been applied and promoted in the field of environmental resource improvement and has achieved good results.

Research and Application of Key Technologies in Meteorological Service Middle Platform

FENG Xian1, 2, FANG Kun1, QU Youming1, LIU Xiaobo1, SHI Jiachi1, WEN Liheng1

2024, 0(05): 69-74. doi:10.3969/j.issn.1006-2475.2024.05.012

Abstract: With the continuous growth of meteorological data and the expansion of application scenarios, traditional data processing models struggle to meet the needs of diverse industries and integrated services. To handle the massive data, complex processing, diversified demands, and strict response-time requirements of meteorological services, we developed the Hunan meteorological service middle platform on a distributed architecture and introduced key technologies to support high-concurrency services. These include adopting standardized processes to unify the processing of multi-source heterogeneous data, developing microservice parallel processing modules to improve data processing efficiency, designing dynamic load balancing algorithms to enhance concurrency, and ensuring operational stability through flow control mechanisms. Test results show that, with these technologies and limited basic resources, the platform supports 5000 concurrent accesses with an average response time of 1202 ms. It has been applied effectively in supporting cross-industry, multi-scenario meteorological services such as emergency management, water conservancy, and natural resources.

Segmentation and Reconstruction of Left Atrial Fibrosis Based on MR

JIA Ziyu1, HUANG Huan1, HU Chun’ai2, DOU Lina2

2024, 0(05): 75-79. doi:10.3969/j.issn.1006-2475.2024.05.013

Abstract: The current mainstream segmentation method for left atrial fibrosis is to manually delineate the atrial wall area first and then use a threshold method to extract the fibrotic part within it. This requires the operator to have professional background knowledge and involves a large workload, and the threshold method has difficulty accurately segmenting fibrotic areas of different severity at the same time. To address these issues, this paper proposes a new method for segmenting fibrotic regions in MR images. First, the Laplacian sharpening algorithm is used to improve the contrast of the fibrotic area, while the kernel correlation filtering algorithm is used to track the target area and remove tissue outside the atrium. Second, the segmentation results of the region growing method, the active contour method, and a Hessian-matrix-based segmentation algorithm on fibrotic regions are compared, and the most effective method is selected. Finally, the 3D point cloud data of the fibrotic region are reconstructed and rendered. Experimental results show that the method does not require manual image-by-image segmentation of the atrial wall region and that the segmentation results are highly accurate, which can better assist doctors in diagnosing related diseases.

Non-destructive Detection of Total Acid Content in Pear Based on Visible-near Infrared Spectroscopy

LUO Shuhuan, SUN Wu, YOU Jie, WANG Wei, HU Biwei, JIANG Nan

2024, 0(05): 80-84. doi:10.3969/j.issn.1006-2475.2024.05.014

Abstract: The pear is one of the most popular fruits, and its total acid content has a great influence on its taste and quality, so non-destructive assessment of total acid content in pears shows promising prospects. In this study, the near-infrared spectral data of 240 mature pear samples from northern Jiangxi were collected; 180 randomly chosen samples were used as the calibration set and 60 unknown samples as the prediction set. After eliminating noise at the beginning and end of the spectrum, the analysis used 1401 wavelength points in the range of 400~1800 nm. The original spectral data were preprocessed with the SG smoothing method and the baseline offset correction method, and a Partial Least Squares Regression model determined that SG smoothing gave the most significant preprocessing effect on the original spectra. Competitive adaptive reweighted sampling (CARS) and the successive projections algorithm (SPA) were used to extract characteristic wavelengths, and these were combined with Partial Least Squares Regression and Least Squares Support Vector Machine (LS-SVM) analysis to establish prediction models of total acid content. Among these, the CARS+LS-SVM model gave the best prediction of pear total acid content, with an R2p value of 0.901 and an RPD value of 2.911. The research shows that visible-near infrared spectroscopy, combined with the CARS+LS-SVM prediction model, enables quantitative detection of the total acid content of pears.

An Improved YOLOv5-based Method for Dense Pedestrian Detection Under Complex Road Conditions

SUN Ruiqi1, DOU Xiuchao2, LI Zhihua1, JIANG Xuemei2, SUN Yuhao1

2024, 0(05): 85-91. doi:10.3969/j.issn.1006-2475.2024.05.015

Abstract: To address the low accuracy of pedestrian detection in complex street scenes, a new network, YOLO-BEN, is proposed as an improvement on the YOLOv5 network. The network integrates the hierarchical residual connection module Res2Net with the C3 module, enhancing fine-grained multi-scale feature representation. The Bi-level routing attention module is adopted to construct and prune a region-level directed graph and to apply fine-grained attention within the union of the routed regions, giving the network dynamic, query-aware sparsity and improving feature extraction from blurred images. The EVC module is incorporated to preserve local corner-region information and compensate for the information loss caused by occluded pedestrians. The NWD metric and the original IoU metric are combined into a joint loss function, and a small-target detection head is added to improve long-distance pedestrian detection. In experiments, the method achieves good results on a self-made dataset and part of the WiderPerson dataset. Compared with the original network, the accuracy, recall, and average accuracy of the improved network increase by 2.8, 4.3, and 3.9 percentage points, respectively.
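
To make the joint NWD and IoU loss concrete, the sketch below uses the commonly cited normalized Wasserstein distance for boxes given as (cx, cy, w, h): each box is modeled as a 2D Gaussian, and NWD = exp(-W2 / C), where W2 is the 2-Wasserstein distance between the Gaussians. The constant `c`, the mixing weight `alpha`, and the form of the combination are assumptions; the abstract does not state how the authors combine the two terms.

```python
import math


def corners(box):
    """Convert (cx, cy, w, h) to (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2


def iou(box_a, box_b):
    """Intersection over union of two (cx, cy, w, h) boxes."""
    ax1, ay1, ax2, ay2 = corners(box_a)
    bx1, by1, bx2, by2 = corners(box_b)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein distance; c is a dataset-dependent constant (assumed value)."""
    dist = math.sqrt(
        (box_a[0] - box_b[0]) ** 2
        + (box_a[1] - box_b[1]) ** 2
        + ((box_a[2] - box_b[2]) / 2) ** 2
        + ((box_a[3] - box_b[3]) / 2) ** 2
    )
    return math.exp(-dist / c)


def joint_loss(pred, target, alpha=0.5, c=12.8):
    """Assumed combination: a weighted sum of the IoU loss and the NWD loss."""
    return alpha * (1.0 - iou(pred, target)) + (1.0 - alpha) * (1.0 - nwd(pred, target, c))


# Two small, non-overlapping boxes: IoU is zero, but NWD still reflects how close they are.
box_a = (10.0, 10.0, 4.0, 4.0)
box_b = (16.0, 10.0, 4.0, 4.0)
loss_far = joint_loss(box_a, box_b)
loss_same = joint_loss(box_a, box_a)
```

The example shows why NWD helps with small, distant pedestrians: when boxes do not overlap, IoU gives no signal about how far apart they are, while NWD still decreases smoothly with distance.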

Multi-view Reconstruction with Local Self-attention and Deep Optimization

YE Senhui, WANG Lei

2024, 0(05): 92-98. doi:10.3969/j.issn.1006-2475.2024.05.016

Abstract: To address high memory and time consumption and the low completeness and fidelity of high-resolution reconstruction in multi-view 3D reconstruction, we propose a deep learning-based multi-view reconstruction network. The network consists of a feature extraction module, a cascaded Patchmatch module, and a depth map optimization module. First, we design a U-shaped feature extraction module to extract multi-stage feature maps and introduce local self-attention layers with relative position encoding at each stage, which capture local details and global context in the images and enhance the network's feature extraction. Second, we design a deep residual network that fuses the features and uses color image prior knowledge to constrain the depth map, improving the accuracy of depth estimation. We test the network on the public DTU (Technical University of Denmark) dataset, and the experimental results show a significant improvement in 3D reconstruction quality. Compared with PatchmatchNet, our network improves completeness by 6.1% and the overall score by 2.5%. Compared with other state-of-the-art (SOTA) methods, it also achieves better completeness and overall scores.

A Moving Object Detection Algorithm Aiming at Jittery Drone Videos

LIU Yaoxin1, CHEN Renxi2, YANG Weihong1

2024, 0(05): 99-103. doi:10.3969/j.issn.1006-2475.2024.05.017

Abstract: Moving object detection for hovering drones is susceptible to jitter, which generates a significant amount of background noise and lowers accuracy. To solve this problem, a multiscale EA-KDE (MEA-KDE) background difference algorithm is proposed. The algorithm first performs a multiscale decomposition of the image sequence to obtain a multiscale image sequence. Before detection, the segmentation threshold is calculated and updated from the area threshold and the current image frame, thereby incorporating information from the current frame. Background difference operations with high and low dual segmentation thresholds are then performed on the images at each scale to make detection more robust. Finally, a top-down fusion strategy merges the detection results from the various scales, preserving clear target contours while eliminating noise. In addition, a boundary expansion fusion post-processing algorithm is proposed to alleviate the fragmentation of targets caused by detection breaks. Experimental results demonstrate that the proposed algorithm effectively suppresses background noise caused by jitter. On two real drone datasets, it obtains average F1 scores of 0.951 and 0.952, improvements of 0.144 and 0.276, respectively, over the original algorithm.

3D Visualization Monitoring System of Aluminum Reduction Cell Based on Digital Twin

ZHANG Gaoyi1, XU Yang1, 2, CAO Bin1, 3, LI Yifei3

2024, 0(05): 104-109. doi:10.3969/j.issn.1006-2475.2024.05.018

Abstract: Traditional management of aluminum electrolytic cells has problems such as a single management mode, low transparency, and poor presentation of parameter data. To solve these problems, digital twin technology is introduced and applied to the aluminum electrolytic cell. Based on the theoretical model and framework of the digital twin, a six-dimensional model of a three-dimensional visual monitoring system for the digital twin aluminum electrolytic cell is proposed. Based on this model, the virtual model construction, scene optimization, data acquisition, and data mapping of the electrolytic cell are carried out. The data interface is provided by a Java backend, and the model and data are rendered using the three.js 3D library and JavaScript. The system gives field personnel a more intuitive display for understanding the operation of the aluminum electrolytic cell and provides useful ideas for the intelligent development of the aluminum industry.

An Attention Mechanism-based U-Net Fundus Image Segmentation Algorithm

ZHANG Zixu, LI Jiaying, LUAN Pengpeng, PENG Yuanyuan

2024, 0(05): 110-114. doi:10.3969/j.issn.1006-2475.2024.05.019

Abstract: The radius and width of retinal fundus vessels are important indicators for assessing eye diseases, so accurate segmentation of fundus images is increasingly important. To effectively assist doctors in diagnosing eye diseases, this paper proposes a new neural network for segmenting fundus vascular images. The basic idea is to reduce information loss by improving the traditional U-Net model with an attention fusion mechanism: a Transformer is used to construct a channel attention mechanism and a spatial attention mechanism, and the information obtained by the two is fused. In addition, because the number of retinal fundus images is relatively small and the network has relatively many coefficients, it is prone to overfitting during training, so a DropBlock layer is introduced to mitigate this problem. The proposed method was validated on the publicly available DRIVE dataset and compared with several state-of-the-art methods. The results show that our method achieved the highest ACC value of 0.967 and the highest F1 value of 0.787. These experimental results demonstrate that the proposed method is effective in segmenting retinal fundus images.

Placenta Ultrasound Image Segmentation

XU Cheng1, ZHANG Yun2, ZENG Xiangjin1

2024, 0(05): 115-119. doi:10.3969/j.issn.1006-2475.2024.05.020

Abstract: The shape and size of the placenta in early pregnancy are closely related to clinical outcomes such as fetal growth. Because interactive segmentation for measuring placental size in three-dimensional ultrasound (3DUS) is time-consuming, a new deep learning segmentation network, DEC-U-Net, is designed based on the U-Net architecture. In the U-Net downsampling stage, deep hyperparametric convolution is used instead of 2D convolution and combined with the ECA attention mechanism. This improves the recognition accuracy of detailed placental features, at the cost of introducing more parameters. A cross attention mechanism is introduced into the skip connections to address blurred placental boundaries and uneven contrast. Compared with the ordinary U-Net, the proposed algorithm improves the intersection over union (IoU), recall, precision, and Dice coefficient by 4.14, 9.59, 6.2, and 16.41 percentage points, respectively. The experimental results show that the improved network model segments well and can accurately segment the placenta in ultrasound images.
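
The four metrics reported in this abstract have standard definitions in terms of true positives (TP), false positives (FP), and false negatives (FN) over the pixels of a binary mask. The sketch below computes them for a toy mask; the function name and the example masks are hypothetical, and real evaluation would run over full image masks.

```python
def segmentation_metrics(pred, truth):
    """Compute IoU, recall, precision, and Dice for flat binary masks of 0/1 values."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)

    def ratio(num, den):
        # Return 0.0 when a metric is undefined (for example, two empty masks).
        return num / den if den else 0.0

    return {
        "iou": ratio(tp, tp + fp + fn),
        "recall": ratio(tp, tp + fn),
        "precision": ratio(tp, tp + fp),
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Toy flattened masks: 3 true positives, 1 false positive, 1 false negative.
pred = [1, 1, 1, 0, 0, 0, 1, 0]
truth = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = segmentation_metrics(pred, truth)
```

For a single pair of masks, Dice = 2·IoU / (1 + IoU), so Dice is never smaller than IoU. This identity does not carry over to scores averaged across many images.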

Recognition of Hypopigmented Skin Diseases Based on Improved MobileNetV3-Small

GAO Geng1, XIAO Fengli2, YANG Fei1

2024, 0(05): 120-126. doi:10.3969/j.issn.1006-2475.2024.05.021

Abstract: In traditional diagnosis of hypopigmented skin disease, reliance on the subjective clinical experience of dermatologists makes it challenging to ensure timely and accurate diagnoses for every patient. There is therefore a pressing need for a rapid, experience-independent diagnostic approach. Convolutional neural networks (CNNs) exhibit robust feature recognition capabilities and offer a potential solution. Currently, CNN-based diagnostic methods mainly rely on deeper models such as ResNet50. While achieving high accuracy, these models suffer from drawbacks such as large parameter sizes, slow inference, and limited usability on mobile devices. To address these issues, this study introduces a novel lightweight CNN model based on MobileNetV3-Small. First, it removes the computationally complex Squeeze-and-Excitation (SE) modules found in MobileNetV3-Small and replaces them with the more lightweight Efficient Channel Attention (ECA) mechanism. Second, it employs the convenient and stable Leaky-ReLU activation function. Lastly, it introduces dilated convolutions in the convolutional layers to expand the receptive field. Experimental results indicate that the proposed model significantly reduces parameter size, recognition time, and FLOPs compared to existing diagnostic models. It meets the high usability demands of mobile applications while still outperforming those models in accuracy and F1 score. Based on the proposed model, a mobile application for clinical diagnosis of hypopigmented skin disease has been developed.
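
Two of the building blocks named in this abstract can be illustrated with a few lines of arithmetic. The sketch below shows the Leaky-ReLU function and how dilation enlarges the span of a convolution kernel without adding weights: a kernel of size k with dilation d covers d·(k − 1) + 1 input positions. The slope value, layer configurations, and function names are assumptions for illustration, not the settings used in the paper.

```python
def leaky_relu(x, negative_slope=0.01):
    """Leaky ReLU: identity for positive inputs, a small slope for negative ones."""
    return x if x > 0 else negative_slope * x


def effective_kernel_size(kernel_size, dilation):
    """Span of a dilated kernel along one axis: d * (k - 1) + 1."""
    return dilation * (kernel_size - 1) + 1


def stacked_receptive_field(layers):
    """Receptive field of stacked stride-1 convolutions given as (kernel_size, dilation) pairs."""
    field = 1
    for kernel_size, dilation in layers:
        field += effective_kernel_size(kernel_size, dilation) - 1
    return field


# Two plain 3x3 layers versus one plain and one dilated (d = 2) 3x3 layer.
plain_field = stacked_receptive_field([(3, 1), (3, 1)])
dilated_field = stacked_receptive_field([(3, 1), (3, 2)])
```

With the same number of weights, the dilated stack sees a 7-pixel span per axis instead of 5, which is the sense in which dilation "expands the receptive field".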
