Computer and Modernization

A Review of Deep Neural Networks Combined with Attention Mechanism

HUANGFU Xiao-ying, QIAN Hui-min, HUANG Min

Computer and Modernization 2023, 0 (02): 40-49.

Abstract （1161）

Attention mechanisms have become a research hotspot for improving the learning ability of deep neural networks. Given this wide interest, this paper gives a comprehensive analysis of attention mechanisms in deep neural networks from three aspects: the classification of attention mechanisms, the ways they are combined with deep neural networks, and their specific applications in natural language processing and computer vision. Specifically, attention mechanisms are divided into soft attention, hard attention, and self-attention, and their advantages and disadvantages are compared. Then the common ways of incorporating attention into recurrent neural networks and convolutional neural networks are discussed in turn, and representative model structures for each are given. After that, applications of attention mechanisms in natural language processing and computer vision are illustrated. Finally, several future developments of attention mechanisms are outlined, in the hope of providing clues and directions for subsequent research.
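
As a rough illustration of the soft attention and self-attention categories named in this abstract, the sketch below computes scaled dot-product attention in plain Python. It is an assumed, minimal formulation (one query at a time, no learned projections) and not a model taken from the reviewed paper; all function names are hypothetical. Hard attention would instead select a single position rather than averaging over all of them.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(query, keys, values):
    """Return (context, weights) for one query vector.

    Scores are scaled dot products between the query and each key.
    The context is the weighted average of the value vectors, so
    every position contributes (this is what makes it "soft").
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return context, weights


def self_attention(tokens):
    """Self-attention sketch: each token serves as query, key, and value."""
    return [soft_attention(t, tokens, tokens)[0] for t in tokens]
```

Because the weights come from a softmax, they are non-negative and sum to one, which keeps the whole operation differentiable.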

A Text Classification Model Based on BERT and Pooling Operation

ZHANG Jun, QIU Long-long

Computer and Modernization 2022, 0 (06): 1-7.

Abstract （918）

Fine-tuning pre-trained language models has achieved good results in many natural language processing tasks such as text classification, with the Transformer-based BERT model as a typical representative. However, BERT directly uses the vector corresponding to [CLS] as the text representation and does not consider the local and global features of the text, which limits classification performance. This paper therefore proposes a text classification model that introduces a pooling operation, using pooling methods such as average pooling, max pooling, and K-MaxPooling to extract the text representation vector from BERT's output matrix. Experimental results show that the proposed model outperforms the original BERT model: in all text classification tasks in the experiment, its accuracy and F1-score are higher than those of BERT.
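
The three pooling operations mentioned in this abstract can be sketched on a toy output matrix, with one row per token and one column per hidden dimension. This is a minimal plain-Python illustration under the assumption that pooling runs over the token axis; the function names are hypothetical and the paper's actual architecture may differ.

```python
def mean_pool(tokens):
    """Average each hidden dimension over all tokens."""
    n = len(tokens)
    return [sum(col) / n for col in zip(*tokens)]


def max_pool(tokens):
    """Take the maximum of each hidden dimension over all tokens."""
    return [max(col) for col in zip(*tokens)]


def k_max_pool(tokens, k):
    """Keep the k largest values of each hidden dimension.

    The kept values stay in their original token order and the
    per-dimension results are concatenated into one flat vector.
    """
    pooled = []
    for col in zip(*tokens):
        top = sorted(range(len(col)), key=lambda i: col[i], reverse=True)[:k]
        pooled.extend(col[i] for i in sorted(top))
    return pooled


# Toy "output matrix": 3 tokens, hidden size 2.
tokens = [[1.0, 4.0], [3.0, 2.0], [2.0, 6.0]]
print(mean_pool(tokens))
print(max_pool(tokens))
print(k_max_pool(tokens, 2))
```

Unlike taking only the [CLS] row, each of these summaries draws on every token position.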

Stock Movement Prediction Algorithm Based on Deep Learning

ZHOU Run-jia

Computer and Modernization 2023, 0 (01): 69-73.

Abstract （834）

To improve the accuracy of stock movement prediction, this paper proposes AACL (Adversarial Attentive CNN-LSTM), a stock movement prediction algorithm that uses CNN and LSTM for feature extraction and combines an attention mechanism with adversarial training. The algorithm uses CNN to extract the overall trend information of a stock and LSTM to extract its short-term fluctuation information, and connects multiple stocks through the attention mechanism to capture the rise-and-fall relationships between them. It also introduces adversarial training, which improves robustness by perturbing the data. To verify the effectiveness of AACL, experiments are carried out on three datasets, KDD17, ACL18, and China50, and compared with existing algorithms. Experimental results show that the proposed algorithm achieves the best results.

Road Pothole Detection Algorithm Based on Improved YOLOv5s

BAI Rui, XU Yang, WANG Bin, ZHANG Wen-wen

Computer and Modernization 2023, 0 (06): 69-75. DOI: 10.3969/j.issn.1006-2475.2023.06.012

Abstract （760）

To address the problems that existing object detection algorithms struggle to detect road potholes accurately and are slow, a road pothole detection algorithm based on improved YOLOv5s is proposed. First, the CA (Coordinate Attention) module is integrated into the YOLOv5s backbone network so that the model captures not only cross-channel information but also direction-aware and position-sensitive information, which helps it locate and identify objects more accurately. Then, SoftPool is adopted in the Spatial Pyramid Pooling (SPP) module to improve on the max pooling operation and retain more detailed feature information. In the feature fusion stage, Content-Aware ReAssembly of FEatures (CARAFE) is used to improve the up-sampling in multi-scale feature fusion; it dynamically generates adaptive kernels that gather context information over a large receptive field. Finally, Alpha-IoU is used to improve the loss function and the bounding box regression accuracy. Experimental results show that the average accuracy of the improved YOLOv5s algorithm is 4.6 percentage points higher than that of the original network, and that its detection accuracy is greatly improved compared with other mainstream algorithms such as SSD, Faster R-CNN, YOLOv3, YOLOv3-tiny and YOLOv4-tiny.
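
To make the Alpha-IoU idea concrete, the sketch below uses the basic power form of the loss, 1 - IoU^alpha, for axis-aligned boxes. It is only an illustration: the abstract does not say which Alpha-IoU variant or which alpha value the authors use, so the default `alpha=3.0` and the function names are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def alpha_iou_loss(pred, target, alpha=3.0):
    """Basic power-IoU loss: 1 - IoU**alpha (alpha is an assumed default)."""
    return 1.0 - iou(pred, target) ** alpha


# A perfect prediction gives zero loss; a disjoint one gives the maximum, 1.
print(alpha_iou_loss((0, 0, 2, 2), (0, 0, 2, 2)))
print(alpha_iou_loss((0, 0, 2, 2), (5, 5, 7, 7)))
```

With alpha = 1 this reduces to the ordinary IoU loss, which is why alpha is exposed as a parameter here.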

Survey of Model Pruning Algorithms

LI Yi, WEI Jian-guo, LIU Guan-wei

Computer and Modernization 2022, 0 (09): 51-59.

Abstract （714）

Model pruning algorithms apply different criteria or methods to prune redundant neurons in a deep neural network, compressing the model as far as possible without losing accuracy so as to reduce storage and improve speed. First, the research status and main research directions of model pruning algorithms are summarized and classified. The main research areas include the granularity of pruning, methods for evaluating the importance of the elements to be pruned, the sparsity of pruning, the theoretical foundation of model pruning, and pruning for different tasks. Then, recent representative pruning algorithms are described in detail. Finally, future research directions in this field are put forward.

Review of Research on Human Behavior Detection Methods Based on Deep Learning

SHEN Jia-wei, LU Yi-ming, CHEN Xiao-yi, QIAN Mei-ling, LU Wei-zhong

Computer and Modernization 2023, 0 (09): 1-9. DOI: 10.3969/j.issn.1006-2475.2023.09.001

Abstract （689）

Human behavior recognition has long been a hot research topic in computer vision and video understanding and is widely used in areas such as intelligent video surveillance and human-computer interaction in smart homes. Traditional human behavior detection algorithms rely on too many data samples and are susceptible to environmental noise, whereas evolving deep learning techniques are gradually showing their advantages and can be a good solution to these problems. On this basis, this paper first introduces some commonly used behavior recognition datasets and analyzes the current research status of deep-learning-based human behavior recognition, then describes the basic process of behavior recognition and commonly used methods, and finally summarizes the performance and existing problems of the various methods and looks ahead to future development directions.

Character Network Analysis of Ordinary World

WANG Jun, HE Jin-rong, MA Le-rong

Computer and Modernization 2022, 0 (06): 32-36.

Abstract （636）

Constructing and quantitatively analyzing the relationship network of characters in literary works is an important part of the intelligent interpretation of literature. This article takes Lu Yao's novel "The Ordinary World" as the research object and uses complex network analysis methods to construct and analyze the social network in the work. First, the social network is extracted: the characters in the novel correspond to the nodes of the network, the relationships between characters correspond to the edges, and the number of times characters appear together in each chapter corresponds to the edge weights. Then betweenness and clustering coefficient correlation, hierarchical clustering, and link prediction are analyzed on the constructed network. Experimental results show that the character relationship network of The Ordinary World is a heterogeneous network with small-world characteristics. This research helps to promote the analysis of character relationship networks in literary works.
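
The network construction described in this abstract (characters as nodes, co-appearance counts as edge weights) can be sketched in a few lines. The version below assumes that a pair's weight is the number of chapters in which both characters appear, which is one reading of the abstract; the character labels "A", "B", "C" are placeholders and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_network(chapters):
    """Build weighted edges from per-chapter character lists.

    chapters: list of iterables of character names, one per chapter.
    Returns a Counter mapping a sorted (a, b) pair to the number of
    chapters in which both characters appear (assumed weighting).
    """
    weights = Counter()
    for names in chapters:
        for a, b in combinations(sorted(set(names)), 2):
            weights[(a, b)] += 1
    return weights


def weighted_degree(weights):
    """Sum of incident edge weights for every node."""
    degree = Counter()
    for (a, b), w in weights.items():
        degree[a] += w
        degree[b] += w
    return degree


chapters = [["A", "B"], ["A", "B", "C"], ["C"]]
edges = build_cooccurrence_network(chapters)
print(dict(edges))
print(dict(weighted_degree(edges)))
```

Measures such as betweenness or the clustering coefficient would then be computed on this weighted edge list, typically with a graph library.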

An Improved Whale Optimization Algorithm Based on Hybrid Strategy

LI Ru, FAN Bing-bing

Computer and Modernization 2022, 0 (06): 13-20.

Abstract （628）

To address the slow convergence, weak global search ability, low solution accuracy, and tendency to fall into local optima of the original whale optimization algorithm (WOA), an improved whale optimization algorithm based on a hybrid strategy (LGWOA) is proposed. First, the Levy flight strategy is introduced into the position update formula of the whale's random search; Levy flight increases the global search step, enlarges the search space, and improves global search capability. Second, an adaptive weight is introduced into the whale's spiral upward position update formula to improve the algorithm's local search ability and optimization accuracy. Finally, the crossover and mutation ideas of the genetic algorithm are incorporated to balance global and local search, maintain population diversity, and avoid falling into local optima. Simulation experiments on 12 benchmark test functions in different dimensions show that the improved whale algorithm has faster convergence and higher optimization accuracy.
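
Levy flight steps are commonly generated with Mantegna's algorithm, which the sketch below implements with the standard library. How LGWOA actually folds the step into the random-search update is not given in the abstract, so `levy_random_search`, its `step_scale` parameter, and the choice `beta=1.5` are assumptions made purely for illustration.

```python
import math
import random


def mantegna_sigma(beta=1.5):
    """Standard deviation of the numerator Gaussian in Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (num / den) ** (1 / beta)


def levy_step(beta=1.5, rng=random):
    """Draw one heavy-tailed step: u / |v|**(1/beta)."""
    u = rng.gauss(0.0, mantegna_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def levy_random_search(position, random_whale, step_scale=0.01, beta=1.5, rng=random):
    """Hypothetical update: move toward or away from a random whale
    by a Levy-distributed fraction of the distance in each dimension."""
    return [
        x + step_scale * levy_step(beta, rng) * (r - x)
        for x, r in zip(position, random_whale)
    ]


rng = random.Random(0)  # seeded for repeatability
print(levy_random_search([0.0, 0.0], [1.0, 1.0], rng=rng))
```

Most steps are small, but the heavy tail occasionally produces a long jump, which is the property that enlarges the global search range.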

Feature-level Multimodal Fusion for Depression Recognition

GU Ming-xuan, FAN Bing-bing

Computer and Modernization 2023, 0 (10): 17-22. DOI: 10.3969/j.issn.1006-2475.2023.10.003

Abstract （626）

Depression is a common psychiatric disorder. However, existing diagnostic methods mainly rely on scales and interviews with psychiatrists, which are highly subjective. In recent years, researchers have worked on identifying depressed patients from EEG features or audio features, but no study has effectively combined EEG information with audio information, so the correlation between audio and EEG data has been ignored. This study therefore proposes a feature-level multimodal fusion model to improve the accuracy of depression recognition. We combine audio and EEG modality information using a fully connected neural network. Our experiments show that the accuracy of depression recognition with the feature-level multimodal fusion model on the MODMA dataset reaches 81.58%, which is higher than with a single modality and indicates that feature-level fusion improves recognition accuracy. Our research provides a new perspective and method for depression recognition.

Review of Relation Extraction Based on Pre-training Language Model

WANG Hao-chang, LIU Ru-yi

Computer and Modernization 2023, 0 (01): 49-57.

Abstract （614）

In recent years, with continuous innovation in deep learning, pre-trained models have been applied more and more widely in natural language processing, and relation extraction no longer depends purely on the traditional pipeline method. The development of pre-trained language models has greatly promoted research on relation extraction and has surpassed traditional methods in many fields. This paper first briefly introduces the development of relation extraction and classic pre-trained models; second, it summarizes the commonly used datasets and evaluation methods and analyzes model performance on each dataset; finally, it discusses the challenges facing relation extraction and future research trends.

Survey of Fruit Object Detection Algorithms in Computer Vision

LI Wei-qiang, WANG Dong, NING Zheng-tong, LU Ming-liang, QIN Peng-fei

Computer and Modernization 2022, 0 (06): 87-95.

Abstract （613）

Fruit target detection and recognition based on computer vision is an important interdisciplinary research topic spanning target detection, computer vision, and agricultural robotics. It has important theoretical significance and practical value in smart agriculture, agricultural modernization, and automatic picking robots. As deep learning has been widely used in image processing with good results, fruit detection and recognition algorithms that combine computer vision with deep learning have gradually become mainstream. This article introduces the tasks, difficulties, and development status of computer-vision-based fruit target detection and recognition, as well as two types of deep-learning-based fruit detection and recognition algorithms. Finally, it introduces the public datasets used for training the models and the evaluation metrics for model performance, and discusses current problems and possible future directions.

Flame Detection Algorithm Based on Improved YOLOV5

WANG Hong-yi, KONG Mei-mei, XU Rong-qing

Computer and Modernization 2023, 0 (01): 103-107.

Abstract （607）

To address the low average detection accuracy and high missed detection rate for small-target flames of existing flame detection algorithms, an improved YOLOV5 flame detection algorithm is proposed. The algorithm replaces the CSP bottleneck module at the end of the YOLOV5 backbone network with a Transformer Encoder module, which enhances the network's ability to capture different local information and improves the average accuracy of flame detection. In addition, the CBAM attention module is added to the YOLOV5 network, which enhances its ability to extract image features, in particular for small-target flames, and reduces their missed detection rate. In experiments on the public datasets BoWFire and Bilkent, the improved YOLOV5 network reaches an average flame detection accuracy of 83.9%, a small-target flame missed detection rate of only 1.6%, and a detection speed of 34 frames/s. Compared with the original YOLOV5 network, the average accuracy is improved by 2.4 percentage points and the small-target flame missed detection rate is reduced by 4.1 percentage points, so the improved network can meet the real-time and accuracy requirements of flame detection.

A Remote Sensing Image Change Detection Model Based on CNN-Transformer Hybrid Structure

XU Ye-tong, GENG Xin-zhe, ZHAO Wei-qiang, ZHANG Yue, NING Hai-long, LEI Tao

Computer and Modernization 2023, 0 (07): 79-85. DOI: 10.3969/j.issn.1006-2475.2023.07.014

Abstract （605）

The emergence of convolutional neural networks and the Transformer model has driven continuous progress in remote sensing image change detection, but both approaches still have shortcomings. On the one hand, a convolutional neural network cannot model the global information of remote sensing images because of the local receptive field of its convolution kernels. On the other hand, although the Transformer can capture global information, it cannot model the details of image changes well, and its computational complexity grows quadratically with image resolution. To solve these problems and obtain more robust change detection results, this paper proposes a CNN-Transformer Change Detection Network (CTCD-Net) based on a hybrid structure of convolutional neural network and Transformer. First, CTCD-Net connects a convolutional neural network and a Transformer in series within an encoder-decoder structure to effectively encode the local and global features of remote sensing images, improving the feature learning ability of the network. Second, a cross-channel Transformer self-attention module (CSA) and an attention feed-forward network (A-FFN) are proposed to effectively reduce the computational complexity of the Transformer. Extensive experiments on the LEVIR-CD and CDD datasets show that the detection accuracy of CTCD-Net is significantly better than that of other mainstream methods.

Research Review of Single-channel Speech Separation Technology Based on TasNet

LU Wei, ZHU Ding-ju

Computer and Modernization 2022, 0 (11): 119-126.

Abstract （584）

Speech separation is a fundamental task in acoustic signal processing with a wide range of applications. Thanks to the development of deep learning, the performance of single-channel speech separation systems has improved significantly in recent years. In particular, since the introduction of the time-domain audio separation network (TasNet), speech separation has been gradually shifting from traditional time-frequency-domain methods to time-domain methods. This paper reviews the research status and prospects of TasNet-based single-channel speech separation. After reviewing traditional time-frequency-domain methods, it focuses on the TasNet-based Conv-TasNet and DPRNN models and compares the research that improves on each. Finally, it sets out the limitations of current TasNet-based single-channel speech separation models and discusses future research directions in terms of models, datasets, number of speakers, and speech separation in complex scenarios.

Helmet-wearing Detection Based on Improved YOLOv5

YUE Heng, HUANG Xiao-ming, LIN Ming-hui, GAO Ming, LI Yang, CHEN Ling

Computer and Modernization 2022, 0 (06): 104-108.

Abstract （578）

To address the problem that YOLOv5 cannot use weights to focus on important features and so cannot produce more distinguishable features, which reduces the accuracy of helmet detection, an attention module is used; the squeeze-and-excitation layer and the efficient channel attention module are both studied. To address the problem that, when objects overlap heavily, the non-maximum suppression used by YOLOv5 to remove redundant results retains only the highest-confidence prediction box of each class, the Soft-NMS algorithm is used to keep more prediction boxes. Weighted non-maximum suppression is used to fuse the information of multiple prediction boxes and improve their accuracy. To address the information loss caused by down-sampling, the Focus module is used to improve detection, and the various modules are integrated to obtain the optimal FESW-YOLO algorithm. Compared with YOLOv5, the algorithm improves mAP@0.5 by 2.1 percentage points and mAP@0.5:0.95 by 1.2 percentage points on the helmet dataset, which improves the accuracy of safety helmet supervision.
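
The difference between hard NMS and Soft-NMS can be seen in a small sketch. The version below uses the Gaussian score decay form of Soft-NMS; whether the paper uses the Gaussian or the linear variant is not stated in the abstract, and the `sigma` and `score_threshold` defaults are assumed values chosen for illustration.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS.

    Instead of deleting boxes that overlap the current best box,
    decay their scores by exp(-iou**2 / sigma) and drop them only
    if the decayed score falls below score_threshold.
    Returns a list of (box_index, final_score) in selection order.
    """
    remaining = list(enumerate(scores))
    kept = []
    while remaining:
        best = max(range(len(remaining)), key=lambda j: remaining[j][1])
        idx, score = remaining.pop(best)
        kept.append((idx, score))
        decayed = []
        for j, s in remaining:
            s *= math.exp(-(iou(boxes[idx], boxes[j]) ** 2) / sigma)
            if s >= score_threshold:
                decayed.append((j, s))
        remaining = decayed
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(soft_nms(boxes, scores))
```

In this toy input the second box overlaps the first heavily; hard NMS with a typical 0.5 IoU threshold would discard it, whereas Soft-NMS keeps it with a reduced score.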

Driver Distracted Behavior Recognition Based on Deep Learning

HE Li-wen, ZHANG Rui-chi

Computer and Modernization 2022, 0 (06): 67-74.

Abstract （549）

Distracted driving behavior recognition is one of the main ways to improve driving safety. To address the low recognition accuracy for distracted driving behavior, this paper proposes a deep-learning-based driver distracted behavior recognition algorithm composed of a cascade of an object detection network and a precise behavior recognition network. Using the State Farm open dataset, the first level applies the object detection algorithm SSD (Single Shot MultiBox Detector) to extract local information from the original driver images and determine candidate regions for behavior recognition. The second level then uses transfer-learned VGG19, ResNet50, and MobileNetV2 models to accurately identify the behavior in the candidate regions. Finally, the experiment compares the recognition accuracy of the layered recognition architecture with that of a single-model architecture. Results show that, compared with mainstream single-model methods, the proposed cascade network improves driver behavior recognition accuracy by 4% to 7% overall. In addition, the proposed algorithm not only reduces the influence of noise and background regions on the model, improving recognition accuracy, but also effectively identifies more behavior categories and avoids misclassifying actions.

A Point Cloud Registration Algorithm Combining Improved PSO Algorithm and TrICP Algorithm

LIANG Zheng-you, WANG Lu, LI Xuan-ang, YANG Feng

Computer and Modernization 2022, 0 (05): 90-95.

Abstract （540）

To address the problem that the traditional iterative closest point (ICP) algorithm easily falls into local optima when the initial spatial position deviation is large, a point cloud registration method combining an improved PSO algorithm with the TrICP algorithm is proposed. First, the traditional particle swarm optimization (PSO) algorithm is improved by introducing a fitness similarity measurement criterion to adjust how particles are updated, and the mean of the historical global optimal solutions over the iterations is added as a new learning factor to avoid premature convergence. Second, the rigid transformation parameters and the overlap rate between the point clouds are used to form the particles, and the improved PSO algorithm is used to provide a good initial relative position. Finally, the spatial transformation between the point clouds is estimated with the trimmed iterative closest point (TrICP) algorithm. Experimental results show that the improved PSO-TrICP algorithm has better registration accuracy, efficiency, and robustness than similar registration algorithms proposed in recent years.

A Hybrid Brain Tumor Classification Study Based on CBAM and EfficientNet with Improved Channel Attention

HUA Xin-yu, QI Yun-song

Computer and Modernization 2023, 0 (05): 1-7.

Abstract （522）

To further improve the accuracy and robustness of brain tumor image diagnosis, a novel hybrid brain tumor classification method (IC+IEffxNet) based on CBAM (Convolutional Block Attention Module) and EfficientNet with an improved channel attention mechanism is proposed. The method has two stages. In the first stage, features are extracted by a CBAM model based on an improved spatial attention mechanism. In the second stage, the squeeze-and-excitation (SE) block in the EfficientNet architecture is replaced by the efficient channel attention (ECA) block, and the combined feature output of the first stage is used as the input of the second stage. Experiments cover four classes, namely glioma, meningioma, pituitary, and normal images, from a mixed brain tumor MRI dataset. The results show that the average classification accuracy is about 0.5 to 2 percentage points higher than that of existing methods. The experimental results demonstrate the effectiveness of the method and provide a new reference for medical experts in accurately assessing brain tumors.

Improving Latency and Bandwidth Probe of BBR Congestion Control Algorithm

HUANG Hong-ping, ZHU Xiao-yong, WANG Zhi-yuan

Computer and Modernization 2022, 0 (10): 113-120.

Abstract （521）

Traditional loss-based congestion control algorithms cannot meet the network performance requirements of many applications because of their high packet loss rate and bufferbloat. The BBR (Bottleneck Bandwidth and Round-trip propagation time) algorithm proposed by Google has attracted extensive attention and research for its loss tolerance, high bandwidth utilization, and low delay. However, BBR still has problems such as high queuing delay, poor performance in small-RTT (round-trip time) environments, and untimely bandwidth probing. This paper analyzes the queuing delay and convergence of BBR and then proposes improvements: limit inflight data and promptly reduce the congestion window according to network feedback to reduce delay; in small-RTT environments, retain the bandwidth estimate from before the ProbeRTT stage until after ProbeRTT; and set a maximum holding time for the steady state so that the algorithm exits the steady cycle in time and enters the probing cycle. Simulation results in NS3 show that the improved BBR reduces RTT and its jitter and improves convergence speed, uses bandwidth efficiently in small-RTT environments, and significantly increases the bandwidth probing frequency of long-RTT flows.

ECG Signal Classification Based on Deep Learning

YU Yan, QIU Lei

Computer and Modernization 2022, 0 (05): 16-20.

Abstract （518）

An electrocardiogram (ECG) reflects the state of the heart in real time and can be used for the accurate diagnosis of arrhythmias and other cardiovascular diseases. To deal with noise interference during ECG signal acquisition, we reconstruct the fourth-order components of the Db6 wavelet and then apply a Butterworth low-pass filter to achieve double denoising. Next, the R-wave is located in the denoised ECG signals, and the P-QRS-T segments are extracted and fed into a one-dimensional improved GoogLeNet model for training. The one-dimensional improved GoogLeNet modifies the original two-dimensional GoogLeNet by reducing the network depth and adding a max pooling layer and dilated convolution to the sparse connections to enlarge the receptive field, which reduces computation and improves training performance. Experiments on the MIT-BIH dataset show a classification accuracy of 99.39%, which is 0.17 and 0.22 percentage points higher than the one-dimensional GoogLeNet and the original GoogLeNet respectively, with improved training efficiency. The classification results are a marked improvement over other advanced techniques.

Review of Infrared Small Target Detection

HU Rui-jie, CHE Dou

Computer and Modernization 2023, 0 (08): 79-86. DOI: 10.3969/j.issn.1006-2475.2023.08.013

Abstract （509）

This article reviews three kinds of infrared small target detection methods: those based on traditional feature extraction, those based on local contrast, and the now widely used deep learning methods. By comparing cutting-edge applications of the three, their advantages and disadvantages in detection performance, robustness, and real-time performance are analyzed. We find that feature-extraction-based methods show good real-time performance and robustness in simple scenarios but may be limited under complex conditions. Local-contrast-based methods are relatively robust to changes in object size and shape but sensitive to background interference. Deep-learning-based methods perform well in detection but require large-scale data and greater computing resources. Therefore, in practical applications, the advantages and disadvantages of these methods should be weighed against the requirements of the specific scenario, and an appropriate method chosen for infrared small target detection.

Categorical Data Clustering Based on Extraction of Associations from Co-association Matrix

GUAN Yun-peng, LIU Yu-long

Computer and Modernization 2022, 0 (11): 1-8.

Abstract （507）

Categorical data clustering is widely used in real-world fields such as medical science and computer science. Categorical data clustering is usually studied on the basis of a dissimilarity measure, so for datasets with different characteristics the clustering results are affected by the characteristics of the dataset itself and by noise. In addition, categorical data clustering based on representation learning is too complicated to implement, and its clustering results depend heavily on the learned representation. Based on the co-association matrix, this paper proposes a clustering method that directly considers the relationships among the original information of categorical data: categorical data clustering based on extraction of associations from co-association matrix (CDCBCM). The co-association matrix can be regarded as a summary of the information associations in the original data space. It is constructed by calculating the co-association frequency of different objects in each attribute subspace; some noise is then removed from the matrix, and the clustering result is obtained by normalized cut. The method is tested on 16 publicly available datasets, compared with 8 existing methods, and evaluated using the F1-score metric. Experimental results show that the method achieves the best result on 7 datasets and the best average ranking, and that it handles the task of clustering categorical data well.
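
A simplified picture of a co-association matrix for categorical data is sketched below. It assumes that the co-association of two objects is the fraction of attributes on which they share the same value, and it uses a plain threshold as a stand-in for the noise-removal step; both are assumptions, since the abstract does not define them, and the normalized cut stage is omitted. The function names are hypothetical.

```python
def co_association_matrix(objects):
    """Pairwise co-association for categorical records.

    objects: list of equal-length tuples of categorical values.
    Entry [i][j] is the fraction of attributes on which objects i
    and j take the same value (an assumed, simplified definition).
    """
    n = len(objects)
    m = len(objects[0])
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            shared = sum(1 for a, b in zip(objects[i], objects[j]) if a == b)
            matrix[i][j] = matrix[j][i] = shared / m
    return matrix


def prune_weak_associations(matrix, threshold):
    """Zero out entries below threshold (stand-in for noise removal)."""
    return [[v if v >= threshold else 0.0 for v in row] for row in matrix]


records = [("red", "s"), ("red", "m"), ("blue", "m")]
matrix = co_association_matrix(records)
for row in matrix:
    print(row)
print(prune_weak_associations(matrix, 0.6))
```

The pruned matrix can be treated as a weighted graph adjacency matrix, which is the kind of input a graph-cut clustering step expects.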

Enhanced Image Caption Based on Improved Transformer_decoder

LIN Zhen-xian, QU Jia-xin, LUO Liang

Computer and Modernization 2023, 0 (01): 7-12.

Abstract （505）

The Transformer decoder model (Transformer_decoder) has been widely used in image captioning tasks, where self-attention captures fine-grained features to achieve deeper image understanding. This article makes two improvements to self-attention: Vision-Boosted Attention (VBA) and Relative-Position Attention (RPA). Vision-Boosted Attention adds a VBA layer to Transformer_decoder and introduces visual features into the attention model as auxiliary information, guiding the decoder to generate descriptions that better match the image content. Relative-Position Attention builds on self-attention by introducing trainable relative position parameters that add the relative positions of words to the input sequence. Experiments on COCO2014 show that both VBA and RPA improve image captioning to a certain extent, and that a decoder combining the two attention mechanisms achieves better semantic expression.

Short-term Traffic Flow Prediction Model Based on Deep Learning

ZHANG Ling-yun, HAN Ying, ZHANG Kai, LU Hai-peng, DING Yu-jie

Computer and Modernization 2022, 0 (07): 54-60.

Abstract （497）

PDF（pc）（2733KB）（118）

Traffic flow prediction has important practical significance in the field of intelligent transportation. Because traffic flow data is affected by many factors, it has poor stability, strong randomness and highly non-linear characteristics, which makes traffic flow extremely difficult to predict. To meet the accuracy requirements of short-term traffic flow prediction, this paper proposes a short-term traffic flow prediction method based on Complete Ensemble Empirical Mode Decomposition (CEEMD), combined with Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM). The model uses CEEMD signal decomposition to reduce the impact of noise on traffic flow prediction, and uses CNN and LSTM to fully mine the temporal and spatial characteristics of the data, so that the model can make more accurate judgments and the learning efficiency of the neural network is improved. Experimental verification on real traffic flow data shows that the proposed model effectively improves the accuracy of traffic flow prediction.

Event Extraction Method Based on BERT-BiLSTM-Attention Hybrid Model

WEI Xin, HE Xiao-hai, TENG Qi-zhi, QING Lin-bo, CHEN Hong-gang

Computer and Modernization 2023, 0 (04): 26-31.

Abstract （484）

PDF（pc）（1263KB）（127）

Event extraction is one of the basic tasks in the field of information extraction, and aims to extract structured information from unstructured text. Most existing event extraction methods based on machine reading comprehension models directly detect and classify trigger words in the input text, and to some extent ignore the prediction error caused by judging whether the input text is an event. Therefore, this paper proposes an event extraction method based on a BERT-BiLSTM-Attention hybrid model. The method takes a BERT-based machine reading comprehension model as the basic model, adopts a multi-round question-and-answer approach, and adds an event classification detection module to the existing machine reading comprehension model to reduce prediction error. A BiLSTM model is combined with an attention mechanism to form a historical session information module, which filters out important information more effectively and integrates it into the reading comprehension model. Event extraction experiments on ACE2005 show that accuracy, recall and F1 value improve by 7.8, 4.6 and 5.4 percentage points, respectively, compared with the basic model.

Extended Isolated Forest Anomaly Detection Algorithm Based on Simulated Annealing

WANG Shi-yu, XIAO Li-dong, YAN Xin-chun, YING Wen-hao

Computer and Modernization 2023, 0 (01): 88-94.

Abstract （455）

PDF（pc）（1393KB）（128）

Extended Isolation Forest (EIF) effectively solves the problem that Isolation Forest (iForest) is not sensitive to local anomalies. However, EIF replaces the axis-parallel isolation condition with a hyperplane of random slope, which causes the model to lose part of its generalization ability and increases the time cost because of the large number of vector dot products. In response, an Extended Isolation Forest based on Simulated Annealing (SA-EIF) is proposed. The algorithm calculates the accuracy value and the difference value of each iTree (Isolation Tree) from that iTree's predictions on the data set, and builds a fitness function from them. The iTrees with better detection performance are then selected by the simulated annealing algorithm to construct an ensemble learning model. K-fold cross-validation experiments on the ODDS anomaly detection datasets indicate that SA-EIF is sensitive to local anomalies, reduces the time cost by 20%~40% compared with EIF, and achieves recognition accuracy about 5%~10% higher than EIF.
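
To make the contrast between axis-parallel and random-slope isolation conditions concrete, the following Python sketch shows a single EIF-style split: a point goes to the left branch when its offset from a random intercept has a non-positive dot product with a random normal vector. This is an illustration of the general idea only; the function names are hypothetical, and the accuracy and difference values, the fitness function and the simulated annealing selection from the abstract are not reproduced here.

```python
import numpy as np

def random_hyperplane(data, rng):
    """Draw a random-slope hyperplane: a Gaussian normal vector and an
    intercept point sampled uniformly inside the data's bounding box."""
    normal = rng.normal(size=data.shape[1])
    intercept = rng.uniform(data.min(axis=0), data.max(axis=0))
    return normal, intercept

def split(data, normal, intercept):
    """Send each point left if (x - intercept) . normal <= 0, else right.
    The dot product per point is the extra cost the abstract refers to."""
    goes_left = (data - intercept) @ normal <= 0
    return data[goes_left], data[~goes_left]

rng = np.random.default_rng(0)
points = rng.normal(size=(100, 2))
normal, intercept = random_hyperplane(points, rng)
left, right = split(points, normal, intercept)

# An axis-parallel iForest split is the special case of a normal vector
# with a single non-zero component.
axis_left, axis_right = split(points, np.array([1.0, 0.0]), np.array([0.5, 0.0]))
```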

Lightweight Vision Transformer Based on Separable Structured Transformations

HUANG Yan-hui, LAN Hai, WEI Xian

Computer and Modernization 2022, 0 (10): 75-81.

Abstract （448）

PDF（pc）（2702KB）（101）

Because the Vision Transformer model has a large number of parameters and a high floating-point computation cost, it is difficult to deploy on portable or terminal devices. Because the attention matrix has a low-rank bottleneck, existing model compression algorithms and attention acceleration algorithms cannot balance the number of model parameters, inference speed and model performance well. To solve these problems, a lightweight ViT-SST model is designed. Firstly, the traditional fully connected layer is transformed into a separable structure, which greatly reduces the number of model parameters and improves inference speed, while ensuring that the low rank of the attention matrix does not damage the model's expressive ability. Secondly, this paper proposes a Kronecker product approximate decomposition method based on SVD, which converts the pre-trained parameters of the public ViT-Base model to the ViT-Base-SST model. It slightly alleviates the overfitting of the ViT-Base model and improves its accuracy. Experiments on five common public datasets show that the proposed method is more suitable for Transformer-structured models than traditional compression methods.
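
One standard way to approximate a weight matrix by a Kronecker product using SVD is the nearest-Kronecker-product construction: rearrange the matrix so that each block becomes a row, then keep the leading singular pair. The Python sketch below shows that construction as an illustration; it is not claimed to be the exact decomposition used in the paper, and the function name and shapes are hypothetical.

```python
import numpy as np

def nearest_kronecker(weight, shape_a, shape_b):
    """Return A (shape_a) and B (shape_b) minimizing ||weight - kron(A, B)||_F.
    weight must have shape (m1*m2, n1*n2) for shape_a=(m1, n1), shape_b=(m2, n2)."""
    m1, n1 = shape_a
    m2, n2 = shape_b
    # weight[i*m2 + k, j*n2 + l] -> rearranged[(i, j), (k, l)]
    rearranged = (weight.reshape(m1, m2, n1, n2)
                        .transpose(0, 2, 1, 3)
                        .reshape(m1 * n1, m2 * n2))
    u, s, vh = np.linalg.svd(rearranged, full_matrices=False)
    # The best rank-1 approximation of the rearranged matrix gives the factors.
    a = np.sqrt(s[0]) * u[:, 0].reshape(m1, n1)
    b = np.sqrt(s[0]) * vh[0, :].reshape(m2, n2)
    return a, b

rng = np.random.default_rng(0)
true_a = rng.normal(size=(4, 3))
true_b = rng.normal(size=(5, 2))
dense = np.kron(true_a, true_b)           # a 20 x 6 "fully connected" weight
a, b = nearest_kronecker(dense, (4, 3), (5, 2))

dense_params = dense.size                 # 120 parameters
separable_params = a.size + b.size        # 12 + 10 = 22 parameters
```

For a matrix that is not an exact Kronecker product, the same call returns the closest single-term approximation in the Frobenius norm, which is how pre-trained dense weights could be mapped onto a separable layer.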

Traffic Accident Text Information Extraction Model Based on BERT and BiGRU-CRF Fusion

FAN Hai-wei, QIN Jia-jie, SUN Huan, ZHANG Li-miao, LU Xin-siyu

Computer and Modernization 2022, 0 (05): 10-15.

Abstract （444）

PDF（pc）（936KB）（104）

Existing approaches have difficulty effectively extracting the large amount of key heterogeneous data in traffic accident text, such as time, place and casualty loss, and traffic accident text information extraction methods based on deep learning models with static word vectors have low accuracy. To address these problems, BERT (Bidirectional Encoder Representations from Transformers) is used to map text characters to dynamic vectors, resolving ambiguity and insufficient context dependence at the level of data representation. The vectorized text features are extracted by BiGRU (Bidirectional Gated Recurrent Unit), which outputs feature-rich text sequences. A CRF (Conditional Random Field) then computes the globally optimal output sequence to optimize the feature results, and a BERT-BiGRU-CRF fusion model based on dynamic word vectors is proposed for extracting the key information of traffic accident text. Comparison experiments show that the model achieves an average accuracy of 0.952 and an F1 of 0.925 in traffic accident text information extraction, which are 6.3 and 7.9 percentage points higher, respectively, than those of the model based on the static word vector Word2Vec.

ERCUnet: An Improved Road Crack Detection Model Based on U-Net

LIU Yu-xiang, SHE Wei, SHEN Zhan-feng, TAN Shuai

Computer and Modernization 2022, 0 (07): 33-39.

Abstract （441）

PDF（pc）（3566KB）（141）

To address the low flexibility and poor universality of traditional road crack detection methods, an improved road crack detection model based on U-Net, named ERCUnet, is designed with reference to the residual design in ResNet and the U-shaped encoding and decoding structure of U-Net. The model takes residual blocks as its main body, and optimizes the number of convolution kernels of convolution layers at different depths for crack detection. All residual blocks in the model have the same structure, so the overall structure is neat and simple and easy to scale. The residual structure not only makes feature fusion more sufficient but also avoids the vanishing gradient problem of deep convolutional neural networks. The experiment is conducted on the CrackForest dataset, whose 118 labeled pictures are divided into a training set and a testing set at a ratio of 5:1. A series of data augmentation methods effectively alleviates the shortage of training data. The loss function combines cross entropy and F1 score to alleviate the imbalance between positive and negative samples. The final experimental results show that ERCUnet has only 13.30% of the parameters of U-Net, that recall, precision and F1 are all greater than 70%, and that the noise rate and accuracy on the testing set are 29.05% and 99.01%. An ERCUnet-tiny model is obtained by modifying model parameters to confirm the plasticity of ERCUnet; it has only 2.39% of the parameters of U-Net and achieves an effect similar to U-Net on the testing set.
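
A loss that combines cross entropy with an F1 term can be sketched as follows in Python. The abstract does not give the exact formulation, so this is one common way to do it under stated assumptions: binary cross entropy plus one minus a differentiable "soft" F1 computed from predicted probabilities, with a hypothetical weighting parameter `f1_weight`.

```python
import numpy as np

def bce_plus_soft_f1_loss(probs, targets, f1_weight=1.0, eps=1e-7):
    """Binary cross entropy plus (1 - soft F1). probs are per-pixel crack
    probabilities in [0, 1]; targets are 0/1 labels of the same shape."""
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    targets = np.asarray(targets, dtype=float)
    cross_entropy = -np.mean(targets * np.log(probs)
                             + (1.0 - targets) * np.log(1.0 - probs))
    # Soft counts: probabilities stand in for hard 0/1 decisions.
    true_pos = np.sum(probs * targets)
    false_pos = np.sum(probs * (1.0 - targets))
    false_neg = np.sum((1.0 - probs) * targets)
    soft_f1 = 2.0 * true_pos / (2.0 * true_pos + false_pos + false_neg + eps)
    return cross_entropy + f1_weight * (1.0 - soft_f1)

# Crack pixels are rare, so a prediction that misses the single positive
# pixel still has low cross entropy but is penalized by the F1 term.
targets = np.array([0, 0, 0, 1])
loss_found = bce_plus_soft_f1_loss([0.1, 0.1, 0.1, 0.9], targets)
loss_missed = bce_plus_soft_f1_loss([0.1, 0.1, 0.1, 0.1], targets)
```

Because the F1 term ignores true negatives, it keeps the abundant background pixels from dominating the loss, which is the imbalance the abstract refers to.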

High Illumination Visible Image Generation Based on Generative Adversarial Networks

ZHUANG Wen-hua, TANG Xiao-gang, ZHANG Bin-quan, YUAN Guang-ming

Computer and Modernization 2023, 0 (01): 1-6.

Abstract （434）

PDF（pc）（7034KB）（130）

To solve the problem of low target detection accuracy under low illumination at night, this paper proposes a generative adversarial network-based algorithm for high illumination visible light image generation. To improve the feature extraction ability of the generator, a CBAM attention module is introduced in the converter module; to avoid noise interference from artifacts in the generated images, the decoder of the generator is changed from deconvolution to nearest neighbour interpolation up-sampling followed by a convolution layer; to improve the stability of network training, the adversarial loss function is changed from the cross-entropy function to the least-squares function. Compared with infrared images and night visible images, the generated visible images offer spectral information, rich detail and good visibility enhancement, and can effectively convey information about the target and scene. The effectiveness of the method is verified by image generation metrics and target detection metrics. The mAP obtained on the generated visible images is 11.7 and 30.2 percentage points higher than on the infrared images and the real visible images, respectively, which effectively improves the detection accuracy and anti-interference capability for nighttime targets.
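
Two of the changes named in this abstract have compact standard forms, sketched below in Python with NumPy. The least-squares adversarial loss follows the usual LSGAN formulation with real label 1 and fake label 0 (an assumption; the abstract does not state the label values), and the up-sampling shows only the nearest-neighbour step, with the following convolution layer omitted. Function names are hypothetical.

```python
import numpy as np

def lsgan_discriminator_loss(d_real, d_fake):
    """Least-squares loss: push scores on real images to 1 and on fakes to 0."""
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    return 0.5 * np.mean((d_real - 1.0) ** 2) + 0.5 * np.mean(d_fake ** 2)

def lsgan_generator_loss(d_fake):
    """The generator is rewarded when the discriminator scores fakes near 1."""
    d_fake = np.asarray(d_fake, dtype=float)
    return 0.5 * np.mean((d_fake - 1.0) ** 2)

def upsample_nearest(feature_map, scale=2):
    """Nearest-neighbour up-sampling of a 2-D (height, width) feature map.
    In a decoder this would be followed by a convolution layer."""
    return np.repeat(np.repeat(feature_map, scale, axis=0), scale, axis=1)

upsampled = upsample_nearest(np.array([[1, 2], [3, 4]]))
```

Unlike the cross-entropy loss, the squared penalty keeps growing for samples far from the target score, which is the usual argument for its more stable gradients.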

Fault Diagnosis of Pumping Unit Based on 1D-CNN-LSTM Attention Network

WANG Lei, ZHANG Xiao-dong, DAI Huan

Computer and Modernization 2023, 0 (04): 1-6.

Abstract （432）

PDF（pc）（1482KB）（124）

To address the complex feature extraction, large number of model parameters and low diagnostic efficiency of traditional pumping unit fault diagnosis methods based on dynamometer diagrams, this paper proposes a fault diagnosis method based on a 1D-CNN-LSTM attention network. The dynamometer diagram is converted into a load displacement sequence as the network input, and a one-dimensional convolutional neural network (1D-CNN) extracts local features of the sequence while reducing its length. Considering the temporal characteristics of the sequence, a long short-term memory (LSTM) network is then used to extract temporal features. To highlight the impact of key features, an attention mechanism gives higher attention weights to the temporal features related to the fault type. Finally, the weighted features are input into a fully connected layer, and a Softmax classifier performs the fault diagnosis. The experimental results show that the average accuracy, precision, recall and F1 value of the proposed method reach 99.13%, 99.35%, 99.17% and 99.25%, respectively, and the model size is only 98 kB. Compared with methods based on feature engineering, it has higher diagnostic accuracy and better generalization. Compared with methods based on two-dimensional convolutional neural network (2D-CNN) models, it significantly reduces model parameters and training time and improves the efficiency of fault diagnosis.
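
The attention step, which weights temporal features before the fully connected layer, can be sketched in Python as follows. The abstract does not specify the scoring function, so additive (tanh) scoring is assumed here, and the parameter names `w`, `b` and `v` and all shapes are hypothetical; in the actual network these parameters would be learned and the features would come from the LSTM.

```python
import numpy as np

def attention_pool(features, w, b, v):
    """Weight each time step and sum. features: (T, d); w: (d, h); b, v: (h,).
    Returns the weighted feature vector (d,) and the attention weights (T,)."""
    scores = np.tanh(features @ w + b) @ v       # one score per time step
    scores = scores - scores.max()               # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ features, weights

rng = np.random.default_rng(0)
lstm_outputs = rng.normal(size=(50, 16))         # stand-in for LSTM features
w = rng.normal(size=(16, 8))
b = np.zeros(8)
v = rng.normal(size=8)
context, weights = attention_pool(lstm_outputs, w, b, v)
```

The pooled vector `context` is what would be passed to the fully connected layer and Softmax classifier.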

A Non-intrusive Load Monitoring Method Based on Improved kNN Algorithm and Transient Steady State Features

TIAN Feng, DENG Xiao-ping, ZHANG Gui-qing, WANG Bao-yi

Computer and Modernization 2022, 0 (10): 29-35.

Abstract （428）

PDF（pc）（1558KB）（114）

Non-intrusive load monitoring (NILM) obtains the operation data of the electrical appliances in a circuit by analyzing the record from a single energy meter, and can serve as an important tool for energy saving planning and optimal dispatching in the power grid. Existing NILM methods mainly focus on improving the accuracy of load identification, and their model complexity is too high for embedded devices. A NILM method based on an improved kNN algorithm and transient and steady state features is proposed to solve these problems. Firstly, the kNN algorithm is selected as the load identification model because it does not require training. It is improved by a statistical distance weighting method, and a cosine similarity judgment mechanism is added to verify the accuracy of the kNN load identification results. Secondly, transient and steady state features are selected as load characteristics to make loads easier to distinguish. Finally, experimental data are used to verify that the proposed NILM method has superior performance.
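
A distance-weighted kNN vote followed by a cosine similarity check might look like the Python sketch below. The details are assumptions: inverse-distance weights stand in for the paper's "statistical method of distance weight", the check compares the query with the centroid of the predicted class, and the threshold, feature values and appliance names are made up for illustration.

```python
import numpy as np
from collections import defaultdict

def weighted_knn_predict(train_x, train_y, query, k=3, eps=1e-9):
    """Vote among the k nearest samples, weighting each vote by 1 / distance."""
    distances = np.linalg.norm(train_x - query, axis=1)
    nearest = np.argsort(distances)[:k]
    votes = defaultdict(float)
    for i in nearest:
        votes[train_y[i]] += 1.0 / (distances[i] + eps)
    return max(votes, key=votes.get)

def cosine_check(train_x, train_y, query, label, min_similarity=0.95):
    """Accept the kNN result only if the query points in nearly the same
    direction as the centroid of the predicted class."""
    centroid = train_x[np.asarray(train_y) == label].mean(axis=0)
    similarity = query @ centroid / (np.linalg.norm(query) * np.linalg.norm(centroid))
    return bool(similarity >= min_similarity)

# Hypothetical two-feature load signatures (for example, a steady state
# and a transient feature per appliance event).
train_x = np.array([[2000.0, 10.0], [2050.0, 12.0], [120.0, 80.0], [130.0, 85.0]])
train_y = ["kettle", "kettle", "fridge", "fridge"]
query = np.array([1990.0, 11.0])

label = weighted_knn_predict(train_x, train_y, query)
accepted = cosine_check(train_x, train_y, query, label)
```

Neither step needs a training phase, which fits the abstract's motivation of running on embedded devices.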

Stock Volatility Prediction of LightGBM-GRU Model under Corrective Learning Strategy

SHI Zhi-wei, WU Zhi-feng, ZHANG Zhe

Computer and Modernization 2023, 0 (01): 95-102.

Abstract （425）

PDF（pc）（1925KB）（150）

To improve the accuracy of traditional intelligent algorithms in time series prediction and their adaptability to engineering data problems, a corrective learning strategy is proposed. Volatility is widely used in the financial field, so predicting the volatility of stocks is of great value. Since stock price time series are non-linear and non-stationary, predicting stock market volatility is a difficult problem in time series forecasting. In this paper, a simulation experiment is carried out with the corrective learning strategy, and a LightGBM-GRU model is designed. Using LightGBM as the base model and GRU as the corrector, we predict the volatility of 126 stocks from different industries in the next 10 minutes within 3 years. According to RMSPE, MAE, MSE, RMSE and other indicators, the corrective learning strategy can improve both the accuracy and the generalization ability of even a classical ensemble learning model that already performs well. This paper points out that in the era of abundant algorithms and big data, the main contradiction for intelligent algorithms has become the one between their limited versatility and the diversity of engineering problems. Corrective learning strategies can provide new ideas for data simulation.
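
The abstract does not define corrective learning precisely. One plausible reading, assumed here, is that the corrector is trained on the errors of the base model and its output is added to the base prediction. The Python sketch below illustrates that reading with deliberately simple stand-ins (a constant base model and a linear corrector instead of LightGBM and GRU) and evaluates it with RMSPE, one of the indicators named in the abstract. All names and data are hypothetical.

```python
import numpy as np

def rmspe(y_true, y_pred):
    """Root mean square percentage error; y_true must be non-zero."""
    return float(np.sqrt(np.mean(((y_true - y_pred) / y_true) ** 2)))

def fit_linear(x, y):
    """Least-squares line, used here as a stand-in corrector."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return lambda q: coef[0] + coef[1] * q

x = np.linspace(1.0, 10.0, 50)
y = 2.0 + 3.0 * x                      # synthetic target

# Base model: deliberately weak, always predicts the mean of the target.
base_prediction = np.full_like(y, y.mean())

# Corrector: learns the base model's residuals and adds them back.
corrector = fit_linear(x, y - base_prediction)
corrected_prediction = base_prediction + corrector(x)

base_error = rmspe(y, base_prediction)
corrected_error = rmspe(y, corrected_prediction)
```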

Parallax Image Stitching Algorithm Based on GMS and Improved Optimal Seam

LI Si-jie, TANG Qing-shan, GAO Ying-hua

Computer and Modernization 2022, 0 (12): 95-101.

Abstract （420）

PDF（pc）（7870KB）（99）

To address ghosting and uneven brightness in parallax image stitching, this paper proposes a parallax image stitching algorithm based on grid motion statistics (GMS) and an improved optimal seam. Firstly, the Oriented FAST and Rotated BRIEF (ORB) algorithm is used to extract feature points, and the GMS algorithm is used to screen out mismatched points. Then, the HSV color space and the image gradient difference are introduced to improve the energy function, so that the stitching line avoids passing through image edges. The optimal seam is obtained by the graph cut method, and gradient fusion stitching of the images is carried out. The simulation results show that, in the case of large disparity, compared with the algorithm based on the scale-invariant feature transform (SIFT) and the algorithm based on speeded-up robust features (SURF), the feature point matching accuracy of this algorithm is increased by at least 2.01 times and at most 4.73 times, and the image naturalness is increased by 20.6% on average. Moreover, the stitched image has uniform brightness and no perspective distortion.

Micro-expression Recognition Based on AU-GCN and Attention Mechanism

ZHAO Jing-hua, YANG Qiu-xiang

Computer and Modernization 2023, 0 (03): 48-53.

Abstract （412）

PDF（pc）（1458KB）（108）

As a kind of expression with very short duration, a micro-expression can implicitly reveal the true feelings that people try to suppress and hide, and has useful applications in national security, the judicial system, medical care and political elections. However, because micro-expressions have few data sets, short duration and low expression amplitude, recognizing them faces many difficulties, such as few data samples, heavy computation, lack of attention to key features, and a tendency to over-fit. Therefore, this paper uses facial action units (AU) to highlight local features through a weighted attention mechanism, applies a graph convolution network to find the dependencies between AU nodes, and aggregates them into global features for micro-expression recognition. The experimental results show that, compared with existing methods, the proposed method improves the accuracy to 79.3%.

Anti-collision Shortest Path Planning Based on Improved Dijkstra Algorithm

HUANG Yi-hu, YU Ya-nan

Computer and Modernization 2022, 0 (08): 20-24.

Abstract （404）

PDF（pc）（781KB）（132）

Multiple UAVs may face track conflicts when performing operational tasks. Therefore, an improved Dijkstra algorithm is proposed that enables multiple UAVs to find shortest, non-conflicting routes. While the classical Dijkstra algorithm searches and traverses each track node, a variable-length backtracking array of predecessor nodes is introduced for each node to record all of its predecessors, so that all feasible shortest routes from the starting point to the target point of each task are found. A time window conflict judgment model is then introduced to separate the non-conflicting routes from all feasible routes of each task. If all routes conflict, the conflict node in one of the shortest routes is treated as a temporary obstacle point, and a shortest route that does not conflict with other tasks is found again by changing the backtracking array. Programs written in Matlab are used to verify the algorithm. The experiments show that the improved algorithm can plan all the shortest, non-conflicting routes of each task when multiple UAVs perform operational tasks, and that the planning efficiency for the task set is significantly improved.
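
The core modification described here, keeping every predecessor that lies on some shortest route rather than only one, can be sketched in Python as follows (the paper itself uses Matlab). Edge weights are assumed to be strictly positive. The conflict test is a simplified, hypothetical stand-in for the time window model: it assumes every hop takes one time unit and flags two routes that occupy the same node at the same step.

```python
import heapq

def all_shortest_paths(graph, source, target):
    """Dijkstra that records all predecessors on shortest routes, then
    backtracks from the target to enumerate every shortest path.
    graph: {node: {neighbour: positive_weight}}."""
    dist = {source: 0}
    preds = {source: []}                 # variable-length predecessor lists
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                     # stale queue entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                preds[v] = [u]           # strictly better: reset the list
                heapq.heappush(heap, (nd, v))
            elif nd == dist[v] and u not in preds[v]:
                preds[v].append(u)       # equally short: keep both
    if target not in dist:
        return []

    def backtrack(node):
        if node == source:
            return [[source]]
        return [path + [node] for p in preds[node] for path in backtrack(p)]

    return backtrack(target)

def routes_conflict(route_a, route_b):
    """Simplified conflict test: same node at the same time step."""
    return any(a == b for a, b in zip(route_a, route_b))

graph = {"A": {"B": 1, "C": 1}, "B": {"D": 1}, "C": {"D": 1}, "D": {}}
routes = all_shortest_paths(graph, "A", "D")
other_uav_route = ["E", "B", "F"]
free_routes = [r for r in routes if not routes_conflict(r, other_uav_route)]
```

When `free_routes` is empty, the abstract's approach is to mark a conflict node as a temporary obstacle and search again; that step is not shown.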

Vehicle Target Detection Algorithm Based on YOLO v4

YIN Yuan-qi, XU Yuan, XING Yuan-xin

Computer and Modernization 2022, 0 (07): 8-14.

Abstract （397）

PDF（pc）（2959KB）（141）

To address the low detection accuracy for occluded targets and the poor detection of small targets in vehicle target detection, an improved target detection algorithm based on YOLO v4, named YOLO v4-ASC, is proposed. A convolutional block attention module is added to the tail of the backbone extraction network to improve the feature expression ability of the model. The loss function is improved to speed up convergence, and the Adam+SGDM optimization method replaces the original SGDM optimizer to further improve detection performance. In addition, the K-Means clustering algorithm is used to optimize the prior box sizes, and the car, truck and bus categories in the traffic scene data set are merged into a single vehicle category, which simplifies the task into a binary classification problem. The experimental results show that, while maintaining the detection speed of the original algorithm, the proposed YOLO v4-ASC algorithm achieves 70.05% AP and a 71% F1-score. Compared with the original YOLO v4 algorithm, AP is improved by 9.92 percentage points and the F1-score by 9 percentage points.
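
K-Means clustering of prior box sizes is commonly done on (width, height) pairs with IoU as the similarity, as in the Python sketch below. This is the usual YOLO-style recipe shown under assumptions, not necessarily the exact procedure of the paper; the function names, the random initialization and the toy box sizes are hypothetical.

```python
import numpy as np

def iou_wh(boxes, anchors):
    """IoU between boxes (n, 2) and anchors (k, 2), both given as
    (width, height) and treated as sharing the same top-left corner."""
    inter = (np.minimum(boxes[:, None, 0], anchors[None, :, 0])
             * np.minimum(boxes[:, None, 1], anchors[None, :, 1]))
    union = (boxes[:, 0:1] * boxes[:, 1:2]
             + (anchors[:, 0] * anchors[:, 1])[None, :] - inter)
    return inter / union

def kmeans_anchors(boxes, k, iterations=50, seed=0):
    """Cluster box sizes, assigning each box to the anchor with highest IoU."""
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(iterations):
        assignment = np.argmax(iou_wh(boxes, anchors), axis=1)
        updated = np.array([
            boxes[assignment == j].mean(axis=0) if np.any(assignment == j) else anchors[j]
            for j in range(k)
        ])
        if np.allclose(updated, anchors):
            break
        anchors = updated
    return anchors[np.argsort(anchors[:, 0] * anchors[:, 1])]  # sort by area

# Toy ground-truth box sizes in pixels: three small and three large vehicles.
boxes = np.array([[10.0, 10.0], [12.0, 11.0], [11.0, 9.0],
                  [100.0, 80.0], [110.0, 90.0], [105.0, 85.0]])
anchors = kmeans_anchors(boxes, k=2)
```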

Review of Fall Detection Technologies for Elderly

WANG Mengxi, LI Jun

Computer and Modernization 2024, 0 (08): 30-36. DOI: 10.3969/j.issn.1006-2475.2024.08.006

Abstract （396）

PDF（pc）（2530KB）（231）

With the rapidly growing aging population in China, the proportion of elderly people living alone has increased significantly, and facilities oriented to the aging population have received growing attention. In a domestic environment, the elderly are likely to fall for reasons such as lack of care, aging and sudden illness, and falls have become one of the main threats to their health. Therefore, monitoring, detecting and predicting falls of the elderly in real time can ensure their safety to some extent and reduce the life and health risks caused by accidental falls. Based on a comprehensive overview of the research on human fall detection, we divide fall detection into two categories, vision-free technologies and computer vision based methods, according to the kinds of sensors used for data acquisition. We summarize the system composition of the different methods, explore the latest relevant research, and discuss their characteristics and practical applications. In particular, we focus on reviewing the deep learning based schemes that have developed rapidly in recent years, analyzing and discussing their principles and research results in detail. Next, we introduce public benchmark datasets for human fall detection, including dataset size and storage format. Finally, we discuss the prospects for the relevant research and offer suggestions in different aspects.

Review of Abnormal Service Data Detection Methods in Power Grid

BAI Kai-feng, ZHAO Hong-bin, ZHANG Yun, LI Yan, CUI Jing-an, LIU Qian-jin, YANG Hua, NI Na

Computer and Modernization 2023, 0 (03): 79-83.

Abstract （391）

PDF（pc）（1027KB）（103）

The concept of the smart grid has promoted the development of grid intelligence and informatization. The amount of business data of various types generated by the power system has also increased exponentially, and the abnormal data it contains largely determines the quality of power data analysis and the stability of grid operation. This article classifies, analyzes and summarizes the methods for detecting abnormal business data in the power grid. Detection methods based on traditional technology and on artificial intelligence technology are introduced in turn, and the basic principles and characteristics of each method are analyzed and explained. The challenges and future development trends of abnormal business data detection in power grids are summarized and discussed. This article provides a reference for follow-up research.

Government Hotline Work-order Classification Fusing RoBERTa and Feature Extraction

CHEN Gang

Computer and Modernization 2022, 0 (06): 21-26.

Abstract （390）

PDF（pc）（1297KB）（124）

Government hotlines handle a large number of citizens' demands, which makes manual work-order classification time-consuming and laborious. Most existing work-order classification methods are based on machine learning or a single neural network model. With these methods, it is difficult to effectively understand contextual semantic information, and text feature extraction is not comprehensive. A government hotline work-order classification method fusing RoBERTa and feature extraction is proposed to address these problems. The proposed method first obtains context-aware semantic feature vectors from the textual descriptions of work-orders using the RoBERTa pre-trained language model. Then, a feature extraction layer based on a convolutional neural network, a bidirectional gated recurrent unit and a Self-Attention mechanism is constructed to obtain the local and global features of the work-order semantic encodings, highlighting the semantic features that matter most among the global features. Finally, the fused feature vectors are input into the classifier to complete work-order classification. Experimental results show that the proposed method achieves better classification performance than several baseline methods.

Top Read Articles