Computer and Modernization

A Text Classification Model Based on BERT and Pooling Operation

ZHANG Jun, QIU Long-long

2022, 0(06): 1-7.

Fine-tuning pre-trained language models has achieved good results in many natural language processing tasks, text classification among them, and the Transformer-based BERT model is a typical representative. However, BERT directly uses the vector corresponding to [CLS] as the text representation and does not consider the local and global features of the text, which limits the classification performance of the model. This paper therefore proposes a text classification model that introduces a pooling operation, using average pooling, max pooling, and K-MaxPooling to extract the text representation vector from the output matrix of BERT. Experimental results show that the proposed model outperforms the original BERT model: in all text classification tasks in the experiments, its accuracy and F1-score are higher than those of BERT.
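The three pooling operations named in the abstract can be illustrated on a toy token matrix. This is a minimal sketch, not the authors' implementation: the function names, the plain-list matrix layout (one row per token, one column per hidden dimension), and the example values are all assumptions made for illustration, and the K-max variant shown here (keep the k largest values per dimension in token order, then concatenate) is one common definition.

```python
from typing import List

Matrix = List[List[float]]  # one row per token, one column per hidden dimension


def mean_pool(tokens: Matrix) -> List[float]:
    """Average each hidden dimension over all tokens."""
    n = len(tokens)
    return [sum(col) / n for col in zip(*tokens)]


def max_pool(tokens: Matrix) -> List[float]:
    """Keep the largest value of each hidden dimension."""
    return [max(col) for col in zip(*tokens)]


def k_max_pool(tokens: Matrix, k: int) -> List[float]:
    """Keep the k largest values of each dimension, preserving token order,
    and concatenate the results into one vector of length k * hidden_size."""
    pooled: List[float] = []
    for col in zip(*tokens):
        top = sorted(range(len(col)), key=lambda i: col[i], reverse=True)[:k]
        pooled.extend(col[i] for i in sorted(top))
    return pooled


# Toy stand-in for a BERT output matrix: 4 tokens, hidden size 3 (made-up values).
hidden = [
    [0.1, 0.5, -0.2],
    [0.4, 0.1, 0.3],
    [-0.3, 0.9, 0.0],
    [0.2, 0.2, 0.7],
]

print(mean_pool(hidden))
print(max_pool(hidden))
print(k_max_pool(hidden, k=2))
```

Unlike taking the [CLS] row alone, each of these vectors depends on every token position, which is the motivation the abstract gives for pooling.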

Teachers’ Scientific Research Performance Prediction Based on PSO-KPLS

HUANG Ling, BAI Xiao-bo

2022, 0(06): 8-12.

To address the difficulty of predicting university teachers’ scientific research performance, an optimization objective is first constructed and the PSO-KPLS algorithm is proposed. Its basic idea is to use particle swarm optimization (PSO) as the optimization algorithm and root mean square error (RMSE) as the convergence criterion to optimize kernel-based partial least squares (KPLS), replacing the manual search for suitable parameters. Then, a multi-dimensional feature vector is used to represent teachers’ research performance, and a comprehensive method is used to calculate their research scores. Finally, taking six years of data from 60 teachers as samples, the model is trained and fitted with PSO-KPLS, and the influence of the accuracy requirement on the efficiency of PSO-KPLS is explored. Comparative experiments with other optimized KPLS algorithms show that PSO-KPLS can accurately predict teachers’ research performance over the next two to three years.
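The idea of letting PSO search for parameters that minimize RMSE can be sketched as follows. This is only an illustration under stated assumptions: KPLS itself is not implemented, and a toy linear model with two parameters stands in for the model being tuned. The function names, the inertia and acceleration constants, the search bounds, and the data are all hypothetical.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def rmse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Root mean square error between predictions and targets."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target))


def pso(objective: Callable[[List[float]], float], dim: int,
        bounds: Tuple[float, float], n_particles: int = 30, n_iter: int = 200,
        w: float = 0.7, c1: float = 1.5, c2: float = 1.5,
        seed: int = 0) -> Tuple[List[float], float]:
    """Basic global-best PSO that minimizes `objective` inside box bounds."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # personal best positions
    best_val = [objective(p) for p in pos]    # personal best values
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


# Toy stand-in for the model being tuned: y = a * x + b (NOT KPLS).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]


def objective(params: List[float]) -> float:
    a, b = params
    return rmse([a * x + b for x in xs], ys)


best_params, best_rmse = pso(objective, dim=2, bounds=(-5.0, 5.0))
print(best_params, best_rmse)
```

In a PSO-KPLS setting, `objective` would instead train a KPLS model with the candidate kernel parameters and return its RMSE on held-out data.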

An Improved Whale Optimization Algorithm Based on Hybrid Strategy

LI Ru, FAN Bing-bing

2022, 0(06): 13-20.

To address the slow convergence, weak global search ability, low solution accuracy, and tendency to fall into local optima of the original whale optimization algorithm (WOA), an improved whale optimization algorithm based on a hybrid strategy (LGWOA) is proposed. Firstly, the Levy flight strategy is introduced into the position update formula of the whale random search; Levy flight increases the global search step, enlarges the search space, and improves the global search capability. Secondly, an adaptive weight is introduced into the whale spiral position update formula to improve the algorithm’s local search ability and optimization accuracy. Finally, the crossover and mutation operations of the genetic algorithm are incorporated to balance global and local search, maintain population diversity, and avoid falling into local optima. Simulation experiments on 12 benchmark test functions in different dimensions show that the improved whale algorithm converges faster and achieves higher optimization accuracy.
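A Levy flight step is commonly generated with Mantegna's algorithm, sketched below. The abstract does not give the paper's exact update formula, so `random_search_update` is a hypothetical illustration of how a heavy-tailed step could perturb a whale's position toward a randomly chosen whale; the names, the exponent `beta = 1.5`, and the `scale` factor are assumptions.

```python
import math
import random
from typing import List


def levy_sigma(beta: float) -> float:
    """Standard deviation of the numerator in Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (num / den) ** (1 / beta)


def levy_step(rng: random.Random, beta: float = 1.5) -> float:
    """Draw one heavy-tailed step: mostly small moves, occasionally a long jump."""
    u = rng.gauss(0.0, levy_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def random_search_update(x: List[float], x_rand: List[float],
                         rng: random.Random, scale: float = 0.01) -> List[float]:
    """Hypothetical position update: move toward a random whale with a Levy-scaled
    step in each dimension (an assumption, not the paper's formula)."""
    return [xi + scale * levy_step(rng) * (ri - xi) for xi, ri in zip(x, x_rand)]


rng = random.Random(0)
new_position = random_search_update([1.0, 2.0], [3.0, -1.0], rng)
print(new_position)
```

The occasional long jumps are what enlarge the explored region compared with a fixed or Gaussian step size.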

Government Hotline Work-order Classification Fusing RoBERTa and Feature Extraction

CHEN Gang

2022, 0(06): 21-26.

Government hotlines handle a large number of citizens’ requests, which makes manual work-order classification time-consuming and laborious. Most existing work-order classification methods are based on machine learning or a single neural network model; they struggle to capture contextual semantic information, and their text feature extraction is not comprehensive. To address these problems, a government hotline work-order classification method fusing RoBERTa and feature extraction is proposed. The method first obtains context-aware semantic feature vectors from the textual descriptions of work orders using the RoBERTa pre-trained language model. Then, a feature extraction layer based on a convolutional neural network, a bidirectional gated recurrent unit, and a self-attention mechanism is constructed to obtain the local and global features of the work-order semantic encodings, highlighting the semantic features that matter most for the global features. Finally, the fused feature vectors are fed into the classifier to complete work-order classification. Experimental results show that the proposed method achieves better classification performance than several baseline methods.

A Multi-attribute Decision Making Method Based on Information Measure

WEI Li-jun, WU Hai-bo, ZHANG Ruo-bing

2022, 0(06): 27-31.

Single-valued neutrosophic (SVN) sets can be used to represent uncertain and inconsistent information in real situations. Information measures play an important role in SVN set theory and have attracted increasing attention in recent years. In this paper, a multi-attribute decision-making method based on single-valued neutrosophic information measures is proposed. The paper first introduces axiomatic definitions of three information measures: entropy, similarity measure, and cross-entropy. Then, information measure formulas are constructed based on the cosine function, and the relationships and transformations among entropy, similarity measure, and cross-entropy are discussed. On this basis, a multi-attribute decision-making method based on the information measure formulas is proposed. Finally, a numerical example of urban pollution assessment demonstrates the applicability and effectiveness of the method.

Character Network Analysis of Ordinary World

WANG Jun, HE Jin-rong, MA Le-rong

2022, 0(06): 32-36.

The construction and quantitative analysis of character relationship networks is an important part of the intelligent interpretation of literary works. This article takes Lu Yao’s novel "Ordinary World" as the research object and uses complex network analysis methods to construct and analyze the social network in the work. Firstly, the social network is extracted: the characters in the novel correspond to the nodes of the network, the relationships between characters correspond to the edges, and the number of times two characters appear together in a chapter corresponds to the edge weight. Then the betweenness and clustering-coefficient correlation, hierarchical clustering, and link prediction are analyzed on the constructed network. The experimental results show that the character relationship network of Ordinary World is a heterogeneous network with small-world characteristics. This research helps advance the analysis of character relationship networks in literary works.
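The network construction described above (characters as nodes, co-appearance as weighted edges) can be sketched with the standard library. This is an illustrative reading, not the authors' pipeline: it assumes each chapter has already been reduced to a list of character names, counts one co-occurrence per chapter in which two characters both appear, and uses placeholder names "A" to "D" rather than data from the novel.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def build_cooccurrence(chapters: List[List[str]]) -> Dict[Edge, int]:
    """Edge weight = number of chapters in which both characters appear."""
    weights: Counter = Counter()
    for cast in chapters:
        # sorted(set(...)) removes repeats and gives each pair a canonical order
        for a, b in combinations(sorted(set(cast)), 2):
            weights[(a, b)] += 1
    return dict(weights)


def weighted_degree(weights: Dict[Edge, int]) -> Dict[str, int]:
    """Sum of incident edge weights per character."""
    degree: Counter = Counter()
    for (a, b), w in weights.items():
        degree[a] += w
        degree[b] += w
    return dict(degree)


# Placeholder chapter casts (hypothetical, not taken from the novel).
chapters = [["A", "B", "C"], ["A", "B"], ["B", "D"], ["A", "C"]]
weights = build_cooccurrence(chapters)
degree = weighted_degree(weights)
print(weights)
print(degree)
```

The resulting edge list can be loaded into any graph library to compute betweenness, clustering coefficients, or link-prediction scores.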

Microblog Rumor Detection Integrating User’s History and Dissemination Information

LU Yue, CAO Chun-ping

2022, 0(06): 37-42.

With the development of Internet technology, online rumors increasingly spread on social media platforms such as Weibo, and research on the automatic detection of Weibo rumors is of great significance for maintaining social stability. Current mainstream rumor detection methods based on deep learning generally do not fully consider the semantic information of Weibo texts. In addition, methods that rely heavily on dissemination information delay detection and cannot meet the practical needs of rumor detection. To address these problems, this paper proposes a microblog rumor detection model that integrates users’ historical interaction information. Without using the dissemination information of the microblog to be detected, it constructs and trains the AbaNet (ALBERT-BiGRU-Attention) deep learning network model, which fully considers the text features and semantic information of both the Weibo post and the user’s historical dissemination texts. The experimental results show that the model is accurate and stable, and that it greatly shortens rumor detection time while maintaining high detection accuracy.

Host Matching for C2C Online Short-term Rentals

WU Dai-yang, ZHAO Jie, LIANG Jia-ming, DONG Zhen-ning, LIANG Zhou-yang

2022, 0(06): 43-48.

With the rise of homestays and online short-term rental platforms, host multi-homing, in which one host lists on several platforms, continues to attract attention and research. This phenomenon provides a new research perspective, and identifying the same host across different platforms is the first problem to be solved. This article therefore explores a cross-platform host matching algorithm for C2C online short-term rentals, building on traditional user matching. Because hosts’ personal information is sparse, the paper introduces housing information and designs a two-stage host matching algorithm (TSHM) based on housing. On a common data set and a hard-case data set built from real data of two domestic online short-term rental platforms, the method achieves 99.69% and 81.97% accuracy, respectively, which is better than traditional classifiers such as SVM and DT. The results verify the effectiveness of the matching model and the matching features, and provide a new idea for cross-platform host matching that remains effective even when hosts’ personal information is lacking. However, this article conducts experiments only on domestic platform data and does not introduce features such as text and pictures, which are limitations.

R&D Collaborative Management of Meteorological Model in High Performance Computing

ZHAO Chun-yan, SUN Jing, HU Jiang-kai, ZHOU Bin

2022, 0(06): 49-55.

Research and development of numerical prediction models is a complex, multi-disciplinary systems engineering effort. As Moore’s law approaches its limit, exascale computing arrives, and models trend toward earth-system modeling, the R&D collaboration on meteorological numerical models faces more complex collaboration, more specialized computing platforms to debug on, and wider demands for sharing. In response to these challenges, applied research has been carried out on collaborative management, the R&D process, code integration and sharing, model debugging, and an experimental test bed in a high-performance computing environment, using Git, Python, and workflows to improve collaborative and operational efficiency. The results show that this work standardizes the R&D process and the management of results, supports convenient R&D management, debugging, and analysis, and improves the efficiency of upgrades and operation. It can serve as a reference for large-scale traditional scientific research on high-performance computing.

Research Progress of Text Summarization Model

ZHANG Zi-yun, WANG Wen-fa, MA Le-rong, DING Cang-feng

2022, 0(06): 56-66.

As the Internet generates more and more text data, the problem of text information overload is becoming increasingly serious. Condensing texts is therefore necessary, and text summarization is one of the important means of doing so, as well as a hot and difficult topic in artificial intelligence research. Text summarization aims to transform a text or a collection of texts into a short summary containing the key information. In recent years, language model pre-training has advanced the state of the art in many natural language processing tasks, including sentiment analysis, question answering, natural language inference, named entity recognition, text similarity, and text summarization. This paper reviews the classic text summarization methods of the past and the pre-training-based methods of recent years, and surveys the data sets and evaluation methods used for text summarization. Finally, the challenges and development trends of text summarization are summarized.

Driver Distracted Behavior Recognition Based on Deep Learning

HE Li-wen, ZHANG Rui-chi

2022, 0(06): 67-74.

Distracted driving behavior recognition is one of the main ways to improve driving safety. To address the low recognition accuracy of distracted driving behavior, this paper proposes a driver distracted behavior recognition algorithm based on deep learning, composed of a cascade of a target detection network and a precise behavior recognition network. Based on the State Farm open data set, the first stage uses the target detection algorithm SSD (Single Shot Multibox Detector) to extract local information from the original driver images and determine the candidate regions for behavior recognition. In the second stage, VGG19, ResNet50, and MobileNetV2 models with transfer learning are used to accurately identify the behavior in the candidate regions. Finally, the experiments compare the recognition accuracy of the cascaded architecture with that of a single-model architecture. The results show that, compared with mainstream single-model detection methods, the proposed cascade network improves driver behavior recognition accuracy by 4% to 7% overall. In addition, the proposed algorithm reduces the influence of noise and background regions on the model, improving recognition accuracy, and can effectively identify more behavior categories, avoiding the misclassification of actions.

Prediction of Cardiovascular Disease Based on Improved Deep Neural Network

LIU Yu-hang, QU Yuan, XU Ying-hao, ZHU Xi-jun, YU Yan

2022, 0(06): 75-79.

Cardiovascular disease is a common disease threatening human health. To predict it more accurately, this paper optimizes and improves the traditional DNN model and proposes a directional regular deep neural network (TR-DNN) model. By remedying the defects of the original deep neural network model, TR-DNN trains and tests better on the cardiovascular disease data set and thereby accomplishes the cardiovascular disease prediction task. Experiments show that the model performs well in training and achieves excellent results on the test set. Compared with SVM, RF, and XGBoost models on the same data set, the TR-DNN model scores better on the evaluation metrics. Compared with the traditional DNN model, TR-DNN improves accuracy by 1.507 percentage points, recall by 1.57 percentage points, specificity by 2.54 percentage points, and precision by 1.51 percentage points. Therefore, the TR-DNN model can be applied to the prediction of cardiovascular disease.
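The four metrics reported above are all derived from a binary confusion matrix. The sketch below shows the standard definitions; the counts are made-up illustration values, not results from the paper, and the function name is an assumption.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),  # all correct / all samples
        "recall": tp / (tp + fn),                     # found positives / actual positives
        "specificity": tn / (tn + fp),                # found negatives / actual negatives
        "precision": tp / (tp + fp),                  # true positives / predicted positives
    }


# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
print(metrics)
```

An improvement "by 1.507 percentage points" means the difference between two such values expressed in percent, for example 0.85 versus 0.86507, rather than a relative change.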

A Data-driven Deep Modulation Identification Method for RF Signals

XU Ya-jun, GUO En-hao, CHEN Lin, SI Cheng-ke

2022, 0(06): 80-86.

The identification performance of convolutional neural networks (CNNs) is limited by the number of signal modulation types to be identified. For instance, the identification accuracy is only 80% when 24 kinds of modulation waveforms are present at SNR = 12 dB. Better recognition performance requires a more complicated network, which directly increases the required data set size and the cost of hardware computing resources. Therefore, a compact residual neural network for radio signal modulation identification is designed in this paper to extract the characteristics of signal modulation. End-to-end identification is performed directly on the baseband in-phase and quadrature components. By using transfer learning, the number of samples needed in the network retraining stage is reduced dramatically and the adaptability of the proposed network is enhanced. The test results show that the identification accuracy of the proposed network approaches 95% at an SNR of 12 dB even when the wireless channel impulse response varies. Several comparative experiments illustrate the advantages of the proposed neural network.

Survey of Fruit Object Detection Algorithms in Computer Vision

LI Wei-qiang, WANG Dong, NING Zheng-tong, LU Ming-liang, QIN Peng-fei

2022, 0(06): 87-95.

Fruit target detection and recognition based on computer vision is an important interdisciplinary research topic spanning target detection, computer vision, and agricultural robotics. It has theoretical significance and practical value in smart agriculture, agricultural modernization, and automatic picking robots. As deep learning has been widely used in image processing with good results, fruit target detection and recognition algorithms that combine computer vision with deep learning have gradually become mainstream. This article introduces the tasks, difficulties, and development status of fruit target detection and recognition based on computer vision, as well as two types of deep-learning-based fruit target detection and recognition algorithms. Finally, it introduces the public data sets used to train the models and the metrics used to evaluate their performance, and discusses current problems and possible future directions in fruit target detection and recognition.

Substation Monitoring Picture Recognition Algorithm for Automatic Human-machine Interface Verification

ZHAO Na, LIU Wen-biao, WANG Lian-tao, WANG Meng-ru, REN Zhen-xing

2022, 0(06): 96-103.

When testing and verifying the human-machine interface of a substation monitoring system, it is common to judge whether the monitoring software is up to standard by comparing the monitoring picture observed by the human eye with the information sent by the test command. However, the accuracy and efficiency of human observation of complex and changing monitoring information cannot be guaranteed. This paper designs a method that automatically identifies the information in substation monitoring pictures using image processing and machine learning techniques. A template matching method based on the best primitive is proposed to solve the automatic positioning of electrical primitives in the picture. The FHOG operator is used to describe the topological features of the picture and to speed up the recognition of monitoring pictures and primitives. For problems such as the separation of the left and right components of Chinese characters and touching characters in warning message pictures, a collaborative segmentation-and-recognition algorithm is proposed to locate characters, and deep convolutional neural networks are used for recognition. The effectiveness of the method is verified on actual substation monitoring pictures. An online verification system is also designed, which achieves a recognition accuracy of 96.04%.

Helmet-wearing Detection Based on Improved YOLOv5

YUE Heng, HUANG Xiao-ming, LIN Ming-hui, GAO Ming, LI Yang, CHEN Ling

2022, 0(06): 104-108.

To address the problem that YOLOv5 cannot focus on important features through weighting and therefore cannot produce more distinguishable features, which reduces the accuracy of helmet detection, an attention module is used; the squeeze-and-excitation layer and the efficient channel attention module are studied. To address the problem that the non-maximum suppression used by YOLOv5 to remove redundant results retains only the highest-confidence prediction box of a class when objects overlap heavily, the Soft-NMS algorithm is used to keep more prediction boxes. Weighted non-maximum suppression is used to fuse the information of multiple prediction boxes and improve their accuracy. To address the information loss caused by down-sampling, the focus module is used to improve detection, and the various modules are integrated to obtain the optimal FESW-YOLO algorithm. Compared with YOLOv5, the algorithm improves mAP@0.5 by 2.1 percentage points and mAP@0.5:0.95 by 1.2 percentage points on the helmet data set, which improves the accuracy of safety helmet supervision.
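Soft-NMS keeps overlapping boxes by lowering their scores instead of deleting them. The sketch below uses the Gaussian decay variant, in which a box's score is multiplied by exp(-IoU² / sigma) each time a higher-scoring box is selected. It is an illustration only: the abstract does not say which variant or parameters the authors use, and the function names, `sigma`, `score_threshold`, and the example boxes are assumptions.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes: List[Box], scores: List[float], sigma: float = 0.5,
             score_threshold: float = 0.001) -> List[Tuple[Box, float]]:
    """Gaussian Soft-NMS: decay scores of overlapping boxes rather than drop them."""
    candidates = list(zip(boxes, scores))
    kept: List[Tuple[Box, float]] = []
    while candidates:
        best = max(range(len(candidates)), key=lambda i: candidates[i][1])
        best_box, best_score = candidates.pop(best)
        kept.append((best_box, best_score))
        rescored = []
        for box, score in candidates:
            score *= math.exp(-(iou(best_box, box) ** 2) / sigma)
            if score >= score_threshold:  # discard only boxes whose score has collapsed
                rescored.append((box, score))
        candidates = rescored
    return kept


# Two heavily overlapping boxes and one separate box (made-up values).
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = soft_nms(boxes, scores)
print(kept)
```

With hard NMS and a typical IoU threshold of 0.5, the second box would be removed outright; here it survives with a reduced score, which is the behavior that helps when two helmets genuinely overlap.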

Object Detection in Remote Sensing Images Based on Software and Hardware Co-acceleration Framework

TAN Jin-lin, FAN Wen-tong, LIU Ya-hu, LIANG Zhi-feng, WANG Liang, LIU Bin, HUANG Bin

2022, 0(06): 109-115.

Because of the rapid increase in computational complexity and memory requirements, object detection in remote sensing images is difficult to deploy on small, low-power embedded platforms. To address these issues, a hardware and software co-acceleration framework based on a field-programmable gate array (FPGA) is proposed to accelerate the inference process of object detection in remote sensing images. Firstly, the trained YOLOv3 network is compressed and compiled according to the Vitis AI acceleration scheme. Then, the underlying hardware project, including the deep learning processing unit (DPU) module, is built on the FPGA, and the DPU task scheduler is written on the ARM core. Finally, FPGA-based inference acceleration is implemented on the Zynq SoC development platform. Experimental results show that the framework achieves an average throughput of 1.75 TOPS (26.8 fps) on the Xilinx Zynq MPSoC, and the mean Average Precision (mAP) on the DIOR dataset is 56.7%.

Network Intrusion Detection Model Based on Space-time Feature Fusion and Attention Mechanism

RAO Hai-bing, ZHU Su-lei, YANG Chun-xia

2022, 0(06): 116-121.

To address the low performance of network intrusion detection, a deep learning intrusion detection model, CTA-net, based on space-time feature fusion and an attention mechanism is proposed. The model obtains space-time fusion features by integrating a convolutional neural network (CNN) and a long short-term memory network (LSTM), then uses an attention module to calculate the importance of the input space-time fusion features, and finally performs classification with the softmax function. On the NSL-KDD data set, the experimental results show that, compared with a CNN model of similar structure and a space-time fusion CNN-LSTM model, convergence on the training set is significantly improved. On the test set, accuracy increases by 10.9120 and 11.8740 percentage points, precision by 9.1950 and 9.6130 percentage points, recall by 9.1780 and 9.9340 percentage points, and F1-score by 10.7830 and 11.750 percentage points, respectively. The simulation results show that the proposed CTA-net model has good application potential in network intrusion detection.

A SQL Injection Attack Detection Algorithm Based on Improved TF-IDF

GUAN Hui, SHENG Jing-yuan, CAO Tong-zhou

2022, 0(06): 122-126.

Because the traditional TF-IDF algorithm does not allocate the weights of feature words well, feature extraction is insufficient and inefficient, and the results do not match the actual situation. To overcome these limitations in SQL injection attack detection, this paper improves TF-IDF by adding a text quantity ratio factor and the chi-square statistic to the traditional algorithm, which raises the weights of important words. SQL injection attacks are then detected with several different classifiers, and their classification results are compared. The experimental results show that the combination of a boosted decision tree and the improved TF-IDF has higher accuracy, recall, and F1 value than other similar methods. In addition, compared with the traditional TF-IDF algorithm, the accuracy, precision, recall, and F1 value of the proposed algorithm are improved by about 5%, which gives it practical application value.
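The motivation for adding a class-aware statistic to TF-IDF can be shown on a toy corpus. This sketch assumes that "Chi statistics" refers to the chi-square statistic between a term and the attack class, and it combines the two scores by a simple product, `tf_idf * (1 + chi_square)`. That combination, the smoothed IDF variant, the function names, and the tokenized queries are all hypothetical, and the paper's text quantity ratio factor is not modeled.

```python
import math
from typing import List


def tf_idf(term: str, doc: List[str], docs: List[List[str]]) -> float:
    """Term frequency times a smoothed inverse document frequency."""
    tf = doc.count(term) / len(doc)
    df = sum(1 for d in docs if term in d)
    idf = math.log((1 + len(docs)) / (1 + df)) + 1.0
    return tf * idf


def chi_square(term: str, docs: List[List[str]], labels: List[int]) -> float:
    """Chi-square statistic of a term against the positive class (label 1)."""
    a = sum(1 for d, y in zip(docs, labels) if term in d and y == 1)
    b = sum(1 for d, y in zip(docs, labels) if term in d and y == 0)
    c = sum(1 for d, y in zip(docs, labels) if term not in d and y == 1)
    d_ = sum(1 for d, y in zip(docs, labels) if term not in d and y == 0)
    denom = (a + c) * (b + d_) * (a + b) * (c + d_)
    return len(docs) * (a * d_ - b * c) ** 2 / denom if denom else 0.0


def weighted_tf_idf(term: str, doc: List[str], docs: List[List[str]],
                    labels: List[int]) -> float:
    """Hypothetical combination: boost TF-IDF by class association."""
    return tf_idf(term, doc, docs) * (1.0 + chi_square(term, docs, labels))


# Toy tokenized queries: label 0 = benign, 1 = injection (made-up examples).
docs = [
    ["select", "name", "from", "users"],
    ["select", "id", "from", "orders"],
    ["select", "name", "from", "users", "union", "select", "password", "from", "admin"],
    ["select", "id", "from", "users", "or", "1", "=", "1"],
]
labels = [0, 0, 1, 1]
attack = docs[2]

for term in ("select", "union"):
    print(term, tf_idf(term, attack, docs), weighted_tf_idf(term, attack, docs, labels))
```

In this toy corpus plain TF-IDF ranks `select` slightly above `union` in the attack query, because `select` occurs twice there, even though `select` appears in every query and carries no class information. The chi-square factor reverses that ranking.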
