Computer and Modernization

FOCoR: A Course Recommendation Approach Based on Feature Selection Optimization

WANG Yang, CHEN Mei, LI Hui

2022, 0(10): 1-7.

To address the cold-start problem of recommendation models built on behavioral logs from online education platforms, we design FOCoR, a course recommendation method that integrates course selection data. We first propose a feature selection technique based on a genetic algorithm (FSBGA), and then feed the selected features into a recommendation model built on LightGBM, a gradient boosting tree framework. Specifically, FSBGA uses a fitness function that combines the model loss with the number of features, so the search over the feature subset space of university course selection data finds a subset that balances both. Measured by log loss, F1-score and AUC, the course selection model trained on the subset chosen by FSBGA outperforms models trained on subsets chosen by mutual-information or F-test methods. To verify the approach, we evaluate FOCoR, LightGBM, XGBoost, decision tree, random forest, logistic regression and other algorithms on real data sets; the results show that FOCoR achieves the best F1-score.
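
The fitness function described above, which combines model loss with the number of selected features, might look like the following minimal sketch. The weighted-sum form, the weight `alpha`, and the function name are assumptions for illustration; the abstract does not give the exact formula.

```python
def fitness(mask, log_loss, alpha=0.9):
    """Score a candidate feature subset; lower is better.

    mask     -- list of 0/1 genes, 1 meaning the feature is kept
    log_loss -- validation loss of a model trained on the kept features
    alpha    -- hypothetical weight trading loss against subset size
    """
    if not any(mask):
        return float("inf")  # an empty subset cannot train a model
    feature_ratio = sum(mask) / len(mask)
    return alpha * log_loss + (1 - alpha) * feature_ratio
```

With equal loss, a chromosome that keeps fewer features scores lower, which is what lets a genetic search prefer compact subsets.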

Text Classification Based on ALBERT Combined with Bidirectional Network

HUANG Zhong-xiang, LI Ming

2022, 0(10): 8-12.

To address the inability of current multi-label text classification algorithms to exploit deep text information effectively, we propose a model named ABAT. The ALBERT model extracts features from the deep text information, a bidirectional LSTM network is trained on those features, and an attention mechanism strengthens the final classification. Experiments are carried out on the DuEE1.0 data set released by Baidu. ABAT performs best among the compared models: Micro-Precision reaches 0.9625, Micro-F1 reaches 0.9033, and the Hamming loss drops to 0.0023. The experimental results show that the improved ABAT model handles multi-label text classification better.
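
For readers unfamiliar with the reported metrics, the sketch below computes Hamming loss and Micro-F1 from binary label matrices using their standard definitions. The function names and the toy labels in the usage note are illustrative and are not taken from the paper.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted wrongly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p
                for t_row, p_row in zip(y_true, y_pred)
                for t, p in zip(t_row, p_row))
    return wrong / total


def micro_f1(y_true, y_pred):
    """F1 computed from counts pooled over all samples and labels."""
    tp = fp = fn = 0
    for t_row, p_row in zip(y_true, y_pred):
        for t, p in zip(t_row, p_row):
            tp += (t == 1 and p == 1)
            fp += (t == 0 and p == 1)
            fn += (t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

For example, with `y_true = [[1, 0, 1], [0, 1, 0]]` and `y_pred = [[1, 0, 0], [0, 1, 1]]`, two of the six label slots are wrong, so the Hamming loss is 1/3, and Micro-F1 is 2/3.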

Climate Change Prediction in Canada Based on VAR Model

KOU Lu-yan, LIAO Jing, LI Xue-jun, WU Chang-shu, XIONG Jian-hua

2022, 0(10): 13-18.

The melting of Antarctic glaciers, the growing number of hurricanes and the gradual rise in sea level have made people aware of the great challenges posed by global warming, so research on global climate change is necessary. After missing-data imputation, we study data from four representative provinces in Canada and establish a vector autoregressive (VAR) model that considers solar radiation intensity, carbon dioxide content, soil water content, temperature, rainfall and other factors. The model is specified through a stability test, impulse response analysis and variance analysis, and is then used to predict temperature and precipitation in Canada. The experimental results show that over the next 25 years the average temperature in Canada will reach 15.0410 ℃ and the average precipitation will reach 2.0950 mm.
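
As a rough illustration of how a fitted VAR model produces forecasts, the sketch below iterates a first-order model, y[t+1] = c + A·y[t]. The order 1, the function name, and the coefficients in the usage note are assumptions; the abstract does not state the lag order or the fitted values.

```python
def var1_forecast(c, A, y, steps):
    """Iterate y[t+1] = c + A @ y[t] and return the forecast path.

    c -- intercept vector, A -- coefficient matrix (list of rows),
    y -- last observed vector of variables (e.g. temperature, rainfall)
    """
    path = []
    for _ in range(steps):
        y = [ci + sum(a * v for a, v in zip(row, y)) for ci, row in zip(c, A)]
        path.append(y)
    return path
```

With the made-up values `c = [1.0, 0.0]`, `A = [[0.5, 0.0], [0.0, 0.5]]` and `y = [2.0, 4.0]`, two steps give `[[2.0, 2.0], [2.0, 1.0]]`.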

An Outlier Detection Algorithm Based on Neighborhood Granular Entropy

DUAN Xun, YANG Zhi-yong, JIANG Feng

2022, 0(10): 19-23.

Outlier detection is an important research direction in data mining. Its purpose is to find the small portion of a data set that differs significantly from the other data objects. It has important applications in network intrusion detection, credit card fraud detection, medical diagnosis and other fields. Recently, rough set theory has been widely used in outlier detection. However, the classical rough set model cannot deal effectively with numerical and mixed data. Therefore, this paper employs the neighborhood rough set model to detect outliers and introduces a new information entropy model in neighborhood rough sets, called neighborhood granular entropy. Based on the neighborhood granular entropy, a new outlier detection algorithm called OD_NGE is proposed. Experimental results show that OD_NGE detects outliers better than the existing algorithms.
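
The neighborhood rough set model handles numerical data by grouping each object with all objects within a distance threshold. The sketch below shows that neighborhood relation only. The score function is a simple stand-in chosen for illustration and is not the neighborhood granular entropy defined in the paper; the threshold `delta` and all names are hypothetical.

```python
import math


def neighborhood(data, i, delta):
    """Indices of all objects within distance delta of object i (including i)."""
    return [j for j, x in enumerate(data) if math.dist(data[i], x) <= delta]


def sparsity_scores(data, delta):
    """Stand-in outlier score: 1 - |neighborhood| / N (not the paper's entropy)."""
    n = len(data)
    return [1 - len(neighborhood(data, i, delta)) / n for i in range(n)]
```

Objects with small neighborhoods receive high scores, which matches the intuition that outliers are isolated from the rest of the data.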

Prediction of Railway Freight Volume Based on GS-LSTM Model

ZHOU Chang-ye, LI Cheng

2022, 0(10): 24-28.

Accurate forecasts of railway freight volume are necessary for railway transportation companies to make marketing plans and decisions, and short-term forecasts are especially important. To improve prediction accuracy, this paper establishes the GS-LSTM model, a long short-term memory (LSTM) network in which the grid search algorithm optimizes the most important training parameters (batch size, number of hidden-layer units and learning rate). Based on monthly railway freight volume data from January 2005 to July 2021, BP and LSTM models are first established and their predictions compared: the MAPE of the LSTM model is 1.55 percentage points lower than that of the BP model. The network parameters of the BP and LSTM models are then optimized and compared: both optimized models predict better than their basic versions, and the MAPE of the optimized LSTM model is a further 0.18 percentage points lower than that of the BP model. The experimental results show that the optimized LSTM model predicts and generalizes better, and has good research and practical value.
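
The grid search over batch size, hidden units and learning rate, scored by MAPE, can be sketched as below. The `evaluate` callback stands in for training an LSTM and returning its validation MAPE; it and the grid values in the usage note are hypothetical, not the paper's settings.

```python
from itertools import product


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def grid_search(evaluate, grid):
    """Try every parameter combination and keep the one with the lowest score."""
    names = list(grid)
    best_params, best_score = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(**params)  # e.g. train a model, return validation MAPE
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

A call such as `grid_search(evaluate, {"batch_size": [16, 32, 64], "hidden_units": [32, 64, 128], "learning_rate": [0.001, 0.01, 0.1]})` evaluates all 27 combinations, which is why grid search is practical only for a small number of parameters.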

A Non-intrusive Load Monitoring Method Based on Improved kNN Algorithm and Transient Steady State Features

TIAN Feng, DENG Xiao-ping, ZHANG Gui-qing, WANG Bao-yi

2022, 0(10): 29-35.

Non-intrusive load monitoring (NILM) obtains the operating data of the electrical appliances in a circuit by analyzing the record from a single energy meter, and can serve as an important tool for energy-saving planning and optimal dispatching in the power grid. Existing NILM methods focus mainly on improving the accuracy of load identification, and their models are too complex to be applied on embedded devices. To solve these problems, a NILM method based on an improved kNN algorithm and transient and steady-state features is proposed. Firstly, the kNN algorithm is selected as the load identification model because it requires no training; it is improved with a statistical distance-weighting method, and a cosine similarity check is added to verify the kNN identification results. Secondly, transient and steady-state features are selected as load characteristics to improve the identification of load features. Finally, experimental data verify that the proposed NILM method has superior performance.
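
One plausible reading of "distance-weighted kNN with a cosine similarity check" is sketched below. The inverse-distance weighting, the threshold `min_cosine`, and the rule of rejecting a label when no same-label neighbor is similar enough are all assumptions; the abstract does not specify them.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def weighted_knn(samples, query, k=3, min_cosine=0.9):
    """samples: list of (feature_vector, label). Returns a label, or None if rejected."""
    neighbors = sorted(samples, key=lambda s: euclidean(s[0], query))[:k]
    votes = defaultdict(float)
    for features, label in neighbors:
        # closer neighbors get larger votes (inverse-distance weighting)
        votes[label] += 1.0 / (euclidean(features, query) + 1e-9)
    label = max(votes, key=votes.get)
    # verification step: accept only if a same-label neighbor points the same way
    best = max(cosine_similarity(f, query) for f, l in neighbors if l == label)
    return label if best >= min_cosine else None
```

Because kNN stores samples instead of fitting parameters, the only work at run time is the distance computation, which is what makes it attractive for embedded devices.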

Distribution Center Site Selection of Fresh Agricultural Products Based on Improved Simulated Annealing Algorithm

RAN Hao-jie, WANG Hong-zhi

2022, 0(10): 36-40.

When the traditional simulated annealing algorithm is used to solve complex nonlinear programming problems, cooling speed conflicts with solution quality, so the algorithm can no longer meet the needs of site selection for fresh agricultural product distribution centers. To solve this problem, this paper designs a site selection method based on an improved simulated annealing algorithm whose core idea is to fuse the genetic algorithm with simulated annealing. Firstly, chromosome individuals that encode distribution centers are introduced into the search step of the annealing process, and the chromosome sets that satisfy the conditions on the objective function parameters are screened. The improved simulated annealing algorithm is then applied to optimize the site selection process as a whole. Finally, the method is simulated on the distribution center site selection problem of company A in Shandong province. The experimental comparison shows that, across multiple site selection runs, the improved algorithm effectively reduces the large number of roundabout and invalid searches that occur in the late stage of traditional simulated annealing, and improves the efficiency of site selection for fresh agricultural product distribution centers.
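
For context, the sketch below shows only the classical simulated annealing baseline (Metropolis acceptance with geometric cooling) on a toy one-dimensional site selection problem. It does not include the genetic-algorithm fusion the paper proposes, and the cost function, parameters and data are invented for illustration.

```python
import math
import random


def acceptance_probability(delta, temperature):
    """Metropolis rule: always accept improvements, sometimes accept worse moves."""
    if delta <= 0:
        return 1.0
    return math.exp(-delta / temperature)


def total_cost(sites, customers):
    """Each customer is served by its nearest open site (1-D positions)."""
    return sum(min(abs(c - s) for s in sites) for c in customers)


def anneal(candidates, customers, n_sites, t0=10.0, cooling=0.95, steps=200, seed=0):
    rng = random.Random(seed)
    current = rng.sample(candidates, n_sites)
    current_cost = total_cost(current, customers)
    best, best_cost = list(current), current_cost
    temperature = t0
    for _ in range(steps):
        # neighbor move: swap one open site for an unused candidate
        unused = [c for c in candidates if c not in current]
        neighbor = list(current)
        neighbor[rng.randrange(n_sites)] = rng.choice(unused)
        cost = total_cost(neighbor, customers)
        if rng.random() < acceptance_probability(cost - current_cost, temperature):
            current, current_cost = neighbor, cost
            if cost < best_cost:
                best, best_cost = list(neighbor), cost
        temperature *= cooling  # faster cooling saves time but risks worse solutions
    return sorted(best), best_cost
```

The `cooling` factor exposes the trade-off named in the abstract: a value close to 1 searches longer at high temperature, while a small value freezes the search early.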

Impact Process of Discrete Media Based on HPC

WU Zhi-ping, LIU Bo-ping, WANG Kang, LI Shi-bin, HU Bi-wei, HU Bi-wei, YOU Jie

2022, 0(10): 41-46.

Based on an HPC platform, the impact process of discrete media is numerically simulated, and the variation of the particles' coefficient of restitution (COR) with impact velocity and particle size distribution is studied. For a given impact velocity, whether the particles are monodispersed or multi-dispersed, the COR gradually stabilizes as the chain length increases; the stable value depends only on the impact velocity and the size ratio of the impacted particles, and the impulse of the compression stage is a stable value that does not vary with chain length. Under that condition, the COR of the impact between the incident particle and the chain decreases as the impact velocity increases, but it is much larger than the COR of an impact with an infinite half-space solid wall. For multi-dispersed chains, the COR follows a typical Gaussian distribution, and the distribution depends on the impact velocity and the size ratio of the impacted particles.

ABAC Decision Recycling with Dynamic Policy Change

Gulbostan Akim, Nurmamat Helil

2022, 0(10): 47-54.

Attribute-based access control (ABAC) has become one of the most prominent access control models because it is flexible and highly expressive. However, in ABAC, the heavy policy query load on the policy decision point (PDP) and the communication between the PDP and the policy enforcement point (PEP) reduce the efficiency of access control decision making. Recycling access control decision results is an effective solution to this problem. This paper proposes an access control decision recycling approach for ABAC that supports dynamic policy change, together with policy recycling. The approach specifies how to create and update the cache of access control decisions and how to make precise and approximate access control decisions based on the contents of the cache. Finally, we verify the feasibility and effectiveness of the approach with a prototype system test. Test results show that the approach shortens the decision time of the access control system and reduces the burden on the PDP.
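
The basic idea of decision recycling can be sketched as a cache in front of the PDP. This minimal version covers precise recycling only (an exact match on attributes and action) and clears the whole cache on any policy change; the paper's approximate decisions and finer-grained cache updates are not modelled. The class and method names are hypothetical.

```python
class DecisionCache:
    """Reuse earlier PDP decisions for identical requests."""

    def __init__(self, pdp):
        self._pdp = pdp      # callable: (subject_attrs, resource_attrs, action) -> bool
        self._cache = {}
        self.pdp_calls = 0   # counts how often the PDP was actually consulted

    def decide(self, subject_attrs, resource_attrs, action):
        key = (frozenset(subject_attrs.items()),
               frozenset(resource_attrs.items()),
               action)
        if key not in self._cache:
            self.pdp_calls += 1
            self._cache[key] = self._pdp(subject_attrs, resource_attrs, action)
        return self._cache[key]

    def on_policy_change(self, new_pdp):
        # simplest safe reaction: cached decisions may be stale, so drop them all
        self._pdp = new_pdp
        self._cache.clear()
```

Repeated identical requests are answered at the PEP side without a round trip to the PDP, which is where the reduction in decision time would come from.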

A Hybrid Modeling Method for Real-time Prediction of Heat Collection in Solar Heat Collection Systems

LIU Hui, DING Xu-dong, YANG Dong-run, ZHANG Ying, LIU Zhong-chen, SUN Mei

2022, 0(10): 55-61.

The heat collection of a solar collector is influenced by many factors, such as light intensity, ambient temperature and wind speed, so prediction models struggle to meet user needs for both prediction accuracy and real-time performance. This paper proposes a hybrid modeling method to predict the real-time heat collection of solar collector systems. The method starts from energy conservation to derive a theoretical model of the solar collection system according to its heat transfer mechanism. Empirical parameters (such as the heat dissipation coefficient, transmissivity and absorptivity) and geometric parameters (such as the lighting area and cooling area) are lumped into the unknown parameters of the model, which gives the structure of the hybrid model. The TRNSYS simulation software is used to build a solar collector simulation system, and simulation experiments under different operating conditions are performed on it to obtain the steady-state data needed to identify the unknown parameters. Finally, the particle swarm optimization (PSO) algorithm identifies the unknown parameters of the model from the steady-state data. Comparison of the model predictions with the simulation data shows that the model is simple and accurate, and predicts the real-time heat collection of solar collector systems with a mean relative error of 2.02% under various working conditions. The model can be applied to the optimal control of solar heat pumps, solar water heaters and other systems.

Design of Intelligent Retrieval System for Massive Plant Images

QIU Jin-shui, ZHUANG Hui-fu, JIN Tao

2022, 0(10): 62-67.

Plant image retrieval systems built with traditional software technology have several problems: they cannot perform intelligent retrieval, the number of plant images grows slowly, the system is difficult to expand, retrieval efficiency is low once the number of plant images exceeds one million, and images load slowly under highly concurrent retrieval requests. To address these problems, Baidu AI technology, the image segmentation technology ImageSharp and the color recognition technology CV2 are used to realize intelligent retrieval of plant images. FastDFS is used to realize dynamic expansion, load balancing and rapid loading of plant images; the Solr search engine is used to improve the retrieval efficiency for massive plant images; and a Python crawler is used to continuously enrich the plant images in the system, so that the retrieval system can develop sustainably. The experimental results show that these technologies can be used to build an intelligent retrieval system for massive plant images.

Fine-grained Image Classification via Channel Adaptive Discriminative Learning

YANG Zhen, SHAN Meng-jiao, YIN Zhi-jian, YANG Fan, LI Cui-mei

2022, 0(10): 68-74.

Fine-grained image classification is very challenging because the available data are limited, intra-class differences are large and inter-class differences are small. Deep features have strong representation capability, and middle-layer features can effectively supplement information missing from global-level features in fine-grained image identification. To take advantage of these convolutional-layer features, this paper proposes a channel adaptive discriminative learning method for fine-grained image classification. In this method, the intermediate features are first aggregated along the channel direction to capture the target position, and classification is then performed on the information obtained by the interactive cascade of the region-of-interest features. The method can be trained end to end without any bounding box or part annotation. Extensive experiments on three common fine-grained image classification datasets (CUB-200-2011, Stanford Cars and FGVC-Aircraft) show that, compared with other methods, this method remains simple and reasonably efficient while improving accuracy.

Lightweight Vision Transformer Based on Separable Structured Transformations

HUANG Yan-hui, LAN Hai, WEI Xian

2022, 0(10): 75-81.

Because the Vision Transformer (ViT) model has a large number of parameters and requires many floating-point operations, it is difficult to deploy on portable or terminal devices. Because the attention matrix has a low-rank bottleneck, model compression algorithms and attention acceleration algorithms cannot balance the number of model parameters, inference speed and model performance well. To solve these problems, a lightweight ViT-SST model is designed. Firstly, transforming the traditional fully connected layer into a separable structure greatly reduces the number of model parameters and improves inference speed, while guaranteeing that a low-rank attention matrix does not destroy the model's expressive ability. Secondly, this paper proposes a Kronecker product approximate decomposition method based on SVD, which converts the pre-trained parameters of the public ViT-Base model into the ViT-Base-SST model. It slightly alleviates the overfitting of the ViT-Base model and improves its accuracy. Experiments on five common public datasets show that the proposed method is more suitable for Transformer-structured models than traditional compression methods.
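
To see why a separable structure cuts parameters so sharply, assume the dense weight matrix is replaced by a Kronecker product of two small factors, W ≈ A ⊗ B. That reading, the factor shapes in the usage note, and the function names are assumptions for illustration; the abstract does not give the exact layer definition.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def dense_params(rows, cols):
    """Parameter count of an ordinary fully connected weight matrix."""
    return rows * cols


def separable_params(rows_a, cols_a, rows_b, cols_b):
    """Parameter count when the weight is stored as two Kronecker factors."""
    return rows_a * cols_a + rows_b * cols_b
```

A 768 x 768 dense layer, the hidden width of ViT-Base, holds 589,824 weights. Storing it as a 32 x 32 factor and a 24 x 24 factor (32 · 24 = 768) takes 1,024 + 576 = 1,600 weights, while `kron` of the two factors still has the full 768 x 768 shape.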

Multi-modal Disaster Analysis Based on Embracing Fusion

MEI Xin, MIAO Zi-jing

2022, 0(10): 82-87.

Fusing multi-modal information from texts and images can improve the accuracy of disaster event analysis compared with a single modality. However, most existing works simply concatenate text features and image features, which produces redundant features during extraction and fusion and ignores the relationship between modalities, in particular the correlation between image features and text features. To this end, this article analyzes the currently popular multi-modal fusion algorithms and proposes a multi-modal disaster event analysis algorithm based on Embrace Fusion. First, the feature vectors of texts and images are compared with each other, so that the correlation between text features and image features is taken into account. Then, multinomial sampling is used to eliminate feature redundancy and fuse the text and image features. The experimental results show that Embrace Fusion reaches classification accuracies of 88.2% and 85.1% on the two tasks of the CrisisMMD2.0 dataset, significantly better than other multimodal fusion models, which proves the effectiveness of the model. A second experiment also verifies that the Embrace Fusion model applies to different deep learning models for text and images.
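
Fusion by multinomial sampling can be sketched as follows: for every position of the fused vector, one modality is drawn at random and its value at that position is kept. The sketch assumes the text and image features have already been projected to the same length; the probabilities, the function name and the seed parameter are illustrative.

```python
import random


def embrace_fuse(modalities, probs, seed=None):
    """Fuse equal-length feature vectors by sampling one modality per position.

    modalities -- list of feature vectors, one per modality
    probs      -- selection probability of each modality
    """
    rng = random.Random(seed)
    indices = range(len(modalities))
    fused = []
    for j in range(len(modalities[0])):
        k = rng.choices(indices, weights=probs)[0]  # which modality fills slot j
        fused.append(modalities[k][j])
    return fused
```

Because each slot comes from exactly one modality, the fused vector is no longer than a single modality's vector, which is how this scheme avoids the redundancy of plain concatenation.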

Low Latency Neighbor Discovery Algorithm in Asymmetric Asynchronous Mobile Sensor Networks

HUANG Ting-pei, ZHANG Ya, LI Shi-bao, LIU Jian-hang

2022, 0(10): 88-94.

Neighbor discovery is an important part of mobile sensor networks (MSN): each node must quickly and effectively sense the neighbors within one-hop range with which it can communicate directly. In asymmetric asynchronous MSN, existing neighbor discovery algorithms need a lot of time and energy to complete mutual discovery. To solve this problem, a BMCS-A algorithm for the asynchronous symmetric scenario is first proposed, based on a model that separates beacons from active time slots. The beacon is broadcast in different time slots of the work cycle to guarantee that neighbors are discovered. Secondly, BMCS-A is extended into BMCS-B, which uses persistent broadcast: a node continuously broadcasts the beacon in the first sub-cycle, and a node that receives the beacon adaptively adjusts its beacon sending time to speed up neighbor discovery. Finally, a cooperative BMCS-B algorithm is implemented: based on the sleep and wake-up schedules of the neighbors already discovered, nodes actively send beacons to discover potential neighbors. Simulation results show that, compared with Searchlight, G-Nihao and Disco, cooperative BMCS-B reduces the worst-case discovery delay by 84.62%, 85.71% and 81.82%, respectively.

Named Data Network Cache Optimization Strategy Based on Cache Value

YANG Hao, GAO Quan-li, LI Xue-hua, ZHAO Hui, JIN Shuai, XU Guo-liang

2022, 0(10): 95-99.

Traditional cache decision strategies in the named data network, such as LCE (leave copy everywhere), LCD (leave copy down) and Prob (copy with probability), use the router cache unreasonably, which leads to a low cache hit ratio, and require too many hops to satisfy user requests, which leads to a large delay. To solve these problems, a cache strategy based on cache value is proposed. In this strategy, the cache value of a packet is calculated from the number of hops travelled by the Interest packet, the size of the requested packet and the cache condition of the routing nodes that the Interest packet passes through, and packets are cached in appropriate nodes to improve the cache hit ratio. On this basis, considering the filtering effect of downstream nodes, an LFU-based cache replacement strategy that uses a dynamic cache value is proposed in place of the traditional LRU replacement strategy to further improve the cache hit ratio. Extensive simulation experiments demonstrate the effectiveness and availability of the proposed algorithm.

Dynamic Data Partition Algorithm for Information Network Model

YUAN Jia-li, LIU Meng-chi

2022, 0(10): 100-105.

To reduce the high communication overhead caused by complex cross-node queries in the distributed information network model (INM) database management system, a dynamic data partition and query processing algorithm is proposed. Based on the relationship characteristics of the INM model and historical relationship information, the algorithm obtains the initial relevance between data, then mines the potential relevance between data from historical query information and dynamically moves strongly correlated data to the same processing node, so as to reduce the number of cross-node traversals in complex queries. Extensive experiments are carried out on the synthetic dataset WatDiv. The experimental results show that, while keeping the number of objects and the proportion of relationship pairs balanced across nodes, the algorithm reduces query time by 35%~55% compared with the consistent hash algorithm in the same period, and keeps the time fluctuation of the same query across multiple periods within 5%~10%, which ensures the stability of complex queries.

An Off-chain Extended Storage Scheme for Blockchain Traceability

ZHANG Bin, LI Da-peng, JIANG Rui, WANG Xiao-ming

2022, 0(10): 106-112.

In a blockchain-based supply chain traceability system, every node backs up the data stored in the blockchain because the blockchain is a distributed system. If the traceability data is stored directly on the chain, it occupies a large amount of storage, increases the maintenance cost of the traceability system and reduces the system response speed. Therefore, an off-chain extended storage scheme is proposed. Firstly, the scheme uses the one-way property of the SHA-256 hash algorithm to hash the plaintext data, then signs the hash value with the private key generated by the SM2 algorithm to ensure the reliability of the identity of the information uploader. The hash value and signature value are then stored in the blockchain through a smart contract, while the plaintext data, together with the on-chain address of its hash value and signature value, is stored in the database. By combining the respective advantages of centralized storage and blockchain technology, the scheme ensures that the traceability data cannot be tampered with and effectively reduces the storage that the traceability data occupies in the blockchain network. Finally, based on the proposed scheme, the traceability system is designed in detail and implemented on the Ethereum blockchain platform.
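
The split between on-chain digests and off-chain plaintext can be sketched with the standard library alone. Here a Python list stands in for the blockchain and a dict for the database; the SM2 signature and the smart contract are omitted because they need components outside the standard library. All class and method names are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex digest of the SHA-256 hash of data."""
    return hashlib.sha256(data).hexdigest()


class TraceStore:
    def __init__(self):
        self.chain = []     # stand-in for on-chain storage: digests only
        self.database = {}  # off-chain: record id -> (plaintext, chain index)

    def put(self, record_id, plaintext: bytes):
        self.chain.append(sha256_hex(plaintext))
        self.database[record_id] = (plaintext, len(self.chain) - 1)

    def verify(self, record_id) -> bool:
        """Recompute the digest of the stored plaintext and compare with the chain."""
        plaintext, index = self.database[record_id]
        return sha256_hex(plaintext) == self.chain[index]
```

Each record costs a fixed 32-byte digest on the chain regardless of its size, and any change to the database copy makes `verify` return `False`.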

Improving Latency and Bandwidth Probe of BBR Congestion Control Algorithm

HUANG Hong-ping, ZHU Xiao-yong, WANG Zhi-yuan

2022, 0(10): 113-120.

Traditional loss-based congestion control algorithms cannot meet the network performance requirements of many applications because of their high packet loss rate and buffer bloat. The BBR (bottleneck bandwidth and round-trip propagation time) algorithm proposed by Google has attracted extensive attention and research because of its resistance to packet loss, high bandwidth utilization and low delay. However, BBR still has problems, such as high queuing delay, poor performance in environments with a small RTT (round trip time) and untimely bandwidth detection. This paper analyzes the queuing delay and convergence of BBR and then proposes three improvements: limit the inflight data and reduce the congestion window promptly according to network feedback to reduce delay; in small-RTT environments, retain the bandwidth estimate from before the probe RTT stage until after it; and set a maximum holding time for the steady state, so that the flow exits the steady cycle in time and enters the detection cycle. The simulation results in NS3 show that the improved BBR reduces the RTT and its jitter and improves the convergence speed of the algorithm, that bandwidth is used efficiently in small-RTT environments, and that the bandwidth probe frequency of long-RTT flows improves significantly.
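
Limiting inflight data in BBR revolves around the bandwidth-delay product (BDP), the bottleneck bandwidth multiplied by the minimum RTT. The sketch below shows that calculation and a cap on inflight bytes. The gain of 2, the segment size and the four-packet floor are common illustrative defaults, not values taken from the paper, and the paper's feedback-driven window reduction is not modelled.

```python
def bdp_bytes(bottleneck_bw_bps, min_rtt_s):
    """Bandwidth-delay product in bytes (bandwidth given in bits per second)."""
    return bottleneck_bw_bps * min_rtt_s / 8.0


def inflight_cap(bottleneck_bw_bps, min_rtt_s, cwnd_gain=2.0, mss=1460, min_packets=4):
    """Upper bound on unacknowledged bytes: gain x BDP, never below a small floor."""
    cap = cwnd_gain * bdp_bytes(bottleneck_bw_bps, min_rtt_s)
    return max(cap, min_packets * mss)


def may_send(inflight_bytes, packet_bytes, cap_bytes):
    """A sender may transmit only while it stays at or under the cap."""
    return inflight_bytes + packet_bytes <= cap_bytes
```

For a 10 Mbit/s bottleneck and a 40 ms minimum RTT, the BDP is 50,000 bytes and the cap is 100,000 bytes. With a 0.1 ms RTT the BDP is only 125 bytes, so the floor of four segments dominates, which hints at why small-RTT paths need special handling.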

Application of Multimodal Fusion TCN-SSDAEs-RF Method in SA Detection

YANG Juan, TENG Fei, GUO Da-lin

2022, 0(10): 121-126.

Traditional machine learning methods for sleep apnea (SA) detection require a great deal of feature engineering, which makes them inefficient, and models that extract features from single-channel signals recognize SA poorly. To solve these problems, a multimodal feature fusion model based on a temporal convolutional network (TCN) and stacked sparse denoising auto-encoders (SSDAEs) is proposed to extract features automatically. The model takes two signals, ECG and respiration, as input. First, the TCN extracts the temporal features of the input signals, and the SSDAEs then extract shallow and deep high-dimensional features. The ECG and respiratory features from different feature spaces are fused by a small neural network, and the model is combined with the random forest algorithm to detect SA segments. The experimental results show that the accuracy, sensitivity and specificity of this method in detecting SA segments are 91.5%, 88.9% and 90.8%, respectively. Comparison with previous related studies verifies that the model detects SA better and more efficiently.
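
The three reported metrics follow directly from the confusion matrix of a binary detector. The sketch below uses their standard definitions; the counts in the usage note are invented for illustration and are not the paper's results.

```python
def detection_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn -- apnea segments detected / missed
    tn/fp -- normal segments correctly passed / falsely flagged
    """
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # share of apnea segments that are caught
        "specificity": tn / (tn + fp),  # share of normal segments left alone
    }
```

For example, `detection_metrics(tp=80, tn=90, fp=10, fn=20)` gives an accuracy of 0.85, a sensitivity of 0.8 and a specificity of 0.9.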
