(1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China;
2. School of Computer Science and Engineering, Nanjing University of Science & Technology, Nanjing 210094, China)
ZHAO Cheng-long1, SHI Hui-bin1, YU Xin-feng2. Dual GPU Radix Sort Algorithm Based on OpenCL[J]. Computer and Modernization, doi: 10.3969/j.issn.1006-2475.2015.01.005.
[1]Huang Bonan, Gao Jinlan, Li Xiaoming. An empirically optimized radix sort for GPU[C]//Proceedings of the 2009 IEEE International Symposium on Parallel and Distributed Processing with Applications. 2009:234-241.
[2]Zehra Yildiz, Musa Aydin, Guray Yilmaz. Parallelization of bitonic sort and radix sort algorithms on many core GPUs[C]//Proceedings of the 2013 IEEE International Conference on Electronics Computer and Computation (ICECCO). 2013:326-329.
[3]Marco Zagha, Guy E Blelloch. Radix sort for vector multiprocessors[C]//Proceedings of the 1991 ACM/IEEE Conference on Supercomputing. 1991:712-721.
[4]Rakesh N, Nitin N. Parallel prefix sum computation on multi mesh of trees[C]//Proceedings of the 2009 Annual IEEE India Conference (INDICON). 2009:1-4.
[5]Nan Zhang. A novel parallel prefix sum algorithm and its implementation on multicore platforms[C]//Proceedings of the 2010 2nd International Conference on Computer Engineering and Technology. 2010:66-70.
[6]Capannini G. Designing efficient parallel prefix sum algorithms for GPUs[C]//Proceedings of the 2011 IEEE 11th International Conference on Computer and Information Technology (CIT). 2011:189-196.
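[7]Peng Haiyang, Yang Hongyu, Yang Guang. AES encryption implemented on GPU[J]. Computer Technology and Development, 2013,23(2):241-244. (in Chinese)
[8]Zeng Wenquan, Hu Yugui, He Yongjun, et al. A GPU-accelerated Gaussian blur algorithm based on OpenACC[J]. Computer Technology and Development, 2013,23(7):147-149. (in Chinese)
[9]Zhan Yun, Zhao Xincan, Tan Tongde. Parallel programming of heterogeneous systems based on OpenCL[J]. Computer Engineering and Design, 2012,33(11):4191-4195. (in Chinese)
[10]Chen Gang, Wu Baifeng. GPU performance optimization for the OpenCL model[J]. Journal of Computer-Aided Design & Computer Graphics, 2011,23(4):571-581. (in Chinese)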