Computer and Modernization ›› 2022, Vol. 0 ›› Issue (01): 41-53.


Review of Big Data Workflow Orchestration and Management System in Cloud Environment

  

  1. (The Fifth Elementary Department, North China Institute of Computing Technology, Beijing 100083, China)
  • Online: 2022-01-24 Published: 2022-01-24

Abstract: As big data analysis and processing requirements grow more complex, analysis and processing procedures need to be expressed as big data workflows, built from tasks and inter-task dependencies, so that they can be executed in a structured, repeatable, controllable, scalable, and automated manner. Big data workflow orchestration and management has become an important research topic, and the heterogeneity of resources in the cloud computing environment makes the problem more complicated. This paper first divides research on big data workflow orchestration and management in the cloud environment into four aspects: big data workflow composition, workflow fragmentation, task scheduling and execution, and fault tolerance. On this basis, it reviews classic and widely followed studies of recent years in each aspect. It then classifies the mainstream technologies used in these studies and analyzes the methods each study proposes, along with their characteristics, advantages, and areas for improvement. Finally, it returns to the perspective of the big data analysis and processing system and classifies and analyzes the benefits that the various studies bring to such a system.

Key words: big data, cloud computing, data analysis, workflow, orchestration and management