05 July 2023, Volume 38 Issue 3
Abstract
Cloud computing is an on-demand model of computing that uses virtualization technology to offer resources such as CPU, memory, storage, and network to customers in the form of virtual machines. As a result, much of the big data analytics in modern enterprise applications runs in the cloud. Because resources in private clouds are limited, the ultimate goal is to maximize resource utilization and provide guaranteed service to users by scheduling tasks and resources efficiently. However, existing schedulers in big data processing systems do not consider both application performance and resource utilization when performing allocations. It is therefore difficult to design workflows that achieve both low turnaround time and high resource utilization in big data systems. In this paper, we propose a resource management system for efficient job scheduling, called RMS, which dynamically schedules big data jobs on Kubernetes cluster nodes for Spark applications and autonomously adjusts scheduling policies in heterogeneous node clusters to improve application execution and resource utilization. The RMS mechanism ensures that sufficient guidance and resources are available to meet its planning objectives while maintaining a satisfactory level of resource utilization. The experimental analysis evaluates different RMS and performance preferences using different methods, based on the predicted completion time and benchmark statistics from traces of different big data performance indicators. The results show that RMS reduces cost and scheduling overhead and improves job execution performance.
Keywords
Cloud computing, Big Data, Job Scheduling, Resource Management, Kubernetes, Spark.