Fault-Tolerance in the Scope of Cloud Computing

Rehman, A. ; Aguiar, R. ; Barraca, J. P.

IEEE Access, Vol. 10, Nº 2, pp. 63422-63441, June 2022.

ISSN (print): 2169-3536

Scimago Journal Ranking: 0.93 (in 2021)

Digital Object Identifier: 10.1109/ACCESS.2022.3182211

Fault-tolerance methods are required to ensure high availability and high reliability in cloud computing environments. In this survey, we address fault-tolerance in the scope of cloud computing. Cloud computing-based environments have recently presented new challenges in supporting fault-tolerance and have opened new paths for developing novel strategies, architectures, and standards. We provide a detailed background on cloud computing to establish a comprehensive understanding of the subject, from basic to advanced concepts. We then highlight fault-tolerance components and system-level metrics, and identify the needs and applications of fault-tolerance in cloud computing. Next, we discuss state-of-the-art proactive and reactive approaches to cloud computing fault-tolerance, and we structure and discuss current research efforts on fault-tolerance architectures and frameworks. Finally, we conclude by enumerating future research directions specific to cloud computing fault-tolerance development.