Optimized Big Data Management across Multi-Cloud Data Centers: Software-Defined-Network-Based Analysis

Chaudhary, R. ; Aujla, G. ; Kumar, N. K. ; Rodrigues, J. R.

IEEE Communications Magazine, Vol. 56, No. 2, pp. 118-126, February 2018.

ISSN (print): 0163-6804

Journal Impact Factor: 4.007 (in 2014)

Digital Object Identifier: 10.1109/MCOM.2018.1700211

With the exponential growth in smart device users, the volume of data generated by these devices is increasing, and this data varies along all the essential V’s used to characterize big data. Most service providers, including Google, Amazon, and Microsoft, have deployed large numbers of geographically distributed data centers to process this data so that users get quick response times. Hadoop and Spark are widely used by these service providers for processing large datasets. However, less emphasis has been placed on the underlying infrastructure (the network through which data flows), which is one of the most important components for the successful implementation of any solution designed for this environment. In the worst case, heavy network traffic caused by data migrations across data centers may leave the underlying network infrastructure unable to transfer data packets from source to destination, resulting in performance degradation. To address these issues, in this article we propose a novel SDN-based big data management approach that optimizes the consumption of network resources such as network bandwidth and data storage units. We analyze various components at both the data and control planes that can enhance big data analytics across multiple cloud data centers. For example, we analyze the performance of the proposed solution using Bloom-filter-based insertion and deletion of an element in the flow table maintained at the OpenFlow controller, which makes most of the decisions for network traffic classification using a rule-and-action-based mechanism. Using the proposed solution, developers can deploy and analyze real-time traffic behavior for future big data applications in a multi-cloud environment (MCE).
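
The Bloom-filter-based insertion and deletion mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which Bloom filter variant the authors use; because a plain bit-array Bloom filter cannot delete elements, this sketch assumes a counting Bloom filter. The class name, the `flow_key` helper, the filter size, the number of hash functions, and the sample flow fields are all hypothetical and are not taken from the article.

```python
import hashlib


class CountingBloomFilter:
    """Counting Bloom filter: small counters instead of bits, so deletion works.

    Assumption: this variant is chosen for illustration only; the article's
    abstract does not specify the filter design.
    """

    def __init__(self, num_counters=1024, num_hashes=3):
        self.num_counters = num_counters
        self.num_hashes = num_hashes
        self.counters = [0] * num_counters

    def _positions(self, key):
        # Derive one counter position per hash function by salting SHA-256
        # with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_counters

    def insert(self, key):
        for pos in self._positions(key):
            self.counters[pos] += 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.counters[pos] > 0 for pos in self._positions(key))

    def delete(self, key):
        # Refuse to decrement for keys that are definitely absent, so that
        # counters never go negative.
        if not self.might_contain(key):
            return False
        for pos in self._positions(key):
            self.counters[pos] -= 1
        return True


def flow_key(src_ip, dst_ip, dst_port, proto):
    # Hypothetical flow identifier built from a few header fields.
    return f"{src_ip}|{dst_ip}|{dst_port}|{proto}"


flows = CountingBloomFilter()
key = flow_key("10.0.0.1", "10.0.1.7", 50010, "tcp")

flows.insert(key)
present_after_insert = flows.might_contain(key)

removed = flows.delete(key)
present_after_delete = flows.might_contain(key)
```

A membership test of this kind can report false positives but never false negatives, so a controller using it would still need to confirm a hit against the actual flow table entry before acting on it.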