Chaturvedi, S and Tyagi, S and Simmhan, Y (2021) Cost-Effective Sharing of Streaming Dataflows for IoT Applications. In: IEEE Transactions on Cloud Computing, 9 (4). pp. 1391-1407.
Abstract
Internet of Things (IoT) applications are often designed as dataflows that analyze sensor data in real time to make decisions. Stream processing systems like Apache Storm execute these on Cloud infrastructure. As IoT applications within shared data environments like smart cities grow, they will duplicate tasks like pre-processing and analytics. This offers the opportunity to collaboratively reuse the outputs of overlapping dataflows, improving resource efficiency on Clouds. We propose dataflow reuse algorithms that, when given a submitted dataflow, identify the intersection of reusable tasks and streams from existing dataflows to form a merged dataflow, with guaranteed equivalence of their output streams. Algorithms to unmerge dataflows when they are removed, and to defragment partially reused dataflows, are also proposed. We implement these algorithms for the Storm fast-data platform, and validate their performance and resource savings using 86 real and synthetic dataflows from eScience and IoT domains. Our reuse strategies reduce the number of running tasks by 34-45 percent and the cumulative CPU usage by 29-63 percent. Including defragmentation of incremental dataflows achieves monetary savings on Cloud resources of 36-44 percent compared to dataflows without reuse, and has limited redeployment overheads. © 2013 IEEE.
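The merge and unmerge idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm or their Storm implementation: the `Task` type, the `signature`, `merge`, and `unmerge` functions, and the example operator names are all hypothetical. The sketch assumes that two tasks are equivalent when they have the same operator type, the same configuration string, and equivalent upstream tasks in the same input order, and that each dataflow is acyclic. Defragmentation is not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str            # label unique within its own dataflow (hypothetical)
    kind: str            # operator type, e.g. "source" or "parse"
    config: str          # operator settings, compared verbatim
    inputs: tuple = ()   # names of upstream tasks in the same dataflow


def signature(task, by_name, cache):
    """Structural key: operator type, config, and the keys of its inputs."""
    if task.name not in cache:
        upstream = tuple(signature(by_name[i], by_name, cache)
                         for i in task.inputs)
        cache[task.name] = (task.kind, task.config, upstream)
    return cache[task.name]


def merge(running, dataflow):
    """Add a dataflow to `running` (signature -> number of users).

    Returns (reused, new): task names that match an already running
    task, and task names that have to be deployed.
    """
    by_name = {t.name: t for t in dataflow}
    cache = {}
    reused, new = [], []
    for t in dataflow:
        sig = signature(t, by_name, cache)
        (reused if sig in running else new).append(t.name)
        running[sig] = running.get(sig, 0) + 1
    return reused, new


def unmerge(running, dataflow):
    """Remove a dataflow; return the tasks that no other dataflow uses."""
    by_name = {t.name: t for t in dataflow}
    cache = {}
    stopped = []
    for t in dataflow:
        sig = signature(t, by_name, cache)
        running[sig] -= 1
        if running[sig] == 0:
            del running[sig]
            stopped.append(t.name)
    return stopped


# Two hypothetical dataflows that share a source and a parsing step.
flow_a = [
    Task("src", "source", "traffic"),
    Task("parse", "parse", "csv", ("src",)),
    Task("avg", "average", "window=60", ("parse",)),
]
flow_b = [
    Task("s", "source", "traffic"),
    Task("p", "parse", "csv", ("s",)),
    Task("alert", "threshold", "limit=80", ("p",)),
]

running = {}
first = merge(running, flow_a)
second = merge(running, flow_b)
tasks_after_merge = len(running)
stopped = unmerge(running, flow_a)
tasks_after_unmerge = len(running)
```

In this example the second dataflow reuses the shared source and parser and deploys only its threshold task, so four tasks run instead of six. Removing the first dataflow stops only its averaging task, because the shared tasks still have a user. Reference counting is one simple way to decide when a shared task can be stopped; the paper's own bookkeeping may differ.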
Item Type: | Journal Article |
---|---|
Publication: | IEEE Transactions on Cloud Computing |
Publisher: | Institute of Electrical and Electronics Engineers Inc. |
Additional Information: | The copyright for this article belongs to Institute of Electrical and Electronics Engineers Inc. |
Keywords: | Cost effectiveness; Data flow analysis; Data handling; Distributed parameter control systems; Storms, Cloud elasticities; Cost effective; Dataflow; Dataflow reuse; Distributed stream processing; Fast data; Real-time; Reuse; Sensors data; Stream processing systems, Internet of things |
Department/Centre: | Division of Interdisciplinary Sciences > Computational and Data Sciences |
Date Deposited: | 04 Jan 2022 05:28 |
Last Modified: | 04 Jan 2022 05:28 |
URI: | http://eprints.iisc.ac.in/id/eprint/70850 |