
A Distributed Path Query Engine for Temporal Property Graphs

Ramesh, S and Baranawal, A and Simmhan, Y (2020) A Distributed Path Query Engine for Temporal Property Graphs. In: 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGRID 2020, 11-14 May 2020, Australia, pp. 499-508.

Official URL: https://dx.doi.org/10.1109/CCGrid49817.2020.00-43

Abstract

Property graphs are a common form of linked data, with path queries used to traverse and explore them for enterprise transactions and mining. Temporal property graphs are a recent variant in which time is a first-class entity to be queried over, and their properties and structure vary over time. These are seen in social, telecom and transit networks. However, current graph databases and query engines have limited support for temporal relations among graph entities, have no support for time-varying entities, and/or do not scale on distributed resources. We address this gap by extending a linear path query model over property graphs to include intuitive temporal predicates that operate over temporal graphs. We design a distributed execution model for these temporal path queries using the interval-centric computing model, and develop a novel cost model to select an efficient execution plan from several. We perform detailed experiments of our G distributed query engine using temporal property graphs as large as 52M vertices, 218M edges and 118M properties, and an 800-query workload derived from the LDBC benchmark. We offer sub-second query latencies in most cases, which is 149×-1140× faster than the industry-leading Neo4J shared-memory graph database and the JanusGraph/Spark distributed graph query engine. Further, our cost model selects a query plan that is within 10% of the optimal execution time in 90% of the cases. We also scale well, and complete 100% of the queries for all graphs, compared to only 32-92% by baseline systems. © 2020 IEEE.

Item Type: Conference Paper
Publication: Proceedings - 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGRID 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Additional Information: The copyright of this article belongs to the Institute of Electrical and Electronics Engineers Inc.
Department/Centre: Division of Interdisciplinary Sciences > Computational and Data Sciences
Date Deposited: 17 Aug 2020 12:14
Last Modified: 17 Aug 2020 12:14
URI: http://eprints.iisc.ac.in/id/eprint/66319
