
GoDB: From Batch Processing to Distributed Querying over Property Graphs

Jamadagni, Nitin and Simmhan, Yogesh (2016) GoDB: From Batch Processing to Distributed Querying over Property Graphs. In: 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), May 16-19, 2016, Cartagena, Colombia, pp. 281-290.

Official URL: http://dx.doi.org/10.1109/CCGrid.2016.105

Abstract

Property graphs with rich attributes over vertices and edges are becoming common. Querying and mining such linked Big Data is important for knowledge discovery. Distributed graph platforms like Pregel focus on batch execution on commodity clusters, but exploratory analytics requires platforms that are both responsive and scalable. We propose Graph-oriented Database (GoDB), a distributed graph database that supports declarative queries over large property graphs. GoDB builds upon our GoFFish subgraph-centric batch processing platform, leveraging its scalability while using execution heuristics to offer responsiveness. The GoDB declarative query model supports vertex, edge, path and reachability queries, which are translated to a distributed execution plan on GoFFish. We also propose a novel cost model to choose a query plan that minimizes the execution latency. We evaluate GoDB deployed on the Azure IaaS Cloud, over real-world property graphs and for a diverse workload of 500 queries. The results show that the cost model selects the optimal execution plan at least 80% of the time, and helps GoDB weakly scale with the graph size. A comparative study with Titan, a leading open-source graph database, shows that GoDB completes all queries, each in at most 1.6 seconds, while Titan cannot complete up to 42% of some query workloads.

Item Type: Conference Proceedings
Additional Information: Copyright for this article belongs to the IEEE, 345 E 47TH ST, NEW YORK, NY 10017 USA
Department/Centre: Division of Interdisciplinary Research > Computational and Data Sciences
Depositing User: Id for Latest eprints
Date Deposited: 22 Oct 2016 10:25
Last Modified: 23 Oct 2018 14:50
URI: http://eprints.iisc.ac.in/id/eprint/55116
