Efficient Execution of User-Provided Graph Algorithms in a Graph Database
Master Project of Daan de Graaf


Abstract

Graph databases are systems to efficiently store and query large graphs. As graph databases grow in popularity, they are used to answer increasingly diverse and complex queries. However, graph databases typically have a very limited query language that cannot express arbitrary algorithms. As a result, many users treat the database as a storage layer to export data from and develop algorithms in external tools, wasting computation power and storage space.

We present graphalg, a high-level, domain-specific language for writing graph algorithms embedded into traditional graph queries. Our language is based on linear algebra, with a syntax resembling GraphBLAS, and implemented in the AvantGraph database.

We implement a compiler for graphalg that can target an interpreter built on top of a GraphBLAS implementation. Alternatively, our compiler can transform graphalg programs into a relational algebra with loops, unifying the representation of query and algorithm. We evaluate the programmability and performance of our system on the GAP Benchmark Suite for graph algorithms. Our language is expressive enough to concisely represent all GAP benchmark programs, with the majority of programs achieving performance comparable to an optimized C implementation.

We conclude that graph algorithm support can be integrated into graph databases to increase their programmability. Running graph algorithms inside of the database increases performance and reduces memory consumption compared to using external tools for the analysis. Rather than thinking of graph databases as limited tools for answering simple queries, we demonstrate that they can instead be a programmable framework for efficient large-scale data analysis.


Efficient Execution of User-Provided Graph Algorithms in a Graph Database

Student: Daan de Graaf
Supervisor(s): Soham Chakraborty
Defended: August 30, 2023