Algebra of Graphs

Posted on February 10, 2021

For with much wisdom comes much sorrow; the more knowledge, the more grief.

Ecclesiastes 1:18

Intro

A friend of mine with tons of experience in computers was asked to take a home assignment proving that he can program graphs that can:

  • hold 100M vertices with 50 edges per vertex without maxing out 128 gigs of ram
  • answer adjacency queriesFollowers and following queries in terms of social networks.

    taking direction
  • answer directed pathFriends of friends queries ( they are located two levels bellow the vertex in question in its directed path).

    and level queries
  • have excellent space & time characteristics

He didn’t feel right to badly re-implement already existing and awesome implementations, using hand crafted adjacency listsThe adjacency list is a two-dimensional structure, where each element of the first dimension represents a vertex, and each of the vertices contains a one-dimensional structure that is its edge list… read more

, adjacency matrices

adjacency matrix read more

, compressed bitsetsThe bitset data structure is a clever way to represent efficiently sets of integers. It supports fast set operations such as union, difference, intersection. read more

, only to have to deal afterwards with bfsBreadth-first search (BFS) is an algorithm for traversing or searching tree or graph data structures. read more

, dfsDepth-first search (DFS) is an algorithm for traversing or searching tree or graph data structures. read more

, topological sortLinear ordering of a directed graph’s vertices such that for every directed edge uv from vertex u to vertex v, u comes before v in the ordering. read more

, and the lot…

I guess he just wanted to mourn over his ego for it was hurt times and times again by tech interviews wanting to prove that after 20+ years of coding he knows his Sets, Maps and Lists.

Anyway. That’s his problem. I, on the other hand, felt that this was an opportunity to dust out my graph theory and check out some novel research by my fellow functional programming geeks.

Graphs 101

I’ve never liked working with graphs because the commonly used type is something like:

type Vertex = number
type Graph = {
    vertices : Vertex[]
    edges : [Vertex,Vertex][]
}

So this:

const good : Graph = {
    vertices : [1,2,3],
    edges : [[1,2],[2,3]]
}

is the graph with three vertices [1,2,3] and two edges [[1,2],[2,3]]

But what about:

const bad : Graph = {
    vertices : [1],
    edges : [[1,2]]
}

The edge refers to a vertex 2 that is simply not there. Working with this data types is a source of a lot runtime checks, NPE, and lack of general type safety.

Algebraic Graphs

Fortunately for me I found this fellow Andrey Mokhov by stumbling on his whitepaper of algebraic graphs . I was pretty hyped that his first words were:

Graphs are ubiquitous in computing, yet working with graphs often requires painfully low-level fiddling with sets of vertices and edges.

Who likes working with untyped sets of vertices and edges ? Not me for sure. I kept reading and I was amazed by the sheer quality and knowledge coming from the whitepaper. It’s a highly recommended read !!!

Then I found out that he gave two talks on the subject of algebraic graphs that you can watch here and here . The second one is longer and more recent, but it requires to create an account in order to watch it.

So far we have a whitepaper, talks, but what about an implementation ?

Well Andrey (aka snowleopard) got us covered. For me the most important thing is, that not only that he had succeeded in creating an awesome api, but he also managed to have an excellent space and time characteristics for his lib.

Alga can handle graphs comprising millions of vertices and billions of edges in a matter of seconds, which is fast enough for many applications

I guess it will not be boostBoost Graph Library read more

level fast, but it’s pretty much where it should. You can find links to the benchmarks in github.

Implementing the Home Assignment

I haven’t touched Haskell for more or less a year, but I braced myself and delivered. Here is a Github link to the implementation. You can download binaries for your OS from here. It has a simple but functional cli that can help you:

  • create graphs by hand
  • generate random graphs a la Erdős–RényiIn the model of Erdős and Rényi for generating random graphs, all graphs on a fixed vertex set with a fixed number of edges are equally likely read more

  • answer popular queries
  • visualize queries ( export to DOT / Graphviz )

Check out the github page for the source & tutorial how to use the cli.

Much love and prosperity. Bobby, out.