Marko A. Rodriguez

A&nbsp;<a href="http://tinkerpop.apache.org/gremlin.html">Gremlin traversal</a>&nbsp;is an abstract description of a legal path through a graph. In the beginning, a single traverser is created that will birth more traversers as a function of the instructions dictated by the traversal. A branching familial tree of traversers is generated from this one primordial, patient zero, adamic traverser. Many traversers will die along the way. They will be filtered out, they will walk down dead-end subgraphs, or they will meet other such fates which conflict with the specification of the traversal as defined by the user (the true sadist in this story). However, the traversers that are ultimately returned are the result of a traverser lineage that has survived the traversal-guided journey across the graph. These traversers are recognized for the answers they provide, but it is only because of the unsung heroes that died along the way that we know that their results are&nbsp;<a href="https://en.wikipedia.org/wiki/Soundness">sound</a>&nbsp;and&nbsp;<a href="https://en.wikipedia.org/wiki/Completeness_(logic)">complete</a>.

<hr />
Consider the following&nbsp;<code>Traversal</code>, which answers the question:&nbsp;"What is the distribution of labels of the vertices known by people?"&nbsp;That is, what are the types and counts of the things that people know? This traversal assumes a graph where a person might know an animal, a robot, or just maybe, another person. The result is a&nbsp;<code>Map</code>&nbsp;such as&nbsp;<code>[person:107, animal:1252, robot:256]</code>.

<table border="0" cellpadding="0" cellspacing="0">
	<tbody>
		<tr>
			<td>
			<code>g.V().hasLabel(</code><code>"person"</code><code>).</code>

			<code>&nbsp;&nbsp;</code><code>out(</code><code>"knows"</code><code>).label().</code>

			<code>&nbsp;&nbsp;</code><code>groupCount()</code>
			</td>
		</tr>
	</tbody>
</table>

Every legal path of the traversal through the graph is walked by a&nbsp;<code>Traverser</code>. A traverser holds a reference to both its current object in the graph (e.g. a&nbsp;<code>Vertex</code>) and its current&nbsp;<code>Step</code>&nbsp;in the traversal (e.g.&nbsp;<code>label()</code>). If the traverser is currently at vertex&nbsp;<code>v[1]</code>&nbsp;and step&nbsp;<code>label()</code>, then the traverser will walk to the&nbsp;<code>String</code>&nbsp;label of&nbsp;<code>v[1]</code>. As such,&nbsp;<code>label()</code>&nbsp;executes a&nbsp;one-to-one&nbsp;mapping (<code>MapStep</code>), as a vertex can have one and only one label. A&nbsp;one-to-many&nbsp;mapping (<code>FlatMapStep</code>) occurs if the traverser is currently at&nbsp;<code>v[1]</code>&nbsp;and at the step&nbsp;<code>out("knows")</code>. In this situation, the traverser will branch the traverser family tree by&nbsp;splitting&nbsp;itself across all "knows"-adjacent vertices of&nbsp;<code>v[1]</code>. A&nbsp;many-to-one&nbsp;mapping occurs via a&nbsp;<code>ReducingBarrierStep</code>, which aggregates all the traversers up to that step and then emits a single traverser representing an analysis of that aggregate. The&nbsp;<code>groupCount()</code>-step is an example of a reducing barrier step. Finally, there is a&nbsp;one-to-maybe&nbsp;mapping (<code>FilterStep</code>). The step&nbsp;<code>hasLabel("person")</code>&nbsp;will either let the traverser pass if it is at a person vertex or it will filter it out of the data stream.
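The four step forms can be illustrated with a minimal Python sketch of the example traversal. This is not TinkerPop code: the toy <code>GRAPH</code> dictionary and the function names are assumptions made for illustration, and a traverser is reduced to the object it currently references.

```python
from collections import Counter

# Toy graph (hypothetical data): vertex id -> (label, ids of "knows"-adjacent vertices).
GRAPH = {
    1: ("person", [2, 3, 4]),
    2: ("person", []),
    3: ("animal", []),
    4: ("robot", []),
    5: ("animal", [1]),
}

def has_label(traversers, label):
    """Filter (one-to-maybe): a traverser either passes or is dropped."""
    return (v for v in traversers if GRAPH[v][0] == label)

def out_knows(traversers):
    """FlatMap (one-to-many): a traverser splits across its adjacent vertices."""
    return (w for v in traversers for w in GRAPH[v][1])

def label(traversers):
    """Map (one-to-one): a traverser moves from a vertex to that vertex's label."""
    return (GRAPH[v][0] for v in traversers)

def group_count(traversers):
    """Reducing barrier (many-to-one): all traversers collapse into a single map."""
    return dict(Counter(traversers))

# Rough analogue of g.V().hasLabel("person").out("knows").label().groupCount()
result = group_count(label(out_knows(has_label(GRAPH.keys(), "person"))))
print(result)
```

Traversers at vertices 3, 4 and 5 are filtered out by the first step, and the traverser at vertex 2 dies at a dead end because that vertex knows no one. Only the lineage started at vertex 1 contributes to the result.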

The generic forms of the step instances mentioned above are the fundamental processes of any Gremlin traversal. It is important to note that a traversal does not define how these processes are to be evaluated. It is up to the <em>Gremlin traversal machine</em> to determine the means by which the traversal is executed. The Gremlin traversal machine is an&nbsp;<a href="https://en.wikipedia.org/wiki/Virtual_machine#Abstract_virtual_machine_techniques">abstract computing machine</a>&nbsp;that is able to execute Gremlin traversals against any TinkerPop-enabled&nbsp;graph system. In general, the machine's algorithm moves traversers (pointers) through a graph (data) as dictated by the steps (instructions) of the traversal (program). The Gremlin traversal machine distributed by Apache TinkerPop™ provides two implementations of this algorithm.

<img alt="furnace" data-entity-type="file" data-entity-uuid="6f43909b-c0f6-451d-91a8-204e3ef6dcb9" src="https://www.datastax.com/sites/default/files/inline-images/furnace-character-1.png" />

<ol>
	<li>Chained Iterator Algorithm&nbsp;(<a href="https://en.wikipedia.org/wiki/Online_transaction_processing">OLTP</a>): Each step in the traversal reads an iterator of traversers from "the left" and outputs an iterator of traversers to "the right" in a stream-based,&nbsp;<a href="https://en.wikipedia.org/wiki/Lazy_evaluation">lazy fashion</a>. This is also known as the standard OLTP execution model.</li>
	<li>Message Passing Algorithm&nbsp;(<a href="https://en.wikipedia.org/wiki/Online_analytical_processing">OLAP</a>): In a distributed environment, each step is able to read traverser&nbsp;messages&nbsp;from "the left" and write traverser&nbsp;messages&nbsp;to "the right." If a message references an object that is locally accessible, then the traverser message is further processed. If the traverser references a remote object, then the traverser is serialized and it continues its journey at the remote location. This is also known as the computer OLAP execution model.</li>
</ol>
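The difference between the two algorithms can be sketched in plain Python for a two-step traversal along the lines of <code>g.V().out().out()</code>. The toy adjacency map <code>ADJ</code> and both function names are assumptions made for illustration. The sketch only mimics the control flow (lazy pulling versus pushing messages) and does not model distribution or serialization.

```python
from collections import deque

# Hypothetical toy data: vertex id -> ids of out-adjacent vertices.
ADJ = {1: [2, 3], 2: [3], 3: []}

def pull_based(start):
    """Chained iterators: each step lazily pulls traversers from the step to its left."""
    step1 = (w for v in start for w in ADJ[v])   # first out()
    step2 = (w for v in step1 for w in ADJ[v])   # second out()
    return sorted(step2)                         # consuming the last iterator drives the chain

def push_based(start):
    """Message passing: each traverser is pushed to the vertex it references."""
    # A message is (vertex the traverser is at, index of the step to run next).
    inbox = deque((v, 0) for v in start)
    halted = []
    while inbox:
        vertex, step = inbox.popleft()
        if step == 2:                            # no steps left: the traverser halts
            halted.append(vertex)
            continue
        for w in ADJ[vertex]:                    # run out() and push the children onward
            inbox.append((w, step + 1))
    return sorted(halted)

pull_result = pull_based([1, 2, 3])
push_result = push_based([1, 2, 3])
print(pull_result, push_result)
```

Both functions return the same vertices, which is the sense in which the two execution models are interchangeable from the user's point of view.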

While both algorithms are semantically equivalent, the first is&nbsp;pull-based&nbsp;and the second is&nbsp;push-based. Apache TinkerPop™'s Gremlin traversal machine supports both modes of execution and thus, is able to work against both OLTP&nbsp;graph databases&nbsp;and OLAP&nbsp;graph processors. Selecting which algorithm is used is a function of defining a&nbsp;<code>TraversalSource</code>&nbsp;that will be used for subsequent traversals.

<table border="0" cellpadding="0" cellspacing="0">
	<tbody>
		<tr>
			<td>
			<code>g = graph.traversal()&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </code><code>// OLTP</code>

			<code>g = graph.traversal().withComputer() </code><code>// OLAP</code>
			</td>
		</tr>
	</tbody>
</table>

This article is specifically about Gremlin OLAP and its&nbsp;<a href="https://en.wikipedia.org/wiki/Message_passing">message passing</a>&nbsp;algorithm. The following sections will discuss different aspects of this algorithm in order to help elucidate the mechanics of Gremlin OLAP.

<hr />
<h2>Vertex-Centric Computing</h2>

<img alt="olap master worker" data-entity-type="file" data-entity-uuid="224bab97-6042-4e65-a931-7dcf63d66ac0" src="https://www.datastax.com/sites/default/files/inline-images/olap-master-workers.png" />

Every TinkerPop-enabled OLAP graph processor implements the&nbsp;<code>GraphComputer</code>&nbsp;interface. A&nbsp;<code>GraphComputer</code>&nbsp;is able to evaluate a&nbsp;<code>VertexProgram</code>. A vertex program can be understood as a "chunk of code" that is evaluated at each vertex in a (logically) parallel manner. In this way, the computation happens from the "perspective" of the vertices and thus, the name&nbsp;vertex-centric computing. Another term for this distributed computing model is&nbsp;<a href="https://en.wikipedia.org/wiki/Bulk_synchronous_parallel">bulk synchronous parallel</a>. The vertex program's chunk does three things in a&nbsp;<code>while(!terminated)</code>-loop:

<ol>
	<li>It reads messages sent to its vertex.</li>
	<li>It alters its vertex's state in some way.</li>
	<li>It sends messages to other vertices (adjacent or otherwise).</li>
</ol>
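The read/alter/send loop can be sketched in a few lines of Python. The example below is not <code>TraversalVertexProgram</code>. It is a generic, hypothetical vertex program that labels every vertex with the smallest vertex id it can reach, chosen only because it exercises all three phases. The toy graph and the name <code>run_vertex_program</code> are assumptions, and the loop ends once no messages are in flight.

```python
# Hypothetical toy graph: vertex id -> ids of adjacent vertices (undirected).
ADJ = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}

def run_vertex_program(adj):
    """Label each vertex with the smallest vertex id reachable from it."""
    state = {v: v for v in adj}                      # per-vertex state
    inbox = {v: [] for v in adj}
    # Initial iteration: every vertex announces its own id to its neighbors.
    for v in adj:
        for w in adj[v]:
            inbox[w].append(state[v])
    while any(inbox.values()):                       # stop once no messages are in flight
        outbox = {v: [] for v in adj}
        for v in adj:                                # logically parallel across vertices
            messages = inbox[v]                      # 1. read messages sent to this vertex
            if messages and min(messages) < state[v]:
                state[v] = min(messages)             # 2. alter this vertex's state
                for w in adj[v]:
                    outbox[w].append(state[v])       # 3. send messages to other vertices
        inbox = outbox                               # messages are delivered in the next iteration
    return state

components = run_vertex_program(ADJ)
print(components)
```

Messages written during one iteration are only read in the next one, which is the synchronization point that gives bulk synchronous parallel its name.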

The vertex program typically terminates when there are no more messages being sent.&nbsp;<code>TraversalVertexProgram</code>&nbsp;is a particular vertex program distributed with&nbsp;<a href="http://tinkerpop.apache.org/">Apache TinkerPop™</a>&nbsp;that knows how to evaluate a Gremlin traversal using message passing. The chunk of code, which contains a&nbsp;<code>Traversal.clone()</code>&nbsp;(a&nbsp;worker traversal), is (logically) distributed to each vertex. A vertex receives&nbsp;traverser messages&nbsp;that reference a step in the traversal clone. That step is evaluated. If the result is a traverser that does not reference data at the local vertex, then the traverser is messaged away to where that data is. This process continues until no more traverser messages exist in the computation. Besides the distributed worker traversals, there also exists a single&nbsp;master traversal&nbsp;that serves as the coordinator of the computation, determining when the computation is complete and handling global barriers that synchronize the workers at particular steps in the traversal.

<hr />
<h2>Worker Graph Partitions</h2>

The graph data structure ingested by any OLAP&nbsp;<code>GraphComputer</code>&nbsp;is an&nbsp;<a href="https://en.wikipedia.org/wiki/Adjacency_list">adjacency list</a>. Each entry in this list represents a vertex, its properties, and its incident edges. In TinkerPop, a single vertex entry is known as a&nbsp;<code>StarVertex</code>. Thus, the adjacency list read by a graph processor can be abstractly defined as&nbsp;<code>List&lt;StarVertex&gt;</code>. Typically, a graph processor supports parallel execution whether parallelization is accomplished via threads in a machine, machines in a cluster, or threads in machines in a cluster. Each parallel worker processes a subgraph of the entire graph called a&nbsp;graph partition. The partitions of the graph's adjacency list are abstractly defined as&nbsp;<code>List&lt;List&lt;StarVertex&gt;&gt;</code>. If the list is&nbsp;<code>partitions</code>&nbsp;and there are&nbsp;n-workers, then&nbsp;<code>partitions.size() == n</code>&nbsp;and worker&nbsp;<code>i</code>&nbsp;is responsible for processing&nbsp;<code>partitions.get(i)</code>.
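A partitioned adjacency list can be pictured with the following Python sketch. The <code>StarVertex</code> dataclass here is a simplified stand-in for the TinkerPop class of the same name, and the round-robin assignment in <code>partition</code> is an assumption. Real graph processors choose their own partitioning scheme.

```python
from dataclasses import dataclass, field

@dataclass
class StarVertex:
    """Simplified stand-in: a vertex together with its outgoing incident edges."""
    id: int
    label: str
    out_edges: list = field(default_factory=list)   # (edge label, target vertex id)

def partition(adjacency_list, n):
    """Split the adjacency list into n partitions (round-robin, as an assumption)."""
    partitions = [[] for _ in range(n)]
    for index, star in enumerate(adjacency_list):
        partitions[index % n].append(star)
    return partitions

adjacency_list = [
    StarVertex(1, "person", [("knows", 2), ("knows", 3)]),
    StarVertex(2, "person"),
    StarVertex(3, "animal"),
    StarVertex(4, "robot", [("knows", 1)]),
    StarVertex(5, "person"),
]

partitions = partition(adjacency_list, 2)
for i, part in enumerate(partitions):
    # Worker i is responsible for partitions[i].
    print(f"worker {i} processes vertices {[star.id for star in part]}")
```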

<img alt="adjacency list" data-entity-type="file" data-entity-uuid="4b3f96f1-0f84-4865-b56a-275367aa0872" src="https://www.datastax.com/sites/default/files/inline-images/adjacency-list.png" />

What does worker&nbsp;<code>i</code>&nbsp;do with its particular&nbsp;<code>List&lt;StarVertex&gt;</code>-partition? The worker will iterate through the list and for each&nbsp;<code>StarVertex</code>&nbsp;it will process any messages associated with that vertex. For&nbsp;<code>TraversalVertexProgram</code>, the messages are simply&nbsp;<code>Traverser</code>s. If the traverser's current graph (data) location is&nbsp;<code>v[1]</code>, then it will&nbsp;attach&nbsp;itself to&nbsp;<code>v[1]</code>&nbsp;and then evaluate its current step (instruction) location in the worker's&nbsp;<code>Traversal.clone()</code>. That step will yield output traversers according to its form: one-to-many, many-to-one, one-to-one, etc. If the output traversers reference objects at&nbsp;<code>v[1]</code>, then they will continue to execute. For instance,&nbsp;<code>outE()</code>&nbsp;will put a traverser at every outgoing incident edge of&nbsp;<code>v[1]</code>, where these incident edges are contained in the&nbsp;<code>StarVertex</code>&nbsp;data structure. Moreover,&nbsp;<code>values("name")</code>&nbsp;will put a traverser at the&nbsp;<code>String</code>&nbsp;name of&nbsp;<code>v[1]</code>. There are three situations that do not allow the traverser to continue its processing at the current&nbsp;<code>StarVertex</code>.

<ol>
	<li>The traverser no longer references a step in&nbsp;<code>Traversal</code>, at which point it&nbsp;halts. This means the traverser has completed its journey through the graph and traversal. It is stored in a special vertex property called&nbsp;<code>HALTED_TRAVERSERS</code>, which contains all the traversers that have halted at the respective&nbsp;<code>StarVertex</code>. Each halted traverser is one element of the final result.</li>
	<li>The traverser no longer references an object at the local&nbsp;<code>StarVertex</code>&nbsp;and thus, must turn itself into a message and transport itself to the&nbsp;<code>StarVertex</code>&nbsp;that it does reference. The traverser&nbsp;detaches&nbsp;itself and serializes itself across the network (or is stored locally if the&nbsp;<code>StarVertex</code>&nbsp;in question is accessible at the current worker's partition).</li>
	<li>The traverser no longer references any data object and thus, is considered dead and is removed from the computation. This occurs when the previous graph location of the traverser is deemed not acceptable by the traversal.</li>
</ol>

This message passing process continues for all workers until all traversers are either destroyed or halted. The final answer to the traversal query is the aggregation of all the graph locations of the halted traversers distributed across the&nbsp;<code>HALTED_TRAVERSERS</code>&nbsp;of the vertices.
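The three outcomes (halt, message, die) can be traced with a small Python simulation of a traversal along the lines of <code>g.V().hasLabel("person").out("knows").values("name")</code>. Everything in it is a simplification made for illustration: the <code>STARS</code> data, the tuple encoding of a traverser as (vertex, next step, value), and the <code>halted</code> dictionary standing in for the <code>HALTED_TRAVERSERS</code> property. It does not reflect how <code>TraversalVertexProgram</code> is actually written.

```python
# Hypothetical data: vertex id -> (label, name, ids of "knows"-adjacent vertices).
STARS = {
    1: ("person", "marko", [2, 3]),
    2: ("person", "vadas", []),
    3: ("animal", "rex", []),
}
PARTITIONS = [[1, 3], [2]]      # worker i owns the vertices listed in PARTITIONS[i]
NUM_STEPS = 3                   # hasLabel("person"), out("knows"), values("name")

def evaluate(vertex, step, value):
    """Run one step at a vertex and return the output traversers."""
    label, name, knows = STARS[vertex]
    if step == 0:                                    # hasLabel("person"): filter
        return [(vertex, 1, value)] if label == "person" else []
    if step == 1:                                    # out("knows"): one-to-many
        return [(w, 2, w) for w in knows]
    return [(vertex, 3, name)]                       # values("name"): one-to-one

def run():
    inbox = {v: [(v, 0, v)] for v in STARS}          # g.V(): one traverser per vertex
    halted = {v: [] for v in STARS}                  # stand-in for HALTED_TRAVERSERS
    dead = 0
    while any(inbox.values()):
        outbox = {v: [] for v in STARS}
        for worker in PARTITIONS:
            for vertex in worker:
                local = list(inbox[vertex])
                while local:
                    at, step, value = local.pop()
                    if step == NUM_STEPS:            # case 1: no step left, so halt
                        halted[vertex].append(value)
                        continue
                    outputs = evaluate(at, step, value)
                    if not outputs:                  # case 3: filtered out or dead end
                        dead += 1
                    for out in outputs:
                        if out[0] == vertex:         # still local: keep executing here
                            local.append(out)
                        else:                        # case 2: message the referenced vertex
                            outbox[out[0]].append(out)
        inbox = outbox
    return halted, dead

halted, dead = run()
answer = sorted(name for names in halted.values() for name in names)
print(answer, dead)
```

The final answer is gathered from the halted traversers spread across the vertices. The two dead traversers (one filtered at the animal vertex, one stranded at a person who knows no one) never appear in it.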

<hr />
<h2>Barrier Synchronization</h2>

<img alt="barrier" data-entity-type="file" data-entity-uuid="fc0cb2f2-46be-45f8-9ea2-572dc61bc159" src="https://www.datastax.com/sites/default/files/inline-images/barrier.png" />

There are some steps whose computation cannot be evaluated in parallel and require an aggregation at the master traversal. Such steps implement an interface called <code>Barrier</code> and include&nbsp;<code>count()</code>,&nbsp;<code>max()</code>,&nbsp;<code>min()</code>,&nbsp;<code>sum()</code>,&nbsp;<code>fold()</code>,&nbsp;<code>groupCount()</code>,&nbsp;<code>group()</code>, etc. Barrier steps are handled in a special way by the&nbsp;<code>TraversalVertexProgram</code>. When a traverser enters a barrier step at a worker traversal, it does not come out the other side. Instead,&nbsp;<code>Barrier.nextBarrier()</code>&nbsp;is used to grab all the traversers that were barriered at the current worker and then they are sent to the master traversal for aggregation along with other sibling worker barriers of the analogous step. For&nbsp;<code>ReducingBarrierStep</code>s, distributed processing occurs to yield a barrier that is not the aggregate of all traversers, but instead, an aggregate of their reduced associative/commutative form. For instance,&nbsp;<code>CountStep.nextBarrier()</code>&nbsp;produces a single&nbsp;<code>Long</code>&nbsp;number traverser. The master traversal's representation of the barrier step aggregates all the distributed barriers via&nbsp;<code>Barrier.addBarrier()</code>. Then that master barrier step, like any other step, is&nbsp;<code>next()</code>'d to generate the single traverser from the many. If that single traverser references a graph object, it is messaged to the respective&nbsp;<code>StarVertex</code>&nbsp;for further processing by a worker traversal.
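The worker-to-master flow for a reducing barrier such as <code>count()</code> can be sketched as follows. The <code>CountBarrier</code> class is a hypothetical Python stand-in. Its <code>next_barrier</code> and <code>add_barrier</code> methods are named after <code>Barrier.nextBarrier()</code> and <code>Barrier.addBarrier()</code>, but they are not the TinkerPop signatures. The traverser lists per worker are made-up data.

```python
class CountBarrier:
    """Toy stand-in for a reducing barrier step such as count()."""

    def __init__(self):
        self.total = 0

    def add_traverser(self, traverser):
        # A traverser that enters the barrier does not come out the other side.
        self.total += 1

    def next_barrier(self):
        # Hand over the reduced form (a partial count) and reset the local state.
        partial, self.total = self.total, 0
        return partial

    def add_barrier(self, partial):
        # Merge a worker's partial count; addition is associative and commutative,
        # so the order in which workers report does not matter.
        self.total += partial

    def next(self):
        # Emit the single result that represents the many barriered traversers.
        return self.total

# Hypothetical traversers that reached count() at each of three workers.
worker_traversers = [["a", "b", "c"], ["d"], []]

master = CountBarrier()
for traversers in worker_traversers:
    worker = CountBarrier()
    for traverser in traversers:
        worker.add_traverser(traverser)
    master.add_barrier(worker.next_barrier())    # only one number leaves each worker

count = master.next()
print(count)
```

Shipping a partial count instead of the traversers themselves is what keeps the aggregation at the master cheap.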

<img alt="message pass" data-entity-type="file" data-entity-uuid="5c0b873f-ee09-4e74-af69-14b6c1965c82" src="https://www.datastax.com/sites/default/files/inline-images/message-pass.png" />

An OLAP traversal undulates from a distributed execution across worker traversal instances, to a local execution at the master traversal, back to a distributed execution across workers, and so on until all traversers have halted and the computation is complete. Note that there are other interesting barrier concepts such as <code>LocalBarrier</code> that can be studied by the interested reader in Apache TinkerPop™'s documentation.

<hr />
<h2>The Future of Gremlin OLAP</h2>

As of Apache TinkerPop™ 3.2.0, Gremlin OLAP's&nbsp;<code>GraphComputer</code>&nbsp;assumes that the input data is organized as an adjacency list (i.e.&nbsp;<code>List&lt;StarVertex&gt;</code>). Moreover, it assumes that each worker processes a subset of that list and that when a traverser leaves the current&nbsp;<code>StarVertex</code>, it must send itself to the respective remote&nbsp;<code>StarVertex</code>&nbsp;that it does reference. These two assumptions can be lifted in order to support&nbsp;<code>GraphComputer</code>&nbsp;implementations that may be more efficient (and/or expressive) for certain types of graphs and traversals.

<ol>
	<li>Subgraph-Centric Computing: If a single worker can hold its entire&nbsp;<code>List&lt;StarVertex&gt;</code>&nbsp;partition in memory, then when a traverser leaves the current&nbsp;<code>StarVertex</code>, it may still be able to execute deeper within the local partition's in-memory subgraph representation. Only when a traverser leaves a partition's subgraph would a message pass be required. This would significantly increase the speed of OLAP at the expense of requiring subgraphs to fit into memory. This model would also benefit greatly from a good partitioning strategy that ensures that worker subgraphs have more intra-partition edges than inter-partition edges.</li>
	<li>Edge-Centric Computing: A single&nbsp;<code>StarVertex</code>&nbsp;may contain a significant amount of data, especially as the graph grows. For example, famous people on Twitter can have on the order of 10 million+ incoming follows-edges. In order to reduce the memory requirements of the OLAP processor as well as to better load balance a computation across machines, an edge-centric model can be used where the OLAP ingested graph is an edge-list abstractly defined as&nbsp;<code>List&lt;Edge&gt;</code>.</li>
</ol>
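To make the partitioning point for subgraph-centric computing concrete, the Python sketch below counts the edges that cross a partition boundary, i.e. the hops that would still need a message pass if each worker held its whole partition in memory. The edge list, the two vertex-to-worker assignments, and the function name are all hypothetical.

```python
# Hypothetical directed edges: two triangles joined by a single bridge edge (3 -> 4).
EDGES = [(1, 2), (2, 3), (3, 1), (4, 5), (5, 6), (6, 4), (3, 4)]

def inter_partition_edges(edges, owner):
    """Count edges whose endpoints live on different workers."""
    return sum(1 for source, target in edges if owner[source] != owner[target])

# owner maps vertex id -> worker index.
clustered = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1}     # each triangle kept on one worker
alternating = {1: 0, 2: 1, 3: 0, 4: 1, 5: 0, 6: 1}   # vertices dealt out alternately

clustered_cut = inter_partition_edges(EDGES, clustered)
alternating_cut = inter_partition_edges(EDGES, alternating)
print(clustered_cut, alternating_cut)
```

With the clustered assignment only the bridge edge requires a message, while the alternating assignment forces most hops across workers.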

Both these models may one day be introduced into the current&nbsp;<code>GraphComputer</code>&nbsp;model. If so,&nbsp;<a href="http://tinkerpop.apache.org/">Apache TinkerPop™</a>&nbsp;would support vertex-centric, subgraph-centric, and edge-centric computing spanning the gamut of useful distributed graph computing models. Fortunately, the user would be blind to the underlying execution algorithm. Behind the scenes, a traversal would infer its space/time-requirements and ask the&nbsp;<code>GraphComputer</code>&nbsp;to use a particular representation best suited for its evaluation.

<hr />
<h2>Conclusion</h2>

The&nbsp;<code>TraversalVertexProgram</code>&nbsp;that drives the evaluation of a distributed&nbsp;<code>Traversal</code>&nbsp;is simple, containing only a few hundred lines of code. The complexity of the computation resides in both the vendor's&nbsp;<code>GraphComputer</code>&nbsp;implementation and Apache TinkerPop™'s&nbsp;<code>Traversal</code>&nbsp;implementation.&nbsp;<code>TraversalVertexProgram</code>&nbsp;merely stands between these two constructs routing traversers amongst worker partitions in order to effect a distributed, OLAP-based evaluation of a Gremlin traversal over a TinkerPop-enabled graph processor.

The Mechanics of Gremlin OLAP
