ArangoDB v2.8 reached End of Life (EOL) and is no longer supported.
This documentation is outdated. Please see the most recent version here: Try latest
Filtering nodes in a graph using AQL
Problem
I want to run a query on a graph using AQL, but only want to include certain vertices or follow specific edges of the graph.
Solution
Use a traversal with a custom filter function. A custom filter function is user-defined function that is called for every node that is encountered during a traversal. The purpose of the filter function is to control if a given vertex is going to be visited (i.e. if the traversal’s visitor function should be called for it) and if its connections/edges should be followed. A filter function can be used to skip non-interesting vertices early, providing a good way to make traversals more efficient or reduce their result size.
Example data setup
We first need some data that we can filter from. I suggest using the example graph and data
from here.
Please download the code from the link and store it in the filesystem using a filename
of world-graph-setup.js
. Then start the ArangoShell and run the code from the file:
require("internal").load("/path/to/file/world-graph-setup.js");
The data in the example graph connects capitals to countries, countries to continents, and all continents to a single root node. The structure looks like this:
root <--[is in]-- continent <--[is in]-- country <--[is in]-- capital
Registering a custom filter function
User-defined filter functions are written in JavaScript. Once a function is written, it needs to be registered on the server to become usable from inside an AQL query.
The following snippet, to be executed in the ArangoShell, registers a custom filter function
named myfunctions::europeFilter
:
var aqlfunctions = require("org/arangodb/aql/functions");
aqlfunctions.register("myfunctions::europeFilter", function (config, vertex, path) {
if (vertex.type === "country") {
return "prune";
}
if (vertex.type === "continent") {
if (vertex.name !== "Europe") {
return [ "prune", "exclude" ];
}
}
return "exclude";
});
The filter function inspects the type
attribute of a vertex, and then decides what to
do with it:
-
it will not follow any connections of vertices with type
country
. It does so by returning"prune"
-
it will not follow any connections of vertices with type
continent
, except if the vertex’name
attribute contains the string"Europe"
. For such vertices, it will return the array[ "prune", "exclude" ]
, which tells the traversal to not follow any further connections of the vertex, and also to not visit the vertex. -
for everything else, including the vertex with type
continent
and nameEurope
, it will return"exclude"
, which means to follow the vertex’ connections but not call the visitor function for it.
Invoking the filter function from AQL
Let’s run a query now using the custom filter! I suggest running this query from the AQL editor in the web interface:
LET params = {
filterVertices : "myfunctions::europeFilter"
}
FOR result IN TRAVERSAL(v, e, "v/world", "inbound", params)
RETURN {
name: result.vertex.name,
type: result.vertex.type
}
As we can see, only vertices of type country
have been returned. This is because the filter
returned "exclude"
for everything but vertices of type "country"
. "exclude"
means that the
visitor will not be called for the vertex. As the default functionality of a visitor is to put the
visited vertex into the result, it is now clear why the result does only contain nodes of type
"country"
.
However, the result only contains European countries. This is because in the filter pruned all
connections for continents other than with name Europe
. As a consequence, when visiting vertices
of type continent
, the traversal only continent with the one continent node named Europe
, but
did not descend into the others.
Further uses
The above example is using the AQL function TRAVERSAL()
, which requires a vertex and
an edge collection to be specified. Custom visitors can also be used in the
GRAPH_TRAVERSAL()
AQL function, which does not require named collections but a named
graph. To make the above query work on a graph WorldGraph
and use GRAPH_TRAVERSAL()
instead of TRAVERSAL()
, change the query to:
LET params = {
filterVertices : "myfunctions::europeFilter"
}
FOR result IN GRAPH_TRAVERSAL("WorldGraph", "v/world", "inbound", params)
RETURN {
name: result.vertex.name,
type: result.vertex.type
}
Custom filters can be combined with custom visitor functions for maximum flexibility.
Author: Jan Steemann
Tags: #graph #traversal #aql