 # Visualizing Community Detection in Graphs¶

The following explores how Toyplot’s graph visualization could be used to support development of a hypothetical new community detection algorithm:

## Graph Data¶

First, we will load some sample graph data provided by Toyplot in three parts. The first part is an $$E \times 2$$ matrix containing the source and target vertices for each of $$E$$ graph edges:

:

import toyplot.data

edges, truth, assigned = toyplot.data.communities()
edges

:

array([[ 1,  2],
[ 1,  3],
[ 1,  4],
[ 2,  3],
[ 2,  4],
[ 3,  4],
[ 5,  6],
[ 5,  7],
[ 5,  8],
[ 6,  7],
[ 6,  8],
[ 7,  8],
[ 9, 10],
[ 9, 11],
[ 9, 12],
[10, 11],
[10, 12],
[11, 12],
[ 4,  5],
[ 8,  9]])


Conveniently, Toyplot can display this matrix directly as a graph using a force-directed layout, and inducing the vertex labels from the edge data:

:

toyplot.graph(edges);


## Ground Truth Communities¶

The second part of the data we loaded is a set of ground truth community assignments in the form of a $$V \times 2$$ matrix containing the vertex ID and community ID for each of $$V$$ vertices (note that in this case the assignments are already sorted in ID order, if they hadn’t been, we’d need to sort them explicitly):

:

truth

:

array([[ 1,  1],
[ 2,  1],
[ 3,  1],
[ 4,  1],
[ 5,  2],
[ 6,  2],
[ 7,  2],
[ 8,  2],
[ 9,  3],
[10,  3],
[11,  3],
[12,  3]])


We want to color each graph vertex by its community ID, so we create an appropriate categorical colormap:

:

colormap = toyplot.color.brewer.map("Set2")
colormap

:


Now we can rerender the graph, overriding the default edge color, vertex size, vertex label style, and vertex color:

:

ecolor = "lightgray"
vlstyle = {"fill":"white"}
vcolor = truth[:,1]
toyplot.graph(edges, ecolor=ecolor, vsize=20, vlstyle=vlstyle, vcolor=(vcolor, colormap));


## Assigned Communities¶

Next, we render the graph again using the third part of the loaded data: a different set of community ID values, generated using a hypothetical community detection algorithm that we are working on:

:

assigned

:

array([[ 1,  4],
[ 2,  1],
[ 3,  1],
[ 4,  1],
[ 5,  2],
[ 6,  2],
[ 7,  2],
[ 8,  2],
[ 9,  2],
[10,  2],
[11,  2],
[12,  2]])

:

vcolor = assigned[:,1]
toyplot.graph(edges, ecolor=ecolor, vsize=20, vlstyle=vlstyle, vcolor=(vcolor, colormap));


We want to compare the ground truth communities with the communities assigned by our algorithm, but there are two challenges:

• Flipping back-and-forth between the two visualizations and trying to understand what changed is extremely difficult.
• Even if the algorithm we’re testing arrives at identical community memberships across the graph, it is unlikely that it would happen to assign the same ID for each community … so even in the best case, the vertex colors won’t match the ground truth.

What we need is a way to highlight the differences between the two sets of communities, so we only have to look at a single visualization. The following is suggested by Jon Berry at Sandia National Laboratories:

## Highlighting Edge Types¶

We will render the graph using the vertex colors assigned by the algorithm, but with the edges colored using a set of rules that highlight the differences between the ground truth and assigned communities:

• Edges that are intra-community in both ground truth and assigned communities remain light gray.
• Edges that are inter-community in both ground truth and assigned communities remain light gray.
• Edges that are intra-community in the ground truth but inter-community for the assigned communities are displayed red (i.e. the assignment split up a ground truth community).
• Edges that are inter-community in the ground truth but intra-community for the assigned communities are displayed green (i.e. the assignment merged two ground-truth communities).
:

truth_map = dict(truth.tolist())
assigned_map = dict(assigned.tolist())

truth_source = [truth_map[vertex] for vertex in edges[:,0]]
truth_target = [truth_map[vertex] for vertex in edges[:,1]]

assigned_source = [assigned_map[vertex] for vertex in edges[:,0]]
assigned_target = [assigned_map[vertex] for vertex in edges[:,1]]

:

import numpy

# Default all edges to light gray.
ecolor = numpy.repeat("lightgray", len(edges))

selection = numpy.equal(truth_source, truth_target) & numpy.not_equal(assigned_source, assigned_target)
ecolor[selection] = "red"

selection = numpy.not_equal(truth_source, truth_target) & numpy.equal(assigned_source, assigned_target)
ecolor[selection] = "green"

canvas, axes, mark = toyplot.graph(edges, ecolor=ecolor, vsize=20, vlstyle=vlstyle, vcolor=(vcolor, colormap));


Now, we have a single visualization that emphasizes the differences between the two sets of community assignments, and we can clearly see the behavior of the tested algorithm: the green edge connecting vertices 8 and 9 indicate that two ground truth communities were merged together into one, while the red edges adjacent to vertex 1 show that it was “split” from its neighbors into its own community.

Now that that’s taken care of, we can create a publication-quality PDF version of the plot for incorporation into a paper:

:

import toyplot.pdf
toyplot.pdf.render(canvas, "communities.pdf")