In January of this year, I worked out a simple algorithm to tackle the following problem:

Given an undirected graphG= (V,E), find all simple paths having terminal verticesvand_{s}vfor all_{t}v∈_{s}V.

Some time has elapsed, but I was finally able to put code to my paperbound scrawl. Fortunately, this problem is easily solved (though my own approach is almost certain to be far from optimal), but before proceeding to the subject of implementation, allow me to first state the problem more plainly and explain its significance.

**The Problem**

The problem, as stated, is nearly impenetrable without some basic terminology and some familiarity with a graph's common form, but we may eschew this definition in favor of one less formal: given a graph containing some number of nodes and edges, find all the paths one may take from some starting node to some target node that do not intersect themselves. Imagine several walking paths, of which a few loop back on themselves or come to dead ends; your job would be to find those that don't, and what remains may twist and wind to any extent yet permitted. The resulting paths are considered "simple," meaning they do not intersect themselves or double-back. That none of these paths fall short of the intended target is not a product of their "simplicity" but a stipulation of the stated problem.

You may already have a firm grasp of the problem at hand, but permit me the addition of a slightly confounding factor: the solution must be generalized for use in a hypergraph. As it turns out, this added requirement is hardly confounding at all, but, before I explain why this is so, I should first make clear the distinction between ordinary graphs and hypergraphs.

An ordinary graph contains a set of edges,

*E*, in which each edge is an unordered pair containing two vertices (nodes). That is, in an ordinary graph, every edge consists of two, and only two, nodes. However, in a hypergraph, (hyper)edges are permitted to connect more than two nodes, making for a more generalized structure. For our purposes, that's all there is to know.

And how does this change the problem? Whereas, before, one edge traversed equated to one node traversed, it now becomes possible to traverse an edge without having traversed all the nodes to which that edge is incident. To adapt, we need only add a single requirement: no edge may be traversed more than once. An essential behavior, once taken for granted, must now be scrupulously accounted for, but this is easily accomplished.

I hope that, by now, the problem is clear, though its importance may yet seem obscure. It is to the significance of this problem and its solution that I turn next.

**Applications**

Consider a mid-size Metro Area Network without a discernible core. Though only one or two routers may provide Internet access, there exist a sufficient number of decentralized links to produce myriad failure scenarios. How many failure scenarios? Take the number of unique paths that exist between a given metro router and the Internet gateway, then subtract one.

To determine how the routing will change for each failure scenario, we need only order the paths by the sum of their link metrics; outages along paths less preferred will not affect routing for the router of interest until all of its more preferred paths are compromised.

And what of automation? Given, instead, a preferred order of path failure, an algorithm for assigning and maintaining link metrics may be devised, thus abstracting the underlying mechanisms to a few commands of broader consequence. Inevitably, this line of reasoning leads to the idea of more intelligent networks, capable, to a limited degree, of running themselves. If this seems far fetched, consider that MPLS-TE already allows for limited decision making through the use of RSVP and IP SLA, though none of this occurs at the level of IGP. The implications really are fascinating, and it all begins with the creation of a flexible model.

**A Solution**

First, I'll explain the algorithm, as designed. On second thought, this may be too strong a word: my strategy, when creating an algorithm, is to bumble through until I achieve the result desired; I've received no formal education of any sort as would permit me to "design" much of anything. I can, however, illustrate the required operations, provide some code, and hope for the best.

Let us introduce a graph

*G*, shaping edges in such a way as to deliberately test our algorithm's mettle. Our starting node shall be

*v*= 'F', our target node,

_{s}*v*= 'E'.

_{t}A hypergraph of some peril for our little algorithm |

Step 1: Add the starting node F to a list of visited nodes. Find all edges incident to the starting node.

There is but a single edge incident to node F |

Step 2: Add the first edge found to a list of visited edges. Find all other nodes incident to the first edge.

F shares an edge with both A and B |

Continuing from A, two more possible paths are found |

Finding one of several unique paths |

**A Mess**

I now present to you my dirty laundry:

Click to expand code

package Hypergraph::Graph; use lib "/Users/jstorm/lib/perl"; use Hypergraph::Edge; use Misc; use strict; use warnings; sub new { my ( $class, $a ) = @_; my $self = { _nodes => {}, _edges => [] }; if ( ref($a) eq 'HASH' ) { $self = {%$self, %$a}; } bless $self, $class; return $self; } sub nodes { my $self = shift; return $self->{_nodes}; } sub addNode { my ( $self, $n, $a ) = @_; if ( $self->{_nodes}->{$n} ) { return; } if ( ref($a) eq 'HASH' ) { $self->{_nodes}->{$n} = {_edges=>[], $a}; } else { $self->{_nodes}->{$n} = {_edges=>[]}; } return $n; } sub edges { my $self = shift; return $self->{_edges}; } sub addEdge { my ( $self, $a ) = @_; my $e = Hypergraph::Edge->new($self, $a); if ( $e ) { push(@{$self->edges()}, $e); } return $e; } sub incidentEdges { my $self = shift; my $n = shift || return; return $self->nodes()->{$n}->{_edges}; } 1;

Click to expand code

package Hypergraph::Edge; sub new { my ( $class, $g, $a ) = @_; my $self = { graph => $g, _nodes => {} }; if ( ref($a) eq 'HASH' ) { $self = { %$self, %$a } } bless $self, $class; return $self; } sub nodes { my ( $self ) = @_; my @nodes = keys %{$self->{_nodes}}; return \@nodes; } sub addNode { my ( $self, $n, $a ) = @_; my $graph = $self->{graph}; if ( $self->{_nodes}->{$n} ) { return; } if ( ! $graph->nodes()->{$n} ) { $graph->addNode($n); } if ( ref($a) eq 'HASH' ) { $self->{_nodes}->{$n} = $a; } else { $self->{_nodes}->{$n} = {}; } push(@{$graph->nodes()->{$n}->{_edges}}, $self); return $n; } 1;

You might consider the classes, as presented, adequate, and that is what they shall remain until I have time enough to finish them. They are used, thusly:

Click to expand code

sub getGraph { # V <=> E # my $graph = Hypergraph::Graph->new(); my $e1 = $graph->addEdge( {id=>1} ); $e1->addNode('A'); $e1->addNode('F'); $e1->addNode('B'); my $e2 = $graph->addEdge( {id=>2} ); $e2->addNode('A'); $e2->addNode('C'); my $e3 = $graph->addEdge( {id=>3} ); $e3->addNode('B'); $e3->addNode('C'); my $e4 = $graph->addEdge( {id=>4} ); $e4->addNode('C'); $e4->addNode('G'); my $e5 = $graph->addEdge( {id=>5} ); $e5->addNode('B'); $e5->addNode('D'); my $e6 = $graph->addEdge( {id=>6} ); $e6->addNode('C'); $e6->addNode('E'); my $e7 = $graph->addEdge( {id=>7} ); $e7->addNode('D'); $e7->addNode('E'); my $e8 = $graph->addEdge( {id=>8} ); $e8->addNode('A'); $e8->addNode('B'); return $graph; }

Note that the "id" references are included only to illustrate how one might apply an attribute to an edge; the same could have been done for the graph object. And, finally, the algorithm itself:

Click to expand code

sub getPaths { my $g = shift || return; # Graph my $v_s = shift || return; # Starting node my $v_t = shift || return; # Target node my $_p_path = shift || [$v_s]; # Nodes traversed my $_e_path = shift || []; # Edges traversed # Edges incident to v_s that we haven't seen before my $cedges = Misc::relComp($g->incidentEdges($v_s), $_e_path); my $paths = []; # Paths found for my $e (@$cedges) { my $an = Misc::relComp($e->nodes(), $_p_path); foreach (@$an) { if ( $_ eq $v_t ) { $paths = [@$paths, [ @$_p_path, $_ ]]; next; } my $tmp = getPaths($g, $_, $v_t, [@$_p_path, $_], [@$_e_path, $e]); if ( ! $tmp ) { next; } $paths = [@$paths, @$tmp]; } } return $paths; }

Here, a function

**relComp**is used to weed out visited nodes and edges from unvisited ones. It returns the relative complement of two sets (arrays), removing elements in the second set from the first set. The code is provided for completeness:

Click to expand code

sub relComp { # Returns a reference to an array containing the relative complement # of B in A (B\A or A-B), given array references for A and B my $A = shift || []; my $B = shift || []; my @C = (); for (my $i = 0; $i <= $#$A; $i++) { my $a = $$A[$i]; if ( (grep {$a eq $_} @$B) == 0) { push(@C, $a); } } return \@C; }

Once the above code is in place, we may then use the following to put the whole in motion:

Click to expand code

my $G = getGraph; my $paths = getPaths($G, 'F', 'E'); foreach (@$paths) { print "@$_\n"; }

The output, then, is exactly what we would hope to see:

```
jstorm[0]@absinthe:Hypergraph$ ./enumpaths.pl
F A C B D E
F A C E
F A B C E
F A B D E
F B C E
F B D E
F B A C E
```

**Efficiency**

I have yet to work out the time complexities involved in the current algorithm, but the raw times are encouraging. Right now, given a particular graph containing 19 nodes and 34 edges, it is possible to enumerate all 3972 paths (containing 13 nodes or less) for all (

*v*,

_{s}*v*) pairs in ~2.6 seconds with average hardware. All right, I admit, I'm ecstatic, but I suspect quite a few improvements could be made without sacrificing clarity or generality. The graph against which these tests were run is shown below.

_{t}
Imagine a community creating math lessons, with many people contributing. Usually people only name a few immediate connections between lessons. Your software can look at those, and create learning paths.

ReplyDeleteYour article is a godsend for me. Your information will help me in writing a term paper.

ReplyDelete