There’s still lots of things to explore with regard to Silverlight/ WPF and Web Services, and I certainly have plans to do that, but I’ve got another subject to write about first.
Lately I’ve been reading up on another interesting subject: the Shortest Path Problem. This problem can be stated as follows: given a network of nodes with a number of connecting routes between them, how can one figure out the shortest path between two randomly chosen nodes?
I tried to obtain a program in C# that shows the solution as clearly as possible, but doesn’t need to be rewritten for every different set of objects for which one wants to calculate the shortest path. I'm going to discuss that program here. But I’ll try to explain the concepts first, using information I sifted out from the many pages found on the internet which deal with this theme.
Like many other programming problems, this one has already been solved during the early years of computer programming. It was solved by Edsger W. Dijkstra, one of the Dutch pioneers in computer science and without any doubt the greatest among them.
Surprisingly enough the name Dijkstra nowadays is not widely known anymore in the Dutch software development community. This despite the fact that he has a long lasting influence on the way we work now. Not only was he the first one urging for structured programming, but he also invented the semaphore concept, which is still widely used in parallel programming.
I don’t have the illusion that this blog is going to change all this, but I feel that, if a Dutch GIS blog deals with the shortest path problem, Dijkstra should be mentioned prominently.
Dijkstra’s solution was published in 1959. He designed the algorithm three years earlier while drinking a cup of coffee on an Amsterdam terrace.
Here’s how he describes his algorithm:
------------------------------------------------------------------------------------------------
“[…] Find the path of minimum total length between two given nodes P and Q.
[…]
In the course of the solution the nodes are subdivided into three sets:
A. the nodes for which the path of minimum length from P is known; nodes will be added to this set in order of increasing minimum path length from node P;
B. the nodes from which the next node to be added to set A will be selected; this set comprises all those nodes that are connected to at least one node of set A but do not yet belong to A themselves;
C. the remaining nodes.
The branches are also subdivided into three sets:
I. the branches occurring in the minimal paths from node P to the nodes in set A;
II. the branches from which the next branch to be placed in set I will be selected; one and only one branch of this set will lead to each node in set B;
III. the remaining branches (rejected or not yet considered).
To start with, all nodes are in set C and all branches are in set III. We now transfer node P to set A and from then onwards repeatedly perform the following steps.
Step 1. Consider all branches r connecting the node just transferred to set A with nodes R in sets B or C. If node R belongs to set B, we investigate whether the use of branch r gives rise to a shorter path from P to R than the known path that uses the corresponding branch in set II. If this is not so, branch r is rejected; if, however, use of branch r results in a shorter connexion between P and R than hitherto obtained, it replaces the corresponding branch in set II and the latter is rejected. If the node R belongs to set C, it is added to set B and branch r is added to set II.
Step 2. Every node in set B can be connected to node P in only one way if we restrict ourselves to branches from set I and one from set II. In this sense each node in set B has a distance from node P: the node with minimum distance from P is transferred from set B to set A, and the corresponding branch is transferred from set II to set I. We then return to step I and repeat the process until node Q is transferred to set A. Then the solution has been found.”
------------------------------------------------------------------------------------------------
The beauty of this description is that it is very compact and straightforward, but doesn’t have anything to do with an implementation in some computer language. One can imagine Dijkstra sitting on the terrace of a café and drawing the different steps on a napkin.
There’s a video showing someone doing just that. Well, not on a napkin and probably not on a terrace, but certainly very illustrative:
How to find least-cost paths in a graph using Dijkstra's Algorithm.
This video shows two other noteworthy things:
1. The term Shortest Path doesn’t always have to be taken literally. The numbers assigned to an edge (or branch, as Dijkstra calls them) do not necessarily have to be a distance. It can also be a number signifying the cost of traversing via that edge. It should be noted here that Dijkstra’s solution only works if all numbers are non-negative!
2. Direction can be important. Travelling from some node A to some node B can be possible, whereas travelling in the reverse direction can be impossible or only possible with a higher cost.
The steps described on the Wikipedia page about Dijkstra’s solution, thought not language specific, are already formulated in such a way that they can be carried out by a computer program:
------------------------------------------------------------------------------------------------
“Let the node at which we are starting be called the initial node. Let the distance of node Y be the distance from the initial node to Y. Dijkstra's algorithm will assign some initial distance values and will try to improve them step by step.
1. Assign to every node a distance value. Set it to zero for our initial node and to infinity for all other nodes.
2. Mark all nodes as unvisited. Set initial node as current.
3. For current node, consider all its unvisited neighbors and calculate their tentative distance (from the initial node). For example, if current node (A) has distance of 6, and an edge connecting it with another node (B) is 2, the distance to B through A will be 6+2=8. If this distance is less than the previously recorded distance (infinity in the beginning, zero for the initial node), overwrite the distance.
4. When we are done considering all neighbors of the current node, mark it as visited. A visited node will not be checked ever again; its distance recorded now is final and minimal.
5. If all nodes have been visited, finish. Otherwise, set the unvisited node with the smallest distance (from the initial node) as the next "current node" and continue from step 3.”
------------------------------------------------------------------------------------------------
Lots of implementations in many different languages can be found on the internet. Thus, I didn’t bother to write my own version from scratch. That would be re-inventing the wheel. Instead, I searched for a C# example showing just the basics of the algorithm. This turned out to be rather difficult, as many of the implementations add lots of bells and whistles to the basics I wanted to show. But in the end I found a very clear example on the MSDN site, originally written in 2003 by Scott Mitchell and updated by him in 2005 for C# 2.0 and using Generics. It can be found here.
I borrowed the essential parts of his code for a Visual Studio 2010 solution of my own. The reason that I did not simply use his solution was threefold: Firstly, Scott’s solution still contained a few non-Generic collection types I wanted to replace with their Generic counterparts. Secondly, in the sample code the algorithm sticks directly behind the user interface. I wanted to separate stuff a bit more to make it more reusable. And finally the algorithm works with one particular type representation of a node: a string. I wanted the algorithm to be able to work on lots of different types, thus making it still more reusable.
My solution is attached to this post.
When modeling the nodes and edges with their associated costs in a C# program (or a program in any other language for that matter) we can make use of a Graph. As Scott’s application already contained a clear and straightforward implementation of the Graph and a few dependent classes, I copied these over to a C# Class Library named RoutingTypes. The only things I changed here was a bit of cleaning up, some renaming of variables, changing an int to a double, and some extra validation of arguments. The classes were already using Generic types, so I didn’t need to change anything to let the Graph deal with all kinds of types.
The Graph class I’m using works with all kinds of .NET types, as it is defined as : public class Graph<T> : IEnumerable<T>(){…}. It contains a collection of GraphNode<T> objects. The GraphNode (which is defined as public class GraphNode<T> : Node <T>(){…}) has a collection of Node<T> objects, which contain the neighbouring Nodes, and a collection of doubles, which contain the distances to (or the cost associated with) these neighbouring Nodes.
Using these adjacency lists of neighbouring Nodes and the values associated with them is one way of storing Graph information. Another way would be to store all distance/cost values as an n-by-n matrix of doubles. Which of both alternatives is best depends on the type of Graph. In the case of a dense Graph (in which the number of edges is close to the maximal number of edges) a matrix will lead to faster performance, whereas in the case of a sparse Graph (a graph with only a few edges) adjacency lists will be fastest. Adjacency lists consume less memory in the case of sparse Graphs.
But let’s not bother too much about the possible performance issues in some cases of our implementation and turn to the rest of the code. The steps described by Dijkstra are actually implemented in another C# Class Library named Routing. The class Router<T> has a method GetShortestPath(). I’ve tried to map the steps taken there to the steps cited above from the Wikipedia page:
public class Router<T>
{
public RoutingResult<T> GetShortestPath(Graph<T> items, T from, T to)
{
Dictionary<T, double> distances = new Dictionary<T, double>();
Dictionary<T, T> routes = new Dictionary<T, T>();
// 1. Assign to every node a distance value. Set it to zero for our initial
// node and to infinity for all other nodes.
InitialiseDistancesAndRoutes(distances, routes, items);
SetInitialDistanceOfStartToZero(from, distances);
// 2. Mark all nodes as unvisited. Set initial node as current
// (done in the first call of GetNodeWithSmallestDistanceValue).
NodeList<T> nodes = items.Nodes;
while (nodes.Count > 0)
{
GraphNode<T> currentNode = GetNodeWithSmallestDistanceValue(distances, nodes);
// 3. For current node, consider all its unvisited neighbors and calculate
// their tentative distance (from the initial node).
if (currentNode.Neighbours != null)
{
for (int i = 0; i < currentNode.Neighbours.Count; i++)
{
UpdateDistanceAndRoute(distances, routes, currentNode.Value,
currentNode.Neighbours[i].Value, currentNode.Costs[i]);
}
}
// 4. When we are done considering all neighbors of the current node,
// mark it as visited.
nodes.Remove(currentNode);
// 5. If all nodes have been visited, finish. Otherwise, set the unvisited
// node with the smallest distance (from the initial node) as the next
// "current node" and continue from step 3.
}
RoutingResult<T> result = CreateRoutingResult(distances, routes, from, to);
return result;
}
private void InitialiseDistancesAndRoutes(Dictionary<T, double> distances,
Dictionary<T, T> routes, Graph<T> items)
{
foreach (T item in items)
{
distances.Add(item, Double.MaxValue);
routes.Add(item, default(T));
}
}
private void SetInitialDistanceOfStartToZero(T from, Dictionary<T, double> distances)
{
distances[from] = 0;
}
private GraphNode<T> GetNodeWithSmallestDistanceValue(Dictionary<T, double> distances,
NodeList<T> nodes)
{
double minimumDistance = double.MaxValue;
GraphNode<T> minimumDistanceNode = null;
foreach (GraphNode<T> node in nodes)
{
if ((distances[node.Value]) <= minimumDistance)
{
minimumDistance = distances[node.Value];
minimumDistanceNode = node;
}
}
return minimumDistanceNode;
}
private void UpdateDistanceAndRoute(Dictionary<T, double> distances,
Dictionary<T, T> routes, T item, T neighbour, double cost)
{
double distanceToItem = distances[item];
double distanceToNeighbour = distances[neighbour];
if (distanceToNeighbour > distanceToItem + cost)
{
distances[neighbour] = distanceToItem + cost;
routes[neighbour] = item;
}
}
private RoutingResult<T> CreateRoutingResult(Dictionary<T, double> distances,
Dictionary<T, T> routes, T from, T to)
{
RoutingResult<T> result = new RoutingResult<T> { TotalDistance = distances[to] };
Stack<T> traceBackSteps = GetRoutingItemsInReverseOrder(routes, from, to);
result.PathItems = new List<T>();
while (traceBackSteps.Count > 0)
{
result.PathItems.Add(traceBackSteps.Pop());
}
return result;
}
private Stack<T> GetRoutingItemsInReverseOrder(Dictionary<T, T> routeTable,T from, T to)
{
Stack<T> traceBackSteps = new Stack<T>();
T current = to;
traceBackSteps.Push(current);
do
{
current = routeTable[current];
traceBackSteps.Push(current);
} while (!current.Equals(from));
return traceBackSteps;
}
}
The method is relatively short because it makes use of a bunch of helper methods. I’ve tried to give these methods names that clearly describe what they do. The helper methods are also short, so hopefully they are easy to read.
The result of the method GetShortestPath() is returned in the shape of a RoutingResult<T> instance. This instance should contain the obtained minimum distance between the two Nodes concerned, as well as the series of Nodes to be traversed when travelling this distance. In case no path can be found, the returned value is Double.MaxValue.
To illustrate that one can use different types in the Router<T> class, I created a third C# Class Library containing three types: City, Vertex, and Item. They are very simple:
public class City
{
public string Name { get; set; }
public Point Position { get; set; }
}
public class Vertex
{
public string Name { get; set; }
}
public class Item
{
public string Name { get; set; }
}
These types are used in a WPF application named RoutingTester. In this application three Graphs are created; one of type City (identical to the Graph in the article by Scott Mitchell), one of type Vertex (identical to the example Graph on the Wikipedia page), and one of type Item (identical to the Graph on the video shown above). They are used in the following way:
private void ShortestRouteBetweenCities()
{
Graph<City> cityGraph = CreateCitiesTestGraph();
City from = GetFirstItem(cityGraph);
City to = GetLastItem(cityGraph);
Router<City> router = new Router<City>();
RoutingResult<City> result = router.GetShortestPath(cityGraph, from, to);
StringBuilder stringbuilder = new StringBuilder();
stringbuilder.AppendLine(String.Format("The shortest path from {0} to {1} is {2}" +
" miles and goes as follows: ", from.Name, to.Name, result.TotalDistance));
foreach (City city in result.PathItems)
{
stringbuilder.AppendLine(city.Name);
}
MessageBox.Show(stringbuilder.ToString());
}
private void ShortestRouteBetweenVertices()
{
Graph<Vertex> vertexGraph = CreateVerticesTestGraph();
Vertex from = GetFirstItem(vertexGraph);
Vertex to = GetLastItem(vertexGraph);
Router<Vertex> router = new Router<Vertex>();
RoutingResult<Vertex> result = router.GetShortestPath(vertexGraph, from, to);
StringBuilder stringbuilder = new StringBuilder();
stringbuilder.AppendLine(String.Format("The shortest path from {0} to {1} is {2}" +
" cost units and goes as follows: ", from.Name, to.Name, result.TotalDistance));
foreach (Vertex vertex in result.PathItems)
{
stringbuilder.AppendLine(vertex.Name);
}
MessageBox.Show(stringbuilder.ToString());
}
private void ShortestRouteBetweenItems()
{
Graph<Item> itemGraph = CreateItemsTestGraph();
Item from = GetFirstItem(itemGraph);
Item to = GetLastItem(itemGraph);
Router<Item> router = new Router<Item>();
RoutingResult<Item> result = router.GetShortestPath(itemGraph, from, to);
StringBuilder stringbuilder = new StringBuilder();
stringbuilder.AppendLine(String.Format("The shortest path from {0} to {1} is {2}" +
" cost units and goes as follows: ", from.Name, to.Name, result.TotalDistance));
foreach (Item item in result.PathItems)
{
stringbuilder.AppendLine(item.Name);
}
MessageBox.Show(stringbuilder.ToString());
}
The most relevant lines in these three methods are shown in red. They illustrate that the Router class can indeed be used for different types. The first two examples assume that a connection A – B also implies an equivalent connection B – A. The last example shows that the Router class can also be used for directed Graphs (i.e. Graphs where a connection A – B with a cost of 10 does not automatically imply a connection B - A with the same cost, but could have no connection B – A at all, or have a connection B – A with another cost factor).
For more details please do have a look at the solution attached to this post.
Routing.zip (24.08 kb)