date posted: 2020-01-27
History alert! Graph data structure originated from graph theory which is branch of Mathematics. Leonhard Euler, one of great mathematician had published a paper in 1736 about Seven bridges of Konigsberg which is considered first paper of graph theory. Around that time seven bridge of Konigsberg was a notable problem in Mathematics, where the goal of the problem was to find a path that allow you to cross all bridges once and only once.
Below picture shows actual map of city of Konigsberg in Prussia, now Kaliningrad, Russia.
In Leonhard Euler's paper he had proved with mathematic rigor that it was not possible to cross each bridge once and only once.
This is the beginning of graph theory and stepping stone for other branch of mathematics such as Topology.
Graph data structure consists of two elements called vertex and edge. Two main type of graphs, which are undirected graphs where edges have no direction and directed graph where edges have a direction.
Using map of Konigsberg, bridges are considered edges and each land is considered a vertex and since you can cross each bridges in either direction this is an example of undirected graphs.
map of Konigsberg would look like picture below if it is converted to vertices and edges.
Using the picture above lets go over basic terminologies:
Starting from left, it is acyclic, undirected therefore satisfy condition to be a Forest. It is also connected thus it is both Forest and a Tree.
Second, It is Acyclic and undirected however it is not connected thus it is only a Forest.
Lastly, there exist a pentagon in the middle forming a cycle thus it does not satisfy either one.
Finally we are done with terminologies lets see how to represent graph data structures.
Adjacency listFor each vertex it has a list associated with it where elements represent vertex that it points to. In Python dictionary of lists are used to represent this, each key is a vertex and value is list of vertices that key(vertex) is pointing to (directly).
adj_list = { 1: [2, 4], 2: [5], 3: [6,5], 4: [2], 5: [4], 6: [6] }
Code above represent below picture.
Looking at vertex 1, we can see there are two edges going out to vertex 2 and 4 thus adj_list[1] = [2,4]
In undirected graph since vertex 1 contains [2,4] vertex 2 and 4 will contain 1 thus adj_list[2] = [1,4,5] and adj_list[4] = [1, 2, 5]. Note that order of vertex do not matter, adj_list[4] = [2,1,5] is okay as well.
As graph becomes dense, number of edges increase approaching |E| = |V^2| using adjacency list becomes slow since it require twice more space for edges going back and forth. For example if vertex 1 is connected to all 2, 3, 4, 5, 6 then we require twice more space since we need to create list for 2 going to 1, 3 going to 1, and so on...
For weighted graph using adjacency list take up one extra place for storing weight of edges. If edge connecting A -> B has weight of 7, store it beside the vertex
{A: [[B,7], [C, weight]]}
Taking up extra space for weights.
As graph become dense it is more efficient to use adjacency matrix.
Below matrix represents graph in above example. Vertical axis represents vertex and horizontal axis represents vertices that vertical axis' vertex points to.
Vertex 1 points to 2 and 4 thus in adjacency matrix adj_matrix[1][2] = 1 and adj_matrix[1][4] = 1. All other columns are set to 0 since vertex 1 do not point to them.
Looking at vertex 2, it only points to vertex 5 therefore adj_matrix[2][5] = 1 and all other are 0.
It is very straight forward and you can see that no matter it is directed or undirected graph its space complexity will always be V^2.
So for dense graphs adjacency matrix is more suitable and more sparse matrix using adjacency list would be suitable.
For weighted graphs replace 1 with appropriate weights