Month: March 2014

Representation of Graphs

Graph is a very versatile data structure. It can be used to solve a large number of problems. You’d be surprised to see how easy the problems become once they are interpreted in terms of a graph.

Before I can start with graph algorithms like DFS, BFS, Disjoint Set Union, Segment trees, etc. it is important to know how a graph can be represented in memory.

I’m assuming that you have knowledge of elementary graph theory and the concepts of node, edge, tree etc.

Two things are essential to a graph: its nodes and edges. As you know, each edge connects two nodes.
For all future references lets take N as the number of nodes and M as the number of edges. Each edge is of the form (a,b) denoting that nodes a and b are connected (1<=a,b<=N ) .
Following are two ways to store a graph. I’ll show you how to implement them:

1. Adjacency Matrix: Use a 2-D array of size N x N. Let’s call this matrix adjmat[N][N]. adjmat[i][j]=1 iff there is an edge from i to j.
For every undirected edge (a,b) in the graph, mark adjmat[a][b]=1 and adjmat[b][a]=1. If you have a directed edge from a to b, then mark only adjmat[a][b]=1.

2. Adjacency List: Use a 1-D array for each node. For every node, store in that array the nodes that it is connected to. So we’ll have N 1-D arrays, one for each node.

Lets take the following graph as example:
graph
The adjacency matrix would be:

adjmat

and the adjacency list would be:
adjlist

The adjacency list is a more efficient way to store information about a graph. It consumes lesser memory and is more time efficient as compared to adjacency matrix. The reason for this is that the adjacency matrix has one element in adjmat[][] for each possible edge in the graph i.e. N^2 possible edges. What if we have a tree? In case of a tree we have only N-1 edges. We don’t need to use N^2 amount of space just to store info about N-1 edges.
In such cases adjacency list is useful. Adjacency list can be implemented using a linked list OR dynamic array. And it consumes only that much space that is required.

Using an adjacency list also leads to lesser number of iterations in later operation (eg. during DFS, BFS, etc.). This is because the number of entries in adjacency list is 2 X M . But in case of adjacency matrix it is always N^2.

Of course in a situation where the number of edges is close to N^2, both methods will perform equally well.

In competitive programming competitions, most of the time a graph can be given as input in two ways. (Assume undirected graph)

1. The problem will give you N, and N lines will follow. In the beginning of ith line will be Di (Degree of ith node) and a list of the nodes that the ith node is connected to.
For the example graph given above, the input would look like:
5
3 2 3 4
1 1
2 1 4
2 1 3
0

2. The problem will give you N and M, and each of the next M lines will contain two integers a and b denoting an edge between nodes a and b.
For the example graph the input would be like this:
5 4
1 2
1 3
1 4
3 4

Both can be handled with either Adjacency Matrix or Adjacency List.

Type-1 with Adjacency Matrix:

int main()
{
    int n,i,j,temp;
    scanf(" %d",&n);
    int adjmat[101][101]={0};//size of this matrix should be (N+1) X (N+1) because we are using 1 based numerbing for nodes
    int degree[101]={0};
    i=1;
    while(i<=n)
    {
        scanf(" %d",&degree[i]);
        j=0;
        while(j<degree[i])
        {
            scanf(" %d",&temp);
            adjmat[i][temp]=1;
            adjmat[temp][i]=1;
            j++;
        }
        i++;
    }
    return 0;
}

Type 1 with adjacency list:

int main()
{
    int n,i,j,temp;
    scanf(" %d",&n);
    int *adjlist[101];
    int degree[101];
    i=1;
    while(i<=n)
    {
        scanf(" %d",&degree[i]);
        adjlist[i]=(int *)malloc(sizeof(int)*degree[i]);
        j=0;
        while(j<degree[i])
        {
            scanf(" %d",&temp);
            adjlist[i][j]=temp;
            j++;
        }
        i++;
    }
    return 0;
}

Type 2 with adjacency matrix:

int main()
{
    int n,m,i,a,b;
    scanf(" %d %d",&n,&m);
    int adjmat[101][101]={0};
    i=0;
    while(i<m)
    {
        scanf(" %d %d",&a,&b);
        adjmat[a][b]=1;
        adjmat[b][a]=1;
        i++;
    }
    return 0;
}

Type2 with adjacency list:

int main()
{
    int n,m,i,a,b;
    scanf(" %d %d",&n,&m);
    int adjlist[101][101]={0};
    int degree[101]={0};
    i=0;
    while(i<m)
    {
        scanf(" %d %d",&a,&b);
        adjlist[a][degree[a]]=b;
        degree[a]++;
        adjlist[b][degree[b]]=a;
        degree[b]++;
        i++;
    }
    return 0;
}

The above implementation for Type 2 with Adjacency List doesn’t use dynamic arrays ( as it would involve a lot of realloc() and would get complicated ). So it still uses a lot of memory. But the advantage of lesser iterations for later operations, as mentioned before, is still there.

If you want to use dynamic arrays, study about C++ STL Vectors. Vectors are dynamic arrays. I recommend that you read the C++ STL documentation for vectors as vectors are incredibly useful and you don’t need to worry about malloc, realloc etc. as that is handled automatically by vectors.
Follow this link. (For now, study about push_back(), size(), pop_back(), begin() and end() only)

In the next post, I’ll write about how to implement DFS and BFS.

Advertisements