“Output the answer modulo 10^9 + 7”

You might have noticed that many programming problems ask you to output the answer “modulo 1000000007 (10^9 + 7)”. In this post I’m going to discuss what this means and the right way to deal with this type of question. I should have covered this topic earlier because questions involving it are quite common. Anyway, here it is…

First of all, I’d like to go through some prerequisites.

The modulo operation gives ‘the remainder of a division’. If I say a modulo b is c, it means that the remainder when a is divided by b is c. The modulo operation is represented by the ‘%’ operator in most programming languages (including C/C++/Java/Python). So, 5 % 2 = 1, 17 % 5 = 2, 7 % 9 = 7 and so on.


WHY IS MODULO NEEDED?

The largest integer data type in C/C++ is long long int; it is 64 bits in size and can store integers from ( –2^63 ) to ( +2^63 – 1 ). Integers as large as about 9 x 10^18 can be stored in a long long int.

But in certain problems, for instance when calculating the number of permutations of a size n array, even this large range may prove insufficient. We know that the number of permutations of a size n array is n!. Even for a small value of n, the answer can be very large. E.g., for n=21, the answer is 21!, which is about 5 x 10^19 and too large for a long long int variable to store. This makes calculating values of large factorials difficult.

So, instead of asking for the exact value of the answer, the problem setters ask for the answer modulo some number M, so that the answer still remains in a range that can be stored easily in a variable.

Some languages, such as Java and Python, offer data types that are capable of storing arbitrarily large numbers. But data type size is not the only problem: as the size of a number increases, the time required to perform mathematical operations on it also increases.

There are certain requirements on the choice of M:
1. It should be large, and yet small enough to fit in an int data type.
2. It should be a prime number.
10^9 + 7 satisfies both criteria, which is why you nearly always find 10^9 + 7 in modulo type questions.
I’ve explained the logic behind the 2nd point in NOTES.

HOW TO HANDLE QUESTIONS INVOLVING MODULO:

Some basic knowledge of modulo arithmetic is required to understand this part.

A few distributive properties of modulo are as follows:
1. ( a + b ) % c = ( ( a % c ) + ( b % c ) ) % c
2. ( a * b ) % c = ( ( a % c ) * ( b % c ) ) % c
3. ( a – b ) % c = ( ( a % c ) – ( b % c ) ) % c ( see note )
4. ( a / b ) % c NOT EQUAL TO ( ( a % c ) / ( b % c ) ) % c
So, modulo is distributive over +, * and – but not / .

One observation that I’d like to make here is that, for non-negative a, the result of ( a % b ) is always between 0 and b – 1.

If I were to write the code to find factorial of a number n, it would look something like this:

long long factorial(int n,int M)
{
	long long ans=1;
	while(n>=1)
	{
		ans=(ans*n)%M;
		n--;
	}
	return ans;
}

Notice that I performed the modulo operation at EACH intermediate stage of the multiplication.
Mathematically, it makes no difference whether we first multiply all the numbers and then take the result modulo M, or take modulo M at each stage of the multiplication.
( a * b * c ) % M is the same as ( ( ( a * b ) % M ) * c ) % M

But in computer programs, due to variable size limitations, we avoid the first approach and perform modulo M at each intermediate stage so that range overflow never occurs.
So the following approach is wrong:

long long factorial(int n,int M)//WRONG APPROACH!!!
{
	long long ans=1;
	while(n>=1)
	{
		ans=ans*n;
		n--;
	}
        ans=ans%M;
	return ans;
}

The same procedure can be followed for addition too.
( a + b + c ) % M is the same as ( ( ( a + b ) % M ) + c ) % M
Again we prefer the second way while writing programs. Perform % M every time a number is added so as to avoid overflow.

The rules are a little different for division; this is the main part of this post.
As I mentioned earlier,
( a / b ) % c NOT EQUAL TO ( ( a % c ) / ( b % c ) ) % c which means that modulo operation is not distributive over division.
The following concept is most important to find nCr (ways of selecting r objects from n objects) modulo M. (As an example, I’ve included the code to find nCr modulo M at the end of this post)
To perform division in modulo arithmetic we need to first understand the concept of modulo multiplicative inverse.

Let’s go over some basics first.

The multiplicative inverse of a number y is z iff (z * y) == 1.

Dividing a number x by another number y is the same as multiplying x with the multiplicative inverse of y.

x / y == x * y^(-1) == x * z (where z is multiplicative inverse of y)

In normal arithmetic, the multiplicative inverse of y is y^(-1), which will generally correspond to some fractional value. E.g. the multiplicative inverse of 5 is 0.2, of 3 is 0.333… etc.
But in modulo arithmetic the definition of multiplicative inverse of a number y is a little different. The modulo multiplicative inverse ( MMI ) of a number y is z iff (z * y) % M == 1.

Eg. if M= 7 the MMI of 4 is 2 as ( 4 * 2 ) %7 ==1,
if M=11, the MMI of 7 is 8 as ( 7 * 8 )%11 ==1,
if M=13, the MMI of 7 is 2 as ( 7 * 2 ) % 13==1.
Observe that the MMI of a number may be different for different M.
So, if we are performing modulo arithmetic in our program and we need the result of the operation x / y, we should NOT perform

z=(x/y)%M;

instead we should perform

y2=findMMI(y,M);
z=(x*y2)%M;

Now one question remains: how do we find the MMI of a number n?
The brute force approach would look something like this

int findMMI_bruteforce(int n,int M)
{
	int i=1;
	while(i<M)// we need to go only up to M-1
	{
		if(( (long long)i * n ) % M ==1)
			return i;
		i++;
	}
	return -1;//MMI doesn't exist
}

The complexity of this approach is O(M) and as M is commonly equal to 10^9 + 7, this method is not efficient enough.

There exist two other algorithms to find the MMI of a number: the first uses the Extended Euclidean algorithm, and the second uses Fermat’s Little Theorem.

If you are new to modulo arithmetic, you’ll probably not find these topics easy to understand.
These algorithms require prerequisites. If you want to read them they are very well explained on many online sources and some of you will study them in depth in your college’s algorithm course too.
I’ll keep this post simple for now and only give the code that uses Fermat’s Little Theorem.

int fast_pow(long long base, long long n,long long M)
{
    base%=M; // reduce base first so that intermediate products never overflow
    if(n==0)
       return 1;
    if(n==1)
        return base;
    long long halfn=fast_pow(base,n/2,M);
    if(n%2==0)
        return ( halfn * halfn ) % M;
    else
        return ( ( ( halfn * halfn ) % M ) * base ) % M;
}
int findMMI_fermat(int n,int M)
{
	return fast_pow(n,M-2,M);
}

This code uses a function fast_pow() that calculates the value of base^n modulo M. Its complexity is O(log n). It is a very efficient method of computing the power of a number.
It is based on the fact that to find a^n, we just need to find a^(n/2): the required answer will be a^(n/2) * a^(n/2) if n is even, and a^((n-1)/2) * a^((n-1)/2) * a if n is odd ( when n is odd, n/2 == (n-1)/2 in most programming languages because of integer division ).

NOTE:

1. It is really important for M to be a prime number. If it is not prime, then the result of a modulo operation may become 0 even when neither operand is 0. E.g. if M=12 and we perform ( 8 * 3 ) % 12, we’ll get 0. But if M is prime then ( ( a % M ) * ( b % M ) ) % M can never be 0 (unless a or b is 0 or a multiple of M).
[EDIT: Remember that in programming contests, M is greater than all the other values provided. So a case like (14*3)%7 can never occur. ]

2. If M is prime then we can find MMI for any number n such that 1<=n<M
3. ( a – b ) % c = ( ( a % c ) – ( b % c ) ) % c is fine mathematically. But, while programming, don’t use

// if b%c > a%c, you'll get a negative (wrong) result this way 
a=(a%c);
b=(b%c);
ans = ( a - b ) % c; 

instead use

a=a%c;
b=b%c;
ans =  ( a - b + c ) % c;

% works differently with negative numbers: in C/C++, when the left operand is negative the result of % can be negative, so adding c before taking % c keeps the answer in the range [0, c-1].

IMPORTANT:
1. If n1, n2 are int type variables and M=10^9+7, then the result of ( n1 * n2 ) % M will surely be < M ( and capable of fitting in a simple int variable). BUT the intermediate value ( n1 * n2 ) can be greater than the capacity of an int variable, and it is computed first. So, to avoid overflow, either declare n1 and / or n2 as long long int OR use explicit type casting: ( ( long long ) n1 * n2 ) % M.

As an example, here is the code to find nCr modulo 1000000007, (0<=r<=n<=100000)


#include<stdio.h>

int fast_pow(long long base, long long n,long long M)
{
    base%=M; // reduce base first so that intermediate products never overflow
    if(n==0)
       return 1;
    if(n==1)
        return base;
    long long halfn=fast_pow(base,n/2,M);
    if(n%2==0)
        return ( halfn * halfn ) % M;
    else
        return ( ( ( halfn * halfn ) % M ) * base ) % M;
}
int findMMI_fermat(int n,int M)
{
    return fast_pow(n,M-2,M);
}
int main()
{
    long long fact[100001];
    fact[0]=1;
    int i=1;
    int MOD=1000000007;
    while(i<=100000)
    {
        fact[i]=(fact[i-1]*i)%MOD;
        i++;
    }
    while(1)
    {
        int n,r;
        printf("Enter n: ");
        scanf(" %d",&n);
        printf("Enter r: ");
        scanf(" %d",&r);
        long long numerator,denominator,mmi_denominator,ans;
        //I declared these variables as long long so that there is no need to use explicit typecasting
        numerator=fact[n];
        denominator=(fact[r]*fact[n-r])%MOD;
        mmi_denominator=findMMI_fermat(denominator,MOD);
        ans=(numerator*mmi_denominator)%MOD;
        printf("%lld\n",ans);
    }
    return 0;
}

Feedback

I have posted a total of ten algorithm/programming related topics on CodeAccepted now. I’d like to get your feedback on my posts so far. My aim in starting this blog was that programmers who are new to competitive programming could benefit from it; I’d like to know how successful or unsuccessful I’ve been.
Your feedback is very valuable. You may choose to remain anonymous.

Click FEEDBACK FORM to view the form.

Algorithm #6: Backtracking

I think we are ready to discuss Backtracking now. Last time I posted about recursion; I hope it’ll help you with this topic.
When I told you about Bit Masking, I said that it is a brute force approach to solve the Subset Sum problem. And better algorithms than Bit Masking exist. Backtracking is one of those methods.

In my post about Bitmasking I discussed the Subset Sum problem, which is an NP complete problem. In a nutshell, NP complete problems are a class of computational problems for which no efficient (polynomial time) solution is known; for large inputs, all known algorithms take impractically long. The complexity of Bitmasking is O(2^n) and it becomes useless at n>25 due to the dramatic increase in the number of subsets that need to be analysed.

In subset sum problem, we are given a set of positive numbers. We are asked if it is possible to find a subset of this set such that the sum of numbers of the selected subset is exactly m ( a positive number).
Backtracking can be viewed as an attempt to improve the Bitmasking algorithm. Just to remind you, in Bitmasking we analyse all the possible subsets of the given set to find a possible solution subset.
But in backtracking, we will intelligently reject the subsets that we know for sure will not lead to a solution.

For example, suppose we have n=5 and the set is {5,31,3,7,6}. We have to select a subset such that the sum of numbers of the selected subset is 11. Now it is obvious that no subset containing the number 31 can have a sum of 11. So, no subset involving the number 31 will lead to a solution. So, we should not consider those subsets that have 31 in them. We should not waste our time analyzing those subsets. This is the principle of Backtracking. As soon as we come to know that selecting 31 will not lead to a solution, we do not continue analysing subsets with 31 in them.

Below is the code to solve the subset sum problem using backtracking:
backtrack() is a function that takes the set of numbers ( nums[ ] ), the index of the number currently being considered ( at ), the total number of elements ( n ), and the sum that we require ( sumrequired ).
It returns 1 if it is possible to obtain a sum of exactly sumrequired, and 0 if it is not possible to do so.

int backtrack(int nums[],int at,int n,int sumrequired)
{
    if(sumrequired==0)
        return 1;
    if(sumrequired<0)
        return 0;
    if(at==n)
        return 0;
    if(backtrack(nums,at+1,n,sumrequired-nums[at])==1)
        return 1;
    if(backtrack(nums,at+1,n,sumrequired)==1)
        return 1;
    return 0;
}

I mentioned in my post about recursion that programs implemented using recursion can be incredibly short and incredibly difficult to understand. This is an example of that!

Following are the important points about this algorithm:

1. The at variable tells which number’s fate we are deciding. If in a recursive instance at is i, it means that we are currently deciding the fate of the ith number.
2. If the sumrequired variable is 0, it means that we do not need to select any more numbers, so we return the success code (which is 1).
3. A negative sum of numbers can never be achieved because all numbers are positive; so if sumrequired is negative, we return the failure code (which is 0).
4. The first recursive call represents the case where we select the current number ( nums[at] ). When we select the current number, our problem reduces to selecting numbers from the set {nums[at+1], nums[at+2],…, nums[n-1]} such that their sum is ( sumrequired – nums[at] ).
5. The second recursive call represents the case where we don’t select the current number ( nums[at] ). If we don’t select the current number, our problem reduces to selecting numbers from the set {nums[at+1], nums[at+2],…, nums[n-1]} such that their sum is sumrequired itself.

I defined in my previous post that recursion is when we decompose a problem into a smaller problem.
See points 4 and 5 carefully. I am reducing my current problem of finding a solution subset from the set {nums[at], nums[at+1],…, nums[n-1]} to finding a solution subset from the set {nums[at+1], nums[at+2],…, nums[n-1]}.
Whenever I find a solution (i.e. when sumrequired == 0), all recursive calls will terminate one by one and 1 will be returned at each level.

At any time, if sumrequired becomes less than 0, it means that the numbers we have selected till now already have a sum greater than the initial required sum. So there’s no point in going further. Therefore, we return 0, which denotes that the current choice of elements will not lead to any solution.

In the test case I discussed before (with n=5, the set of numbers as {5,31,3,7,6} and sumrequired=11),
if we select the number 31, the value of sumrequired will become < 0 in the next recursive call. That recursive instance will immediately return 0 ( see point 3 ); it will not go on deciding the fate of the rest of the numbers. In this way we simply discard the subsets that include the number 31.

The worst case complexity of this algorithm is the same as Bitmasking, i.e. O(2^n). But it offers a better execution time in practice as it deliberately and intelligently skips some subsets.

PROBLEMS:
PAYING UP (on Codechef)

As it is with all recursive solutions, the code may seem overwhelming at first. I suggest that you look at the order of recursive calls I showed in the last post and try to trace the way recursive calls are made in this case.

Revising Recursion

I was going to write about Backtracking this time. But then I thought that it would be better to post about recursion first as it is a very important weapon. Algorithms such as Backtracking, Depth First Search, Disjoint Set Union (also known as Union Find), Segment trees and many other graph algorithms are best implemented using recursion.

Recursion is when a problem is defined in terms of smaller instances of the same problem. A very well known example is the factorial of a number. We can define factorial of a number in two ways.
First is the iterative way,
factorial(n)= product(i=1 to n) ( i )

The second way is the recursive way,
factorial(n) = n * factorial(n-1), with factorial(0) = 1 as the base case.

Of course, both the definitions lead to the same values. In the recursive definition, the factorial of n is shown in terms of factorial of (n-1) (i.e. in terms of a smaller factorial problem).

Programs implemented using recursion can be incredibly short and at the same time incredibly difficult to understand. So it’s essential that some properties of recursion are understood properly before starting the algorithms I mentioned in the start of my post.

Let’s look at the code for the factorial of a positive number, n. The code is pretty straightforward.

long long factorial(long long n)
{
  if(n==0)
    return 1;
  long long a=factorial(n-1);
  return n*a;
}

A recursive function is one that can call itself.

When a function is called, we say that an instance of that function is created.

Here, factorial() is a recursive function. Note that it has a call to itself in its body; that call to factorial(n-1) is what we call a recursive call.

When a function calls itself, a different instance of that function is created whose variables are completely different from those of the calling function. E.g. if I want to find 4!, I will call factorial(4). factorial(4) will in turn call factorial(3). The variables ‘n’ and ‘a’ are different for both these instances of factorial(), i.e. the variables have different locations in memory and store different values. For the first one n=4 and for the second one n=3.

All instances of factorial() will execute completely independent of each other. One instance of factorial doesn’t care about the value of ‘n’ or any other variable in other instances of factorial().

For an instance of a recursive function to terminate, all its child instances should first terminate. Eg. For factorial(4) to terminate, first its child instance i.e. factorial(3) should terminate.

[Image: recursion_factorial — the chain of recursive calls from factorial(4) down to factorial(0)]

Recursion depth:

An instance of a recursive function is said to be at depth d, if there are d instances of that function present above it.
In the previous example, factorial(4) is at depth 0, factorial(3) is at depth 1, factorial(2) is at depth 2, factorial(1) is at depth 3 and factorial(0) is at depth 4.

Consider the following code snippet:

void func(int depth)
{
  if(depth==10)
  {
    printf("reached depth of %d, now returning back\n",depth);
    return;
  }
  printf("at depth %d\n",depth);
  func(depth+1);
  printf("returning from depth of %d\n",depth);
}
int main()
{
  func(0);
  return 0;
}

In this code, we are limiting the maximum depth to which func() can be recursively called to 10.
Keep in mind that each instance of func() has its own ‘depth’ variable.
Notice how we are passing an incremented value of depth in the recursive call func(depth+1).
This is a very common technique for incrementing values across recursive calls.

An infinite recursion is also possible which is just like an infinite loop. Suppose I forgot to increment the depth while passing it:

void func(int depth)
{
  if(depth==10)
  {
    printf("reached depth of %d, now returning back\n",depth);
    return;
  }
  printf("at depth %d\n",depth);
  func(depth);//notice the lack of +1 here
  printf("returning from depth of %d\n",depth);
}
int main()
{
  func(0);
  return 0;
}

In theory this code will run until eternity as the condition that ends the recursion is never met; in practice, each call consumes stack space, so the program will eventually crash with a stack overflow.
Okay, so this covers the basic terminology and a few techniques.

As an advanced example, let’s write a program to find all length n permutations of the digits 1,2,3 using recursion.

#include<stdio.h>
int permutationcount=0;
void permute(int atdepth, int n, int selections[])
{
  if(atdepth==n)
  {
    int i;
    printf("Permutation %d: ",permutationcount);
    for(i=0;i<n;i++)
      printf("%d ",selections[i]);
    printf("\n");
    permutationcount++;
    return;
  }
  selections[atdepth]=1;
  permute(atdepth+1,n,selections); // select 1 for current position
  selections[atdepth]=2;
  permute(atdepth+1,n,selections); // select 2 for current position
  selections[atdepth]=3;
  permute(atdepth+1,n,selections); // select 3 for current position
  return;
}
int main()
{
  int a[1000],n;
  scanf(" %d",&n);
  permute(0,n,a);
  return 0;
}

See the following image to understand the order of function calls:
[Image: recursion_permute — the tree of recursive calls made by permute()]

A few important observations:
1. An instance of the recursive function permute() at depth d sets the value of selections[d], i.e. the (d+1)th position in the final permutation. And then calls permute() again to compute the rest of the permutation.
2. Each instance of permute() contains three recursive calls within it. Each one represents a different selection of number for the current position.
3. When a depth of n is reached, n numbers have been set in the selections[] array. And the array is printed.
4. A recursion depth of n is reached 3^n times. Because there are 3^n permutations possible.
5. Global variables are preserved among all the recursive instances of permute(). The variable permutationcount is used to keep track of the number of permutations.

For a particular instance of permute(), the second recursive call will be executed only when the first call (along with all the subsequent calls that arise from inside it) completes execution.
And the third recursive call will be executed only when the first two calls (along with all their respective subsequent calls) complete execution.

Recursion may seem overwhelming at first and it may take time for you to get used to it. But once you get the hang of it, there’s nothing more wonderfully magical in programming than recursive functions. Don’t give up on them just because they seem difficult at first sight.

Algorithm #5: Binary Search and its relatives

I think most of you already know about binary search. It is an algorithm that is used to search an element in an array. Its complexity is O(log n) as opposed to O(n) for linear search, so it’s a much better way to search an array. This time I’ll first revise binary search and then I’ll show how simple modifications in the basic binary search algorithm can produce interesting variants.

1:  BINARY SEARCH

Assume that we have got an array of n integers, and we have to run many queries on the array. In each query, we are given a number ( named targetnum ) and we have to tell if the number is present in the array or not. We’ll use binary search to solve this.

The first step of this algorithm is to sort the array. At first look, it may seem that sorting will consume a lot of time and will contribute to overhead. But, remember that we have to run MANY queries on the array. This means that sorting once will give advantage for each of the many queries. So, ultimately the overhead contributed by sorting is much less than the advantage it gives.

Now, when we start searching, targetnum can be present anywhere in the sorted array. We take a look at the integer present in the middle of the array. If this middle integer is targetnum itself, then our search terminates. If this middle integer is greater than targetnum, then targetnum, if present at all, must be somewhere in the first half of the array (i.e. in [first,mid) ), because the array is sorted. And if the middle integer is less than targetnum, then targetnum, if present, must be in the second half of the array (i.e. in (mid,last] ).

The potential space ( i.e. the space where targetnum can be present ) is, therefore, reduced to half in every step.

We repeat this process until either targetnum is found or the potential space becomes empty. If the space becomes empty and we still have not found targetnum, we can conclude that it is not present in the array.

In form of C code, this algorithm will look something like:
binarysearch() is a function that takes the array, the size of the array and the integer to be searched as arguments.
If there is even a single instance of targetnum in the array then 1 will be returned, otherwise 0 will be returned.

int binarysearch(int *a,int n,int targetnum)
{
  int start=0, end=n-1, mid; 
  while(start<=end)
  {
    mid=start+(end-start)/2; // equivalent to (start+end)/2 but avoids overflow of start+end for very large arrays
    if(a[mid]==targetnum) // if the middle element is the required number, then search terminates
    {
      break;
    }
    else if(a[mid]>targetnum)
    {
      end=mid-1; // restrict search to first half
    }
    else if(a[mid]<targetnum)
    {
      start=mid+1; // restrict search to second half
    }
  }
  if(start<=end)
    return 1;
  else
    return 0;
}

This algorithm halves the potential space in every iteration. So, it takes a maximum of log2(n) iterations to complete the search of one element, as opposed to n iterations per query for linear search.

2,3: LOWER BOUND AND UPPER BOUND

A variant of binary search can be used to find the lower bound and upper bound of an element (say, targetnum ) in a sorted array.

Lower Bound: The lower bound of targetnum is the position of the first element in the sorted array that is not less than targetnum.

Upper Bound: The upper bound of targetnum is the position of the first element in the sorted array that is greater than targetnum.

Explanation: It is possible that targetnum may have no instance, a single instance or multiple instances in the array.
If there is only one instance of targetnum present in the array, lower bound will be the position of that instance, and upper bound will be the position of that instance + 1.

If there are multiple instances of targetnum present in the array, lower bound will be the position of the first instance, and upper bound will be the position of the last instance + 1.

If there is no instance of targetnum in the array, lower bound and upper bound both will be the position of first integer greater than targetnum.

So, if the sorted array is {1,4,5,5,5,6,9,9,10}, then the lower bound and upperbound of each integer will be:

Number   Lower Bound   Upper Bound
  0           0             0
  1           0             1
  2           1             1
  3           1             1
  4           1             2
  5           2             5
  6           5             6
  7           6             6
  8           6             6
  9           6             8
 10           8             9
 11           9             9

(Note that in case of 11, the lower bound and upper bound are 9, i.e. last index of array + 1. This is so because 11 is greater than all the elements of array)
The code for finding lower bound:
lowerbound() is a function that takes the array, the size of the array and the element whose lower bound has to be found as arguments and returns the lower bound of the given element.

int lowerbound(int *a,int n,int targetnum)
{
    int first=0,last=n-1,mid;
    while(first<last)
    {
        mid=(first+last)/2;
        if(a[mid]==targetnum)
        {
            last=mid;
        }
        else if(a[mid]>targetnum)
        {
            last=mid-1;
        }
        else if(a[mid]<targetnum)
        {
            first=mid+1;
        }  
    }
    if(a[first]>=targetnum)
        return first;
    return first+1;
}

The main difference between binary search and this algorithm is what happens when the condition a[mid]==targetnum is true. In the binary search algorithm, if a[mid]==targetnum is true, the search ends (as the required number was found). But here, we are instead interested in the first instance of targetnum, so we continue our search by setting last=mid.
If there is an instance of targetnum present at position mid, then the first instance of targetnum must be in the interval [first,mid].

After the while loop ends, the required position is either the value in first or first+1.

The code for finding upper bound is:
upperbound() is a function that takes the array, the size of the array and the element whose upper bound has to be found as arguments and returns the upper bound of that element.

int upperbound(int *a,int n,int targetnum)
{
    int first=0,last=n-1,mid;
    while(first<last)
    {
        mid=(first+last)/2;
        if(a[mid]==targetnum)
        {
            first=mid+1;
        }
        else if(a[mid]>targetnum)
        {
            last=mid-1;
        }
        else if(a[mid]<targetnum)
        {
            first=mid+1;
        }
    }
    if(last>=0 && a[last]>targetnum)
        return last;
    return last+1;
}

The main difference between this algorithm and binary search is, again, what happens when the condition a[mid]==targetnum is true. Here, we are interested in the last occurrence of targetnum, so we continue the search with first=mid+1.
So, if one instance of targetnum is present at mid, then the last instance must be present somewhere in the interval [mid,last].

After the while loop ends, the required position is either the value in last or last+1.

APPLICATION:
One of the possible interesting applications of upperbound() and lowerbound() is:
If we are told to find the number of occurrences of a given integer (say, targetnum) in an array, we can use lowerbound() and upperbound() to find the answer.
The answer will be upperbound(a,n,targetnum)-lowerbound(a,n,targetnum).
Convince yourself that this query is correct by consulting the definitions of upper bound and lower bound given earlier in this post.
This query will have time complexity O(log n) as both the functions take about log n iterations each to find the answer.

DEEPER INSIGHT:
1. For the ones who observed the a[mid]==targetnum case of upperbound() closely: there are two reasons why I used first=mid+1 instead of first=mid.
Firstly, using first=mid would lead to an infinite loop in some cases (e.g. when first=1 and last=2, mid would stay equal to first forever).
Secondly, we are interested in (the position of the last instance of targetnum) + 1, so it’s alright if targetnum itself is no longer present in the active range.

4:
There is another type of binary search, one that is not done on any array. There are situations where you need to optimize a certain parameter: you know the minimum and maximum value the parameter can take, but there seems to be no way other than iterating over all the possible values to find the optimal one. In certain cases, this brute force iteration can be avoided and binary search can be used to solve the problem. One recent example of this type is the “PROBLEM A: ARROWS AND QUIVERS” question at the ACM ICPC 2013 Amritapuri site.
Discussing this topic will make this post complicated and much too lengthy. I’ll write about it in my future posts.
ACM ICPC 2013 Amritapuri questions
Solution to problem A

QUESTIONS ON BINARY SEARCH:

Am I A Fibonacci Number (Simple)
Codeforces:Eugeny and Play List (Easy)
Codeforces:Primes on Interval (Medium)

If you feel that some point wasn’t explained properly, please comment below and I’ll try to fix it.

Algorithm #4: Bit Masking

First of all, I’d like to apologize for not having posted anything at all for so many weeks. I’ll try to be fairly regular from now on.

The topic that I am going to discuss now is called Bit Masking. This technique is useful when you have a set and have to select a subset from it that satisfies a particular condition.  The following would be a nice example. It is called the ‘Subset Sum Problem’.

You have n coins with you; each coin has a particular value. You want to select some coins such that the sum of value of the selected coins is m.

So, each coin can have one of two fates: either it is selected or it is not. Let’s solve this problem for n=4, m=10 and the coin values 2,3,5,7. There are a total of 2^4=16 subsets possible. In other words, there are 16 ways of selecting coins.

Let me draw your attention to an interesting fact. When you count from 0 to 15 in binary, you obtain all the possible length 4 permutations of 0s and 1s. We shall use this fact to generate all the subsets of {2,3,5,7}: each of the numbers from 0 to 15 will represent a different subset of the set {2,3,5,7}. Reading the 4-bit number from left to right, if the ith bit is 1 then the ith number of the set is selected, otherwise it is rejected. For instance, the number 14 (1110 in binary) represents the subset {2,3,5}, the number 10 (1010 in binary) represents the subset {2,5}, 15 (1111) represents {2,3,5,7} and 0 (0000) represents {}. (The opposite convention, mapping the least significant bit to the first element, works just as well; the code below uses that convention, and either way every subset is generated exactly once.)

Just for clarity, I’m listing below the subset that each number represents:

Initial set= {2,3,5,7}

Number   Its Binary   Subset it represents
  0        0000       {}
  1        0001       {7}
  2        0010       {5}
  3        0011       {5,7}
  4        0100       {3}
  5        0101       {3,7}
  6        0110       {3,5}
  7        0111       {3,5,7}
  8        1000       {2}
  9        1001       {2,7}
 10        1010       {2,5}
 11        1011       {2,5,7}
 12        1100       {2,3}
 13        1101       {2,3,7}
 14        1110       {2,3,5}
 15        1111       {2,3,5,7}

So, to find the answer for the initial problem, all we have to do is iterate a variable (say, i) from 0 to 15; during each iteration find the subset that i represents and then check if it satisfies the condition.

The code will look something like

#include<stdio.h>
int main()
{
    int n,coins[10000],j;
    long long m;
    scanf(" %d %lld",&n,&m);
    long long last=(1LL<<n)-1; // generalizing, we have to iterate from 0 to 2^n - 1; 1LL avoids overflowing the int literal 1 when n >= 31
    long long i,temp,currentsum;
    i=0;
    while(i<n)
    {
        scanf(" %d",&coins[i]);
        i++;
    }
    for(i=0;i<=last;i++)
    {
        temp=i; //stored i in a different variable to save it from modification
        currentsum=0; // to keep track of the sum of values of selected coins
        j=0;
        /*-----SUBSET GENERATION PROCESS START-----*/
        printf("Checking subset : { ");
        while(j<n) // for each bit of temp
        {
            if(temp%2==1) //examining the last bit (LSB) of temp. If it is 1, we take the coin
            {
                printf("%d ",coins[j]);
                currentsum+=(coins[j]);
            }
            temp>>=1; //right shifting temp, so that we may analyze the next least significant bit in the next iteration of while loop
            j++;
        }
        /*-----SUBSET GENERATION PROCESS OVER-----*/
        printf("} Sum= %lld",currentsum);
        /*-----CONDITION EVALUATION FOR THE CURRENT SUBSET-----*/
        if(currentsum==m)
        {
            printf(" Found a solution subset");
            //break;
        }
        /*-----CONDITION EVALUATION OVER-----*/
        printf("\n");
    }
    return 0;
}

The body of the outer ‘for’ loop can be changed according to the problem at hand. In this case we just needed the sum of each subset, so I kept a running total of the values of the selected coins; once the subset’s sum is computed, we check whether it equals m. I’ve added printf calls at various places in the code to make things clearer. Try it out by running it on your machine.

There is an obvious drawback to this method: since we examine all possible subsets, it is a brute-force approach, and for n greater than about 25 it becomes very slow. We also have the limitation that we cannot have an integer wider than 64 bits in C/C++, so the method cannot work at all for n greater than 64 (though, as mentioned earlier, it is already impractical at around n=25).

No efficient algorithm is known for the subset sum problem. We may optimize our algorithm in many different ways, but every currently known algorithm slows down very quickly as n increases. Bit masking is just one of the possible approaches; there are comparatively more efficient ones, but they are too complex to discuss here.

In a nutshell, this method is very easy to code and works well in programming competitions when n is small enough. There is an alternative method, very popular in competitive programming, called ‘Backtracking’, which I’ll discuss in the future.

If you want to practice bit masking then follow these links:

http://www.codechef.com/problems/MARCHA1 (Simple)

http://www.codechef.com/problems/LEALCO (Medium difficulty)

NOTE:
1. For those who are comfortable with complexity notation, the computational complexity of this algorithm is O(n * 2^n).
2. The two lines inside the if block that select a coin (the printf and the currentsum update) could just as well have been

printf("%d ",coins[n-j-1]);
currentsum+=(coins[n-j-1]);

because it doesn’t matter whether you start counting the bits from the left or from the right.

I want to share a nice question that I came across on the internet. It doesn’t require knowledge of any algorithm so even beginners can attempt it.
There is an array of size n, a[n]. The objective is to construct another array of size n; let’s call it b[n].
The array b should be such that b[i] (0<=i<n) contains the product of all elements of the array a, except the ith element (i.e. except a[i]).
For example, if array a is
4 3 5 1 10
then the array b will be
150 200 120 600 60
Simple, right? Here’s the catch: you cannot use the division operator. n can be very large, so your solution must be efficient enough to compute the array b in a reasonable amount of time. An O(n) solution would work.
Try it!