“Output the answer modulo 10^9 + 7”

You might have noticed that many programming problems ask you to output the answer “modulo 1000000007 (10^9 + 7)”. In this post I’m going to discuss what this means and the right way to deal with this type of questions. I should have covered this topic earlier because questions involving this are not uncommon. Anyways, here it is…

First of all, I’d like to go through some prerequisites.

The modulo operation is the same as ‘ the remainder of the division ’. If I say a modulo b is c, it means that the remainder when a is divided by b is c. The modulo operation is represented by the ‘%’ operator in most programming languages (including C/C++/Java/Python). So, 5 % 2 = 1, 17 % 5 = 2, 7 % 9 = 7 and so on.


WHY IS MODULO NEEDED..

The largest integer data type in C/C++ is the long long int; its size is 64 bits and can store integers from ( –2^63 ) to ( +2^63 -1 ) . Integers as large as 9 X 10^18 can be stored in a long long int.

But in certain problems, for instance when calculating the number of permutations of a size n array, even this large range may prove insufficient. We know that the number of permutations of a size n array is n!. Even for a small value of n, the answer can be very large. Eg, for n=21, the answer is 21! which is about 5 x 10^19 and too large for a long long int variable to store. This makes calculating values of large factorials difficult.

So, instead of asking the exact value of the answer, the problem setters ask the answer modulo some number M; so that the answer still remain in the range that can be stored easily in a variable.

Some languages such as Java and Python offer data types that are capable of storing infinitely large numbers. But data type size is not the only problem. As the size of the number increases the time required to perform mathematical operations on them also increases.

There are certain requirements on the choice of M:
1. It should just be large enough to fit in an int data type.
2. It should be a prime number.
10^9 + 7 fits both criteria; which is why you nearly always find 10^9 + 7 in modulo type questions.
I’ve explained the logic behind the 2nd point in NOTES.

HOW TO HANDLE QUESTIONS INVOLVING MODULO:

Some basic knowledge of modulo arithmetic is required to understand this part.

A few distributive properties of modulo are as follows:
1. ( a + b ) % c = ( ( a % c ) + ( b % c ) ) % c
2. ( a * b ) % c = ( ( a % c ) * ( b % c ) ) % c
3. ( a – b ) % c = ( ( a % c ) – ( b % c ) ) % c ( see note )
4. ( a / b ) % c NOT EQUAL TO ( ( a % c ) / ( b % c ) ) % c
So, modulo is distributive over +, * and – but not / .

One observation that I’d like to make here is that the result of ( a % b ) will always be less than b.

If I were to write the code to find factorial of a number n, it would look something like this:

long long factorial(int n,int M)
{
	long long ans=1;
	while(n>=1)
	{
		ans=(ans*n)%M;
		n--;
	}
	return ans;
}

Notice that on line 6, I performed the modulo operation at EACH intermediate stage.
It doesn’t make any difference if we first multiply all numbers and then modulo it by M, or we modulo at each stage of multiplication.
( a * b * c ) % M is the same as ( ( ( a * b ) % M ) * c ) % M

But in case of computer programs, due to size of variable limitations we avoid the first approach and perform modulo M at each intermediate stage so that range overflow never occurs.
So the following approach is wrong:

long long factorial(int n,int M)//WRONG APPROACH!!!
{
	long long ans=1;
	while(n>=1)
	{
		ans=ans*n;
		n--;
	}
        ans=ans%M;
	return ans;
}

The same procedure can be followed for addition too.
( a + b + c ) % M is the same as ( ( ( a + b ) % M ) + c ) % M
Again we prefer the second way while writing programs. Perform % M every time a number is added so as to avoid overflow.

The rules are a little different for division. This is the main part of this post;
As I mentioned earlier,
( a / b ) % c NOT EQUAL TO ( ( a % c ) / ( b % c ) ) % c which means that modulo operation is not distributive over division.
The following concept is most important to find nCr (ways of selecting r objects from n objects) modulo M. (As an example, I’ve included the code to find nCr modulo M at the end of this post)
To perform division in modulo arithmetic we need to first understand the concept of modulo multiplicative inverse.

Lets go over some basics first.

The multiplicative inverse of a number y is z iff (z * y) == 1.

Dividing a number x by another number y is same as multiplying x with the multiplicative inverse of y.

x / y == x * y^(-1) == x * z (where z is multiplicative inverse of y)

In normal arithmetic, the multiplicative inverse of y is y^(-1); which will correspond to some float value. Eg. Multiplicative inverse of 5 is 0.2, of 3 is 0.333… etc.
But in modulo arithmetic the definition of multiplicative inverse of a number y is a little different. The modulo multiplicative inverse ( MMI ) of a number y is z iff (z * y) % M == 1.

Eg. if M= 7 the MMI of 4 is 2 as ( 4 * 2 ) %7 ==1,
if M=11, the MMI of 7 is 8 as ( 7 * 8 )%11 ==1,
if M=13, the MMI of 7 is 2 as ( 7 * 2 ) % 13==1.
Observe that the MMI of a number may be different for different M.
So, if we are performing modulo arithmetic in our program and we need the result of the operation x / y, we should NOT perform

z=(x/y)%M;

instead we should perform

y2=findMMI(y,M);
z=(x*y2)%M;

Now one question remains.. How to find the MMI of a number n.
The brute force approach would look something like this

int findMMI_bruteforce(int n,int M)
{
	int i=1;
	while(i<M)// we need to go only uptil M-1
	{
		if(( (long long)i * n ) % M ==1)
			return i;
		i++;
	}
	return -1;//MMI doesn't exist
}

The complexity of this approach is O(M) and as M is commonly equal to 10^9 + 7, this method is not efficient enough.

There exist two other algorithms to find MMI of a number. First is the Extended Eucledian algorithm and the second using Fermat’s Little Theorem.

If you are new to modulo arithmetic, you’ll probably not find these topics easy to understand.
These algorithms require prerequisites. If you want to read them they are very well explained on many online sources and some of you will study them in depth in your college’s algorithm course too.
I’ll keep this post simple for now and only give the code using Fermat Little Theorem.

int fast_pow(long long base, long long n,long long M) 
{
    if(n==0)
       return 1;
    if(n==1)
	return base;
    long long halfn=fast_pow(base,n/2,M);
    if(n%2==0)
        return ( halfn * halfn ) % M;
    else
        return ( ( ( halfn * halfn ) % M ) * base ) % M;
}
int findMMI_fermat(int n,int M)
{
	return fast_pow(n,M-2,M);
}

This code uses a function fast_pow() that is used to calculate the value of base^p. Its complexity is O(log p). It is a very efficient method of computing power of a number.
It is based on the fact that to find a^n, we just need to find a^(n/2) and the required answer will be a^(n/2) * a^(n/2) if n is even; and a^((n-1)/2) * a^((n-1)/2) * a if n is odd ( if n is odd n/2 == (n-1)/2 in most programming languages ).

NOTE:

1. For M to be a prime number is really important. Because if it is not a prime number then it is possible that the result of a modulo operation may become 0. Eg. if M=12 and we perform ( 8 * 3 ) % 12, we’ll get 0. But if M is prime then ( ( a % M ) * ( b % M ) ) % M can never be 0 (unless a or b == 0)
[EDIT: Remember that in programming contests, M is greater than all the other values provided. So a case like (14*3)%7 can never occur. ]

2. If M is prime then we can find MMI for any number n such that 1<=n<M
3. ( a – b ) % c = ( ( a % c ) – ( b % c ) ) %c is fine mathematically. But, while programming, don’t use

// if b&gt;a, you'll get wrong result this way 
a=(a%c);
b=(b%c);
ans = ( a - b ) % c; 

instead use

a=a%c;
b=b%c;
ans =  ( a - b + c ) % c;

% works differently with -ve numbers

IMPORTANT:
1. If n1,n2 are int type variables and M=10^9+7, then the result of ( n1 * n2 ) % M will surely be < M ( and capable of fitting in a simple int variable). BUT the value of ( n1 * n2 ) can be greater than the capacity of an int variable. Internally, first ( n1 * n2 ) is computed. So, to avoid overflow either declare n and / or m as long long int OR use explicit type casting (( long long ) n * m ) % M.

As an example, here is the code to find nCr modulo 1000000007, (0<=r<=n<=100000)


int fast_pow(long long base, long long n,long long M)
{
    if(n==0)
       return 1;
    if(n==1)
    return base;
    long long halfn=fast_pow(base,n/2,M);
    if(n%2==0)
        return ( halfn * halfn ) % M;
    else
        return ( ( ( halfn * halfn ) % M ) * base ) % M;
}
int findMMI_fermat(int n,int M)
{
    return fast_pow(n,M-2,M);
}
int main()
{
    long long fact[100001];
    fact[0]=1;
    int i=1;
    int MOD=1000000007;
    while(i&lt;=100000)
    {
        fact[i]=(fact[i-1]*i)%MOD;
        i++;
    }
    while(1)
    {
        int n,r;
        printf(&quot;Enter n: &quot;);
        scanf(&quot; %d&quot;,&amp;n);
        printf(&quot;Enter r: &quot;);
        scanf(&quot; %d&quot;,&amp;r);
        long long numerator,denominator,mmi_denominator,ans;
        //I declared these variable as long long so that there is no need to use explicit typecasting
        numerator=fact[n];
        denominator=(fact[r]*fact[n-r])%MOD;
        mmi_denominator=findMMI_fermat(denominator,MOD);
        ans=(numerator*mmi_denominator)%MOD;
        printf(&quot;%lld\n&quot;,ans);
    }
    return 0;
}
Advertisements

21 comments

    1. Here, the actual divisor and dividend are not 3 and 14.
      In modulo arithmetic, when we write (14/3)%M we actually mean to say
      (a/b)%M,
      where a is some number such that a%M == 14, b is some number such that b%M==3 and a%b==0

      Eg. lets take M=17.
      For this value of M, 14 is equivalent to {14,31,48,65,82,…}
      (because each of these numbers modulo 17 is 14)
      and 3 is equivalent to {3,20,37,54,….}
      (because each of these numbers modulo 17 is 3)

      So, (14/3) % 17 actually means (48/3) % 17, or (320/20) % 17, or (592/37) % 17 each of which is equal to 16 which is the same as (14 * MMI(3,17))%17 or (14 * 6) % 17

  1. i have calculated the nth fibonacci number using dynamic programming and storing the result in the arrays using array addition but question asks me to find the fibonacci modulo 10^9+7, how to do that ,actually i need to find the fibonaaci number corrosponding to gcd of two given numbers; here is my code :

    #include
    #include
    using namespace std;

    int findgcd(int num1,int num2)
    {
    int rem=num2%num1;
    if(rem==0)
    return num1;
    findgcd(rem,num1);

    }
    void fib(int n)
    { int sum1[100000]={0};
    int temp1[100000]={0};
    int temp2[100000]={0};
    /* Declare an array to store fibonacci numbers. */

    int i;

    /* 0th and 1st number of the series are 0 and 1*/

    temp2[0]=0;
    int*prev2=temp2;

    temp1[0]=1;
    int*sum=sum1;
    int*prev1=temp1;

    int j,x;
    for (i = 2; i <= n; i++)
    {
    int te=0;
    j=0;
    while(prev1[j]!=0||prev2[j]!=0||prev2[j+1]!=0||prev1[j+1]!=0||prev2[j+2]!=0)
    { x=prev1[j]+prev2[j];
    if(te==1)
    {prev1[j]=prev1[j]-1;te=0;}//leaned how to add using arrays
    if(x=0;num–)
    {

    printf(“%d”,sum[num]);}

    printf(“\n”);}
    else printf(“1\n”);

    }

    int main ()
    { int test,i;
    scanf(“%d”,&test);
    for(i=1;i<=test;i++)
    {int a,b;

    scanf("%d%d",&a,&b);
    int n=findgcd(a,b);

    fib(n);}

    return 0;
    }

  2. Hi Abhnav!

    Really nice blog!
    In NOTE 1 you have written “But if M is prime then ( ( a % M ) * ( b % M ) ) % M can never be 0 “. Could you please elaborate this point. Because if M=7 (prime) and a = 14 and b = 2 then (a*b)%M results in 0.

    Thanks!

    1. Thanks for pointing it out. That statement in my post is ambiguous.

      The example that you gave is valid in general context. Yes… even with a prime M, the answer can be zero.

      Actually, in my post I was talking in context of programming contests. In contests, the value of M is greater than all the other values provided. That’s why the case you specified can’t happen.

      If the problem at hand is to find (a*b)%M and M is prime, then the answer will never be zero because (a*b) can never be a multiple of M.
      But if M is non prime, say 6, then if a=3 and b=4, we’ll get a zero as answer.
      So, if M is prime it ensures that the answer will never be == 0.

      I hope this clarifies it 🙂

      1. I still don’t get it. Why can’t a*b be multiple of a prime? The use of modulo is done to make a large number small such that the value of the answer becomes less than the M. For M being a prime number, if we don’t want to get a 0 then in N%M, N(2*M).

    2. a*b can be multiple of a prime, but it can’t be multiple of M. Because M is larger than a and b. Eg, for M=29 ( a prime ), consider all possible pairs of (a,b) such that 0<=a,b<=28. None of these pairs are such that a*b is multiple of 29.

  3. This was amazing! I’ve been tearing my hair out because I couldn’t get why I produced primes for a problem, making it unsolvable, thank you!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s