I think most of you already know about binary search. It is an algorithm that is used to search an element in an array. Its complexity is O(log n) as opposed to O(n) for linear search, so it’s a much better way to search an array. This time I’ll first revise binary search and then I’ll show how simple modifications in the basic binary search algorithm can produce interesting variants.

**1: BINARY SEARCH**

Assume that we have got an array of n integers, and we have to run many queries on the array. In each query, we are given a number ( named *targetnum* ) and we have to tell if the number is present in the array or not. We’ll use binary search to solve this.

The first step of this algorithm is to sort the array. At first look, it may seem that sorting will consume a lot of time and will contribute to overhead. But, remember that we have to run MANY queries on the array. This means that sorting once will give advantage for each of the many queries. So, ultimately the overhead contributed by sorting is much less than the advantage it gives.

Now, when we start searching, *targetnum* can be present anywhere in the sorted array. We take a look at the integer that is present in the middle of the array. If this middle integer is * targetnum* itself, then our search terminates. If this middle integer is greater that *targetnum*, then we can say that *targetnum* is obviously present somewhere in the first half of the array (i.e. in [first,mid) ) (this is because the array is sorted). And if the middle integer is less than * targetnum*, then *targetnum* is present somewhere in the second half of the array (i.e. in (mid,last] ).

The potential space ( i.e. the space where *targetnum* can be present ) is, therefore, reduced to half in every step.

We repeat this process until the size of the potential space becomes 1. And if we still have not yet found the *targetnum*, we can conclude that it is not present in the array.

In form of C code, this algorithm will look something like:

binarysearch() is a function that takes the array, the size of the array and the integer to be searched as arguments.

If there is even a single instance of *targetnum* in the array then 1 will be returned, otherwise 0 will be returned.

int binarysearch(int *a,int n,int targetnum) { int start=0, end=n-1, mid; while(start<=end) { mid=(start+end)/2; if(a[mid]==targetnum) // if the middle element is the required number, then search terminates { break; } else if(a[mid]>targetnum) { end=mid-1; // restrict search to first half } else if(a[mid]<targetnum) { start=mid+1; // restrict search to second half } } if(start<=end) return 1; else return 0; }

This algorithm halves the potential space in every iteration. So, it takes a maximum of log2(n) iterations to complete the search of one element, as opposed to n iterations per query for linear search.

**2,3: LOWER BOUND AND UPPER BOUND**

A variant of binary search can be used to find the lower bound and upper bound of an element (say, *targetnum* ) in a sorted array.

**Lower Bound:** The lower bound of *targetnum* is the first element in the sorted array that is not less than *targetnum*.

**Upper Bound:** The upper bound of *targetnum* is the first element in the sorted array that is greater than *targetnum*.

Explanation: It is possible that *targetnum* may have no instance, a single instance or multiple instances in the array.

If there is only one instance of *targetnum* present in the array, lower bound will be the position of that instance, and upper bound will be the position of that instance + 1.

If there are multiple instances of *targetnum* present in the array, lower bound will be the position of the first instance, and upper bound will be the position of the last instance + 1.

If there is no instance of *targetnum* in the array, lower bound and upper bound both will be the position of first integer greater than *targetnum*.

So, if the sorted array is {1,4,5,5,5,6,9,9,10}, then the lower bound and upperbound of each integer will be:

Number |
Lower Bound |
Upper Bound |

0 | 0 | 0 |

1 | 0 | 1 |

2 | 1 | 1 |

3 | 1 | 1 |

4 | 1 | 2 |

5 | 2 | 5 |

6 | 5 | 6 |

7 | 6 | 6 |

8 | 6 | 6 |

9 | 6 | 8 |

10 | 8 | 9 |

11 | 9 | 9 |

(Note that in case of 11, the lower bound and upper bound are 9, i.e. last index of array + 1. This is so because 11 is greater than all the elements of array)

The code for finding lower bound:

lowerbound() is a function that takes the array, the size of the array and the element whose lower bound has to be found as arguments and returns the lower bound of the given element.

int lowerbound(int *a,int n,int targetnum) { int first=0,last=n-1,mid; while(first<last) { mid=(first+last)/2; if(a[mid]==targetnum) { last=mid; } else if(a[mid]>targetnum) { last=mid-1; } else if(a[mid]<targetnum) { first=mid+1; } } if(a[first]>=targetnum) return first; return first+1; }

The main difference between binary search and this algorithm is when the condition **a[mid]==targetsum** is true. See line 9 of the code. In the binary search algorithm, if **a[mid]==targetnum** is true, the search ends (as the required number was found). But here, we are instead interested in the first instance of *targetnum*, so we continue our search.

If there is an instance of *targetnum* present at position *mid*, then it means that the first instance of *targetnum* must be in the interval [first,mid].

After the while loop ends, the required position is either the value in **first** or **first**+1.

The code for finding upper bound is:

upperbound() is a function that takes the array, the size of the array and the element whose upper bound has to be found as arguments and returns the upper bound of that element.

int upperbound(int *a,int n,int targetnum) { int first=0,last=n-1,mid; while(first<last) { mid=(first+last)/2; if(a[mid]==targetnum) { first=mid+1; } else if(a[mid]>targetnum) { last=mid-1; } else if(a[mid]<targetnum) { first=mid+1; } } if(last>=0 && a[last]>targetnum) return last; return last+1; }

The main difference between this algorithm and binary search is when the condition **a[mid]==targetnum** is true. Here, we are interested in the last occurrence of *targetnum*.

So, if one instance of *targetnum* is present at mid, then the last instance must be present somewhere in the interval [mid,last].

After the while loop ends, the required position is either the value in **last** or **last**+1.

**APPLICATION:**

One of the possible interesting applications of upperbound() and lowerbound() is:

If we are told to find the number of occurrences of a given integer (say, *targetnum*) in an array, we can use lowerbound() and upperbound() to find the answer.

The answer will be **upperbound(a,n,targetnum)-lowerbound(a,n,targetnum)**.

Convince yourself that this query is correct by consulting the definitions of upper bound and lower bound given earlier in this post.

This query will have time complexity O(log n) as both the functions take about log n iterations each to find the answer.

**DEEPER INSIGHT:**

1. For the ones who observed line 9 of upperbound() closely: There are two reasons for which I used first= mid+1, instead of first = mid.

Firstly, using first=mid will lead to an infinite loop in some cases (eg. when first=1 and last=2).

Secondly, we are interested in (the position of last instance of targetnum) **+ 1**, so its alright if *targetnum* itself is not present in the active range.

**4: **

There is another type of binary search; one that is not done on any array. There are situations where you need to optimize a certain parameter. You know the minimum and maximum value the parameter can take but there seems to be no other way than iterating over the possible values to find the optimal value. In certain cases, brute force iteration can be avoided and binary search can be used to solve the problem. One recent example of this type is the “PROBLEM A: ARROWS AND QUIVERS” question in ACM ICPC2013 Amritapuri site.

Discussing this topic will make this post complicated and much too lengthy. I’ll write about it in my future posts.

ACM ICPC 2013 Amritapuri questions

Solution to problem A

**QUESTIONS ON BINARY SEARCH:**

Am I A Fibonacci Number (Simple)

Codeforces:Eugeny and Play List (Easy)

Codeforces:Primes on Interval (Medium)

If you feel that some point wasn’t explained properly, please comment below and I’ll try to fix it.

Hey very nice tutorial

Ithink there is small problem in binarysearch method

binarysearch will fail on below input

a = {18};

p = 18 ;

// n =0 , start =0 , end = 1-1=0; while(start<end) fails in 1st attempt , if(start<end) fails so return 0 ;

expected output should be 1

but it will return 0

I think you can fix this error by changing condition in while and last if statement to "start<=end"

Oh, sorry about that.

It’s fixed now; thanks 🙂

welcome 🙂

one more minor modification requires in upperbound()

in last if condition it should be

if(last>=0 && a[last]>targetnum) return last;

else return last+1;

because sometime last can have -1 value after while loop completed

like a = {10,12} p = 8;

Nice catch…

That would’ve caused some really frustrating segmentation faults.