I think we are ready to discuss Backtracking now. Last time I posted about recursion; I hope it’ll help you with this topic.

When I told you about Bit Masking, I said that it is a brute force approach to solve the Subset Sum problem. And better algorithms than Bit Masking exist. Backtracking is one of those methods.

In my post about Bitmasking I discussed the Subset Sum problem. Subset sum problem is an NP complete problem. In a nutshell, NP complete is a set of computational problems for which no efficient solution that will give a reasonably good run time for very large test cases has yet been found. The complexity of Bitmasking is O(2^n) and it becomes useless at n>25 due to the dramatic increase in the number of subsets that need to be analysed.

In subset sum problem, we are given a set of positive numbers. We are asked if it is possible to find a subset of this set such that the sum of numbers of the selected subset is exactly m ( a positive number).

Backtracking can be viewed as an attempt to improve the Bitmasking algorithm. Just to remind you, in Bitmasking we analyse all the possible subsets of the given set to find a possible solution subset.

But in backtracking, we will intelligently reject the subsets that we know for sure will not lead to a solution.

For example, suppose we have n=5 and the set is {5,31,3,7,6}. We have to select a subset such that the sum of numbers of the selected subset is 11. Now it is obvious that no subset containing the number 31 can have a sum of 11. So, no subset involving the number 31 will lead to a solution. So, we should not consider those subsets that have 31 in them. We should not waste our time analyzing those subsets. This is the principle of Backtracking. As soon as we come to know that selecting 31 will not lead to a solution, we do not continue analysing subsets with 31 in them.

Below is the code to solve the subset sum problem using backtracking:

*backtrack()* is a function that takes the set of numbers ( *nums[ ]* ), the total number of elements (* n* ), and the sum that we require ( *requiredsum* ).

It returns 1 if it is possible to obtain a sum of exactly *requiredsum*, and 0 if it is not possible to do so.

int backtrack(int nums[],int at,int n,int sumrequired) { if(sumrequired==0) return 1; if(sumrequired<0) return 0; if(at==n) return 0; if(backtrack(nums,at+1,n,sumrequired-nums[at])==1) return 1; if(backtrack(nums,at+1,n,sumrequired)==1) return 1; return 0; }

I mentioned in my post about recursion that programs implemented using recursion can be incredibly short and incredibly difficult to understand. This is an example of that!

Following are the important points about this algorithm:

1. The *at* variable tells which number’s fate we are deciding. If in a recursive instance *at* is i, it means that we are currently deciding the fate of the ith number.

2. If the *sumrequired* variable is 0, it means that we do not need to select any more numbers, so we return the success code (which is 1).

3. A negative sum of numbers can never be achieved because all numbers can have only positive values; so if sumrequired is negative, we return the failure code (which is 0).

4. The recursive call at line 9 represents the case where we select the current number ( *nums[at]* ). When we select the current number, our problem reduces to selecting numbers from the set {*nums[at+1]*, *nums[at+2]*,…, *nums[n-1]*} such that their sum is (** requiredsum – value[at]**) .

5. The recursive call at line 11 represents the case where we don’t select the current number (

*nums[at]*). If we don’t select the current number, our problem reduces to selecting numbers from the set {

*nums[at+1]*,

*nums[at+2]*,…,

*nums[n-1]*} such that their sum is

**itself.**

*requiredsum*I defined in my previous post that recursion is when we decompose a problem into a smaller problem.

See points 4 and 5 carefully. I am reducing my current problem of finding a solution subset from the set {nums[at],nums[at+1],nums[at+2],…,nums[n-1]} to finding a solution subset from the set {*nums[at+1]*, *nums[at+2]*,…, *nums[n-1]*}.

Whenever I find a solution (i.e. when *requiredsum* =0), all recursive calls will terminate one by one and 1 will be returned at each level.

At any time if *requiredsum* becomes less than 0, it means that the numbers that we have selected till now have their sum greater than the **initial** * requiredsum* already . So there’s no point in going furthur. Therefore, we return 0; which denotes that the current choices of elements will not lead to any solution.

In the test case I discussed before (with n=5 and the set of numbers as {5,31,3,7,6} and *requiredsum*=11),

if we select the number 31, the value of *requiredsum* will become < 0 in the next recursive call. And that recursive instance will immediately return 0; ( see point 2 )it will not go on deciding the fate of the rest of the numbers. In this way we simply discarded the subsets including the number 31.

The worst case complexity of this algorithm is same as Bitmasking i.e. O(2^n). But it offers a better execution time as it deliberately and intelligently skips some subsets.

**PROBLEMS:**

PAYING UP (on Codechef)

As it is with all recursive solutions, the code may seem overwhelming at first. I suggest that you look at the order of recursive calls I showed in the last post ( here ) and try to trace the way recursive calls are made in this case.

If my input array of numbers is sorted, how can I make the code more efficient, since if number[at] makes sum < 0, at+1 and all other numbers will make sum nagative

If the input array is sorted in ascending order, you can replace lines 9-12 with the following: