Recursion – Algorithm Study

Recursion is a technique by which a function calls itself until a condition is met.

Introduction

Loops, or repetitive execution based on certain conditions, are inevitable in programs. The usual constructs are while and for loops. Recursion is an entirely different way to deal with such situations, and in many cases an easier one.

Recursion is when a function calls itself in each iteration until a condition is met. Ideally, the data set shrinks in each iteration until it reaches the required condition, after which the recursive function exits.

A typical example of recursion is a factorial function.

How does Recursion work?

A recursive function ideally contains a Base case and a Recursive case.

The Recursive case is where the function calls itself, and it keeps doing so until the Base case is met. Each iteration in the Recursive case moves the control one level deeper.

Once a specific level finishes execution, control is passed back to the previous level. A Recursive function can go several layers deep before the Base condition is met. In short, the Recursive case is a loop in which the function calls itself.

The Base case is required so that the function doesn't keep running in the Recursive loop forever. Once the Base case is met, control moves out of the Recursive case, executes the statements in the Base case (if any), and exits.
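As a minimal sketch of the two cases, consider a hypothetical countdown function (not from the text above; the names are illustrative only):

```python
def countdown(n):
    if n <= 0:                     # Base case: stops the recursion
        return ["Done!"]
    # Recursive case: the function calls itself with a smaller input
    return [n] + countdown(n - 1)

print(countdown(3))  # [3, 2, 1, 'Done!']
```

Each call shrinks `n` by one, so the function is guaranteed to reach the Base case.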

As mentioned in the Introduction, a factorial function can be seen as an example of recursion.

NOTE:

The Base case for a factorial function is when n == 1

Consider n!:

n! can be written as:

n x (n – 1) x (n – 2) x (n – 3) x …. x 1

n! can also be represented as:

 n!      = n * (n - 1)!       ---> [Step 1]
(n - 1)! = (n - 1) * (n - 2)! ---> [Step 2]
(n - 2)! = (n - 2) * (n - 3)! ---> [Step 3]
...
(n - (n - 1))! = 1! = 1       ---> [Base case]

Each level/step is a product of a value and the factorial of the level below it. Hence, Step 1 moves control to Step 2 to get the factorial of the next smaller number, then to Step 3, and so on.

i.e. the control of execution moves as:

[Step 1] -> [Step 2] -> [Step 3] -> ….. [Step n]

As an easier-to-grasp example, 5! would be:

5! = 5 * 4! ---> [Step 1]
4! = 4 * 3! ---> [Step 2]
3! = 3 * 2! ---> [Step 3]
2! = 2 * 1! ---> [Step 4]
1! = 1      ---> [Step 5] / [Base case]

The order of execution will be :

[Step 1] -> [Step 2] -> [Step 3] -> [Step 4] -> [Step 5]

As we know, in Recursion each layer pauses itself and passes the control to the next level. Once it reaches the end, i.e. the Base case, the result is returned back to the previous level, one level at a time, until it reaches where it started off.

In this example, once the control of execution reaches Step 5 (the Base case), control is returned to its previous level, Step 4. This level returns its result back to Step 3, which completes its execution and returns to Step 2, and so on, until it reaches Step 1.

The return control flow would be as:

[Base case / Step 5] -> [Step 4] -> [Step 3] -> [Step 2] -> [Step 1] -> Result.
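This call-and-return flow can be made visible by instrumenting the recursive factorial with print statements (an illustrative sketch; the `depth` parameter is added here only to indent the trace):

```python
def factorial(n, depth=0):
    indent = "  " * depth
    print(f"{indent}factorial({n}) called")             # descending phase
    if n == 1:                                          # Base case (n == 1)
        result = 1
    else:
        result = n * factorial(n - 1, depth + 1)        # Recursive case
    print(f"{indent}factorial({n}) returns {result}")   # unwinding phase
    return result

print(factorial(5))  # 120
```

The "called" lines print on the way down (Step 1 to Step 5), and the "returns" lines print in reverse order as the stack unwinds.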

This can be summed up using an awesome pictorial representation from the book Grokking Algorithms by Aditya Bhargava. Please check the References section for a link to more information about this awesome book.


Figure 1: Recursion, Recursive case and Base case (Copyright Manning Publications, drawn by adit.io)

Code

Example 1:

  • A factorial function in a while loop
def fact(n):
    factorial = 1
    while n > 1:
        factorial = factorial * n
        n = n - 1
    return factorial

print("Factorial of {0} is {1}".format(10, fact(10)))
print("Factorial of {0} is {1}".format(20, fact(20)))
  • The same function above, in a recursive loop
def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)

print("Factorial of {0} is {1}".format(10, factorial(10)))
print("Factorial of {0} is {1}".format(20, factorial(20)))

Example 2:

  • A function to sum numbers in a normal for loop.
def my_sum(my_list):
    num = 0
    for i in my_list:
        num += i
    return num

print(my_sum([10, 23, 14, 12, 11, 94, 20]))
  • The same function to add numbers, in a recursive loop
def my_sum(my_list):
    if my_list == []:
        return 0
    else:
        return my_list[0] + my_sum(my_list[1:])

print(my_sum([10, 23, 14, 12, 11, 94, 20]))

Code explanation

Both Example 1 and Example 2 are shown as an iterative function as well as a recursive function.

The iterative functions use a loop to walk over the entire data set (under the hood, Python's for loop calls next() on the iterator returned by the list's __iter__() method). The recursive functions call themselves until they reach the Base case, then return the result back up.

Observations:

While a recursive function does not necessarily give you an edge in performance, it is often easier to understand and the code is cleaner.

Recursion has a disadvantage, though, for large data sets. Each recursive call is pushed onto the call stack until the Base case is reached. Once the Base case is met, the call stack unwinds back to where it started, completing each of the pending levels on the way. The examples above showed a sum function and a factorial function; on large data sets, this leads to a large call stack, which in turn takes a lot of memory.
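Python makes this cost concrete: each pending call occupies a stack frame, and CPython caps the recursion depth (about 1000 by default), so a recursion that goes too deep raises RecursionError. A quick sketch, reusing the recursive sum from Example 2:

```python
import sys

def my_sum(my_list):
    if my_list == []:
        return 0
    return my_list[0] + my_sum(my_list[1:])

# CPython's default recursion depth limit (usually 1000)
print(sys.getrecursionlimit())

try:
    my_sum(list(range(100000)))   # ~100000 nested calls: far too deep
except RecursionError as err:
    print("RecursionError:", err)
```

An iterative sum has no such limit, which is one reason to prefer iteration for very large inputs.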

References:

  1. Grokking Algorithms
  2. Data Structures and Algorithms in Python

 

Selection Sort – Algorithm Study

Selection Sort is a sorting algorithm used to sort a data set in either increasing or decreasing order.

It goes through all the elements one by one, and hence it's not a very efficient algorithm for large data sets.

How does Selection sort work?

Selection sort starts with an unsorted data set. With each iteration, it builds up a sub dataset of the sorted data.

By the end of the sorting process, the sub dataset contains all the elements in sorted order.

1. Iterate through the data set one element at a time.
2. Find the biggest element in the data set and append it to the sorted sub dataset.
3. Reduce the sample space by the element just found. The new data set becomes `n – 1` elements compared to the previous iteration.
4. Start the iteration over again on the reduced sample space.
5. Continue till we have a fully sorted data set, either increasing or decreasing.

How does the data sample change in each iteration?

Consider the data set [10, 4, 9, 3, 6, 19, 8]

[10, 4, 9, 3, 6, 19, 8]      – Data set
[10, 4, 9, 3, 6, 8] – [19] – After Iteration 1
[4, 9, 3, 6, 8] – [10, 19] – After Iteration 2
[4, 3, 6, 8] – [9, 10, 19] – After Iteration 3
[4, 3, 6] – [8, 9, 10, 19] – After Iteration 4
[4, 3] – [6, 8, 9, 10, 19] – After Iteration 5
[3] – [4, 6, 8, 9, 10, 19] – After Iteration 6
[3, 4, 6, 8, 9, 10, 19]      – After Iteration 7 – Sorted data set
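The iteration-by-iteration walkthrough above can be sketched directly: repeatedly pull the biggest remaining element out of the data set and put it at the front of the sorted sub dataset (an illustrative version; the Code section below selects the smallest element instead, which yields the same ascending result):

```python
data = [10, 4, 9, 3, 6, 19, 8]
sorted_part = []

while data:
    biggest = max(data)               # find the biggest remaining element
    data.remove(biggest)              # reduce the sample space by that element
    sorted_part.insert(0, biggest)    # it goes to the front of the sorted sub dataset
    print(data, "-", sorted_part)     # mirrors the "After Iteration N" lines above

print(sorted_part)  # [3, 4, 6, 8, 9, 10, 19]
```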

Let’s check what the Selection Sort algorithm has to go through in each iteration.

Dataset – [10, 4, 9, 3, 6, 19, 8]
Iter 1 – [10, 4, 9, 3, 6, 8]
Iter 2 – [4, 9, 3, 6, 8]
Iter 3 – [4, 3, 6, 8]
Iter 4 – [4, 3, 6]
Iter 5 – [4, 3]
Iter 6 – [3]

Sorted Dataset – [3, 4, 6, 8, 9, 10, 19]

Performance / Time Complexity

Selection Sort has to go through all the elements in the data set, no matter what.

Hence, the Worst case, Best case and Average Time Complexity would be O(n^2).

Since `Selection Sort` takes in `n` elements while starting, and goes through the data set `n` times (each step reducing the data set size by 1 member), the iterations would be:

n + [ (n – 1) + (n – 2) + (n – 3) + (n – 4) + … + 2 + 1 ]

We are more interested in the worst-case scenario. In a very large data set, the difference between `n`, `n – 1`, `n – 2`, etc. won't matter.

Hence we can re-write the above iterations as:

n + [n + n + n + n ….. n]

Or also as:

n * n = (n^2)
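The exact sum n + (n - 1) + ... + 1 is n(n + 1)/2, which grows proportionally to n²; a quick count confirms it (`count_passes` is a hypothetical helper written for this illustration):

```python
def count_passes(n):
    """Count the total elements examined across all Selection Sort passes."""
    looked_at = 0
    remaining = n
    while remaining > 0:
        looked_at += remaining   # one pass scans every remaining element
        remaining -= 1           # the data set shrinks by one each pass
    return looked_at

for n in [10, 100, 1000]:
    # the count equals n * (n + 1) / 2, i.e. roughly half of n * n
    print(n, count_passes(n), n * n)
```

Dropping the constant factor of 1/2, as Big O notation does, leaves O(n²).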

Code

def find_smallest(my_list):
    smallest = my_list[0]
    smallest_index = 0
    for i in range(1, len(my_list)):
        if my_list[i] < smallest:
            smallest = my_list[i]
            smallest_index = i
    return smallest_index

def selection_sort(my_list):
    # Picks the smallest element each pass, so the result is in increasing order
    new_list = []
    for i in range(len(my_list)):
        smallest = find_smallest(my_list)
        new_list.append(my_list.pop(smallest))
    return new_list

print(selection_sort([10, 4, 9, 3, 6, 19, 8]))

Observations:

1. Selection Sort is an algorithm to sort a data set, but it is not particularly fast.
2. It takes up to `n` comparisons in each pass to find the biggest (or smallest) remaining element.
3. The next iteration runs on a data set of `n – 1` elements compared to the previous iteration.
4. For `n` elements in a sample space, Selection Sort takes roughly `n x n` iterations to sort the data set.

References:

1. https://en.wikipedia.org/wiki/Selection_sort
2. http://bigocheatsheet.com
3. https://github.com/egonschiele/grokking_algorithms

 

Code complexity – The Big O notation [O(n)]

Efficiency, or Complexity, is how well you're using your resources to get your code to run.

Efficiency can be calculated on the basis of how much time your code takes to run/execute.

Understanding the efficiency of your code can help you reduce its complexity, improving the runtime further. Getting the same job done in less time and with fewer system resources is always good.

Once you find the efficiency of your program, you can start to find ways for:

  • Reducing the complexity (or increasing the efficiency), which will reduce the program's run time and hence free up computer resources at a proportional rate.
  • Maintaining a constant or reduced run time for a growing data set, which will help your program fare well when the input size grows.

In Computer Science, the `Big O` notation is used to indicate the efficiency or complexity of a program. It is written as ‘O(n)’, where ‘n’ denotes the size of the input.

Some examples:

O(n)
O(n³)
O(n log(n))
O(√n)
O(1)

Efficiency can be measured on the best, average, and worst cases. For example, consider finding a specific alphabet from a list of all the alphabets jumbled up.

  • The worst case is your program running through all 26 iterations and finding the alphabet as the last value.
  • The best case is when your program finds it in the first iteration itself.
  • The average case is when your program finds the alphabet in around 13 iterations (i.e. half of the worst case).
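A hypothetical linear search over the 26 lowercase letters makes the three cases concrete (`find_letter` and its iteration count are illustrative, not from the text):

```python
import string

def find_letter(target):
    """Return how many iterations a linear scan needs to find target."""
    for count, letter in enumerate(string.ascii_lowercase, start=1):
        if letter == target:
            return count
    return None  # target is not in the list

print(find_letter('a'))  # best case: 1 iteration
print(find_letter('z'))  # worst case: 26 iterations
print(find_letter('m'))  # middle of the list: 13 iterations
```

The worst case (26 iterations) is what O(n) describes for a linear search over n items.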

The study of Data Structures and Algorithms aims at understanding program complexity, and reducing it as much as possible.

Algorithms are a series of well-defined steps you follow to solve a problem, whereas Data Structures are specific ways in which you lay out your data. Applying well-known algorithms in a program can help reduce its run time.

More on Time complexity and the Big O notation can be read at:

https://en.wikipedia.org/wiki/Time_complexity
https://en.wikipedia.org/wiki/Big_O_notation