## Sunday, February 25, 2007

### Insertion Sort

Note: this is a repost from an old weblog.

Insertion Sort is another O(n2) sorting algorithm (just like Bubble Sort was) but is a little bit faster. Now, ideally, when trying to optimize by picking a new algorithm - you don't want to go from one slow algorithm to another, but I figure while I'm explaining sorting techniques - I might as well explain a bunch of them. Anyways... it turns out that insertion sort works out to be faster than Bubble Sort in the general case.

The way Insertion Sort works, is you move one item from your input dataset into an output dataset one item at a time. Take an item from the input and place it in the output at the proper position such that the output dataset is always in sorted order. Allow me to demonstrate:

Given: [5 8 2 9 6 3 7 1 0 4]

That's our input dataset. Our output dataset is so far empty. The first thing we do, is to take the first item of our input dataset (or any item, really, but we might as well grab them out of the input dataset in order, right?) and place it in the output dataset. Since the output dataset will only have 1 item this first round, there's nothing to sort. The result is:

Input: [8 2 9 6 3 7 1 0 4] Output: [5]

Now take the next input item and place it in the output, but in the proper sorted order:

Input: [2 9 6 3 7 1 0 4] Output: [5 8]

Lather. Rinse. Repeat.

Input: [9 6 3 7 1 0 4] Output: [2 5 8]
Input: [6 3 7 1 0 4] Output: [2 5 8 9]
Input: [3 7 1 0 4] Output: [2 5 6 8 9]
Input: [7 1 0 4] Output: [2 3 5 6 8 9]
Input: [1 0 4] Output: [2 3 5 6 7 8 9]
Input: [0 4] Output: [1 2 3 5 6 7 8 9]
Input: [4] Output: [0 1 2 3 5 6 7 8 9]
Input: [] Output: [0 1 2 3 4 5 6 7 8 9]

Tada! We have a sorted dataset. This is the basis of all insertion sort variants, the difference between the various insertion sorts (I only know of 2, but wouldn't be surprised if there were more) is the way in which we find the ideal place to insert the new item from the input into the output.

Before I get into the different variants, however, let me address one of the concerns you likely have thus far: if we're sorting a large-ish array (note: if we are sorting a linked list, then this isn't an issue), we don't want to have two arrays because that doubles the amount of memory we're using!

Indeed, but if you'll notice: as the output size increases, the input size decreases - together, both the input and output sizes total the same number of elements and if we iterate through the input elements in order, then we can share the same array as both the input and output arrays. Let me re-illustrate:

Output | Input
[][5 8 2 9 6 3 7 1 0 4]
[5 ][8 2 9 6 3 7 1 0 4]
[5 8 ][2 9 6 3 7 1 0 4]
[2 5 8 ][9 6 3 7 1 0 4]
...

See how we can cheat now? Visualization is the key to solving so many problems. Don't be afraid to take a pencil and paper and "draw" the problem - many times the solution will present itself.

Linear Insertion Sort

Generally when someone refers to Insertion Sort, this is the variant that they are talking about. As I was saying above, the difference between the variants is how they find the proper position in which to insert the new item into the output. As the name suggests, Linear insertion sort uses a linear search in order to find this position. Basically, this comes down to scanning through the output array starting at the first position and comparing the values: if the new item is larger than the first output value, move to the next output item and so on until the new item's value is less than the item we are comparing against in the output - once we find that, we have found the position to insert the new item.

Note: we could actually get away with using <= to minimize the number of comparisons we have to do, but then if we have multiple items with the same values, they won't remain in the same order they were in in the input, but if that doesn't bother you, then definitely use <=. Sorts that guarantee that items in the output with the same comparison values are in the same order as they were in the input are called "stable sorts". This is usually a pretty desirable trait (unless performance is more important). The lucky thing for us, though, is that in this particular case, keeping duplicate items in the same order we found them in the input, would force us to work backwards starting at the end of the output array rather than working forwards starting at the beginning of the output array which just so happens to allows us a more promising optimization. Pretty crafty, eh? I thought so...

One thing I should remind you about is that when you insert an item into the output, you'll need to shift all the items after that position one place to the right.

Okay, on to the implementation...

The first optimization you should notice is that after the first "loop", the first item of the input is always the first item of the output, so we might as well start with the second item of the input array. So with that, go ahead and implement this algorithm - you should get something similar to this:

```void
InsertionSort (int a[], int n)
{
int i, j = 0;
int key;

for (i = 1; i < n; j = i, i++) {
key = a[i];
while (j >= 0 && a[j] > key) {
a[j + 1] = a[j];
j--;
}
a[j + 1] = key;
}
}
```

You'll notice that I took the "work backwards starting at the end of the output array" approach that I noted above. As I started to mention, this approach has the added benefit of allowing us to shift the output items as we go which means we get to shift fewer items per loop on average than if we had worked in the forward direction, resulting in a performance gain (shifting a bunch of items is more expensive than an integer comparison).

If we feed this into our sort program and have main() call this new function rather than the BubbleSort() implementation we wrote yesterday, we find that we have a sort that is significantly faster - down to 36 seconds from the optimized BubbleSort()'s 51 seconds. Not bad...

See any obvious improvements we can make that might result in a significantly faster implementation of our InsertionSort() implementation? I don't see anything I would consider major, but we might try using memmove() instead of manually moving items one space to the right. We'll get the most bang for the buck if we wait until we find our optimal insertion point before moving anything around, so:

```void
InsertionSort (int a[], int n)
{
int i, j = 0;
int key;

for (i = 1; i < n; j = i, i++) {
key = a[i];
while (j >= 0 && a[j] > key)
j--;
memmove (a + (j + 2), a + (j + 1), sizeof (int) * ((i - 1) - j));
a[j + 1] = key;
}
}
```

Okay, so with that very simple adjustment, we got our total sort time from 36 seconds down to 26.5 seconds. Nothing to scoff at, surely, but at the same time not as drastic an improvement as changing algorithms had been (see why algorithms are so important for optimizations?).

Now just imagine if we used an algorithm that promised better than O(n2) performance!

Before we jump to something better than O(n2) though, I still have one more trick up my sleeve to improve our insertion sort algorithm (not implementation this time, but rather: algorithm). I dub thee: Binary Insertion Sort.