I'm switching over to Blogger to host my blog. I'm doing this mostly for historical preservation, as I post almost everything on Google+.

If you need to move your WordPress blog to Blogger, you should check out this very useful tool: http://wordpress2blogger.appspot.com/

# Cory Hardman's Blog

## Saturday, April 21, 2012

## Monday, August 29, 2011

### Google+ Invite

If anyone is still looking for a Google+ invite, I have quite a few left. Feel free to use this link to get in:

Google+ Invite

Enjoy!

## Wednesday, June 22, 2011

### Demajoring the Major Version Number

PCWorld recently ran an article stating that Firefox's new release cycle will fail. The rationale stems from the idea that version numbers actually need to mean something, and unfortunately the business world strongly feels this is the case: IT organizations typically allow only minor updates to be pushed out to users, with major version changes requiring approval.

There is a shift underway in the way software is versioned. Chrome was one of the first major pieces of software to switch to the new model. With the move to agile development, it was inevitable that software updates would become more frequent. Once updates become the norm instead of the exception, does a major version number still make sense?

Businesses will need to adapt to the changing world. Technology is always changing, and for a magazine focused on technology to suggest that it shouldn't is a bit ridiculous. Someone needs to tell them not to be a dodo bird and to evolve.

### Repo Command Missing

I was recently following along with a tutorial to download the ChromeOS source when it told me to use the *repo* command to sync the source to my computer. I couldn't find the *repo* command anywhere on my machine; it turns out that *repo* is a tool available on Google Code that facilitates working with Git. You can download it with the following commands:

```shell
curl https://android.git.kernel.org/repo > ~/bin/repo
chmod a+x ~/bin/repo
```

Enjoy!

## Thursday, March 31, 2011

### My Quest: Sort 100 Million Integers With OpenMP

Over the past couple of weeks I have been teaching myself OpenMP. I wanted to see how much faster I could make my previous Quick Sort with Insertion Sort implementation. What is so great about OpenMP is that it lets you transform single-threaded code into multi-threaded code with just a few lines. Here is my implementation:

```cpp
#include <omp.h>
#include <vector>

using std::vector;

// Note: Sort_h, Partition, and InsertionSort all treat `end` as an
// inclusive iterator (Sort passes end - 1).
void QuickSortImpl::Sort(vector<int>::iterator start, vector<int>::iterator end)
{
    int numProcs = omp_get_num_procs();
    omp_set_num_threads(numProcs);
    #pragma omp parallel
    {
        // One thread kicks off the recursion; tasks fan out to the team.
        #pragma omp single nowait
        Sort_h(start, end - 1);
    }
}

void QuickSortImpl::Sort_h(vector<int>::iterator start, vector<int>::iterator end)
{
    if (start < end)
    {
        if (end - start < 20)
        {
            // Small ranges are faster with Insertion Sort.
            InsertionSort(start, end);
        }
        else
        {
            int q = Partition(start, end);
            // Hand one half to another thread; recurse on the other ourselves.
            #pragma omp task
            {
                Sort_h(start, start + (q - 1));
            }
            Sort_h(start + (q + 1), end);
        }
    }
}

int QuickSortImpl::Partition(vector<int>::iterator start, vector<int>::iterator end)
{
    // Use the middle element as the pivot and move it to the end.
    int partitionIndex = (end - start) / 2;
    int pivotValue = start[partitionIndex];
    int tmp = start[partitionIndex];
    start[partitionIndex] = *end;
    *end = tmp;

    int swapIndex = 0;
    for (int i = 0; i < end - start; i++)
    {
        if (start[i] <= pivotValue)
        {
            tmp = start[swapIndex];
            start[swapIndex] = start[i];
            start[i] = tmp;
            swapIndex++;
        }
    }
    // Move the pivot into its final position.
    tmp = start[swapIndex];
    start[swapIndex] = *end;
    *end = tmp;
    return swapIndex;
}

void QuickSortImpl::InsertionSort(vector<int>::iterator start, vector<int>::iterator end)
{
    for (int j = 1; j <= end - start; j++)
    {
        int key = start[j];
        int i = j - 1;
        while (i >= 0 && start[i] > key)
        {
            start[i + 1] = start[i];
            i = i - 1;
        }
        start[i + 1] = key;
    }
}
```

I chose to task off only one of my recursive calls in the Sort_h routine, because there is a small overhead associated with sending a task to the queue of waiting threads. The actual improvement to performance was fairly minimal, about 0.2-0.3 seconds. This trick would have been more noticeable if there were more iterations of Sort_h.

This implementation sorted my random list of integers in roughly 5.5 seconds, roughly a factor of three speedup over the single-threaded version. It also better utilizes my desktop's CPUs, spiking all four cores for a few seconds to complete the operation.

Besides the impressive speedup, what strikes me as most interesting is how little code changed from my previous implementation. Literally four additional lines of code were required to turn a single-threaded Quick Sort into a parallel Quick Sort that scales pretty well with the number of CPUs you have (on most desktops). This just goes to show the power of OpenMP. I highly recommend that anyone needing to write multi-threaded C++ code look at OpenMP.

Download: OpenMPQuickSort

## Wednesday, March 30, 2011

### My Quest: Sort 100 Million Integers

I have decided to do some research into sorting algorithms to see how fast I can sort a random list of 100 million integers. I picked 100 million because it is big enough that the GNU C++ implementation of sort took over 15 seconds on my desktop, yet at four bytes per integer it is only about 400 MB, which fits easily in RAM. I also believe that most techniques I develop for an array of this size would apply to larger arrays.

As I started doing research, I realized there is actually a lot more to this than figuring out which sorting algorithm is fastest. When sorting a list of this size, things like ease of parallelism and locality become big factors in determining speed. Over the coming weeks I will post updates to this quest with improvements to my algorithm. I am going to explore different types of sorting algorithms and methods to parallelize them. My parallelism adventures will likely stick to OpenMP and CUDA.

My method of timing will not be perfect. Times will include the time to generate a random array of 100 million integers, plus the time to sort them, plus the time to check the result. I will run all tests on my desktop, a Core i5 (4 cores) with 8 GB of RAM. I will use the `time` command to measure each run and report the real time. As this is just me experimenting, there is no requirement for precise measurements of the algorithms themselves; relative comparisons will be enough to get a sense of one technique's run time over another's.

To kick it off, I have implemented a Quick Sort that falls back to Insertion Sort once the sublist being sorted is small enough. I apologize if there are any mistakes in the code below; I had to fight with the editor to get it to format correctly. Please see the attached files for the actual code and header files.

```cpp
#include <vector>

using std::vector;

// Note: Sort_h, Partition, and InsertionSort all treat `end` as an
// inclusive iterator (Sort passes end - 1).
void QuickSortImpl::Sort(vector<int>::iterator start, vector<int>::iterator end)
{
    Sort_h(start, end - 1);
}

void QuickSortImpl::Sort_h(vector<int>::iterator start, vector<int>::iterator end)
{
    if (start < end)
    {
        if (end - start < 15)
        {
            // Small ranges are faster with Insertion Sort.
            InsertionSort(start, end);
        }
        else
        {
            int q = Partition(start, end);
            Sort_h(start, start + (q - 1));
            Sort_h(start + (q + 1), end);
        }
    }
}

int QuickSortImpl::Partition(vector<int>::iterator start, vector<int>::iterator end)
{
    // Use the middle element as the pivot and move it to the end.
    int partitionIndex = (end - start) / 2;
    int pivotValue = start[partitionIndex];
    int tmp = start[partitionIndex];
    start[partitionIndex] = *end;
    *end = tmp;

    int swapIndex = 0;
    for (int i = 0; i < end - start; i++)
    {
        if (start[i] <= pivotValue)
        {
            tmp = start[swapIndex];
            start[swapIndex] = start[i];
            start[i] = tmp;
            swapIndex++;
        }
    }
    // Move the pivot into its final position.
    tmp = start[swapIndex];
    start[swapIndex] = *end;
    *end = tmp;
    return swapIndex;
}

void QuickSortImpl::InsertionSort(vector<int>::iterator start, vector<int>::iterator end)
{
    for (int j = 1; j <= end - start; j++)
    {
        int key = start[j];
        int i = j - 1;
        while (i >= 0 && start[i] > key)
        {
            start[i + 1] = start[i];
            i = i - 1;
        }
        start[i + 1] = key;
    }
}
```

These are pretty common implementations of Quick Sort and Insertion Sort, with a few notable details. The Partition method uses the middle element instead of a random index as the pivot. Because my list is already randomized, calling rand() again would just be overhead; skipping it actually saves a few seconds. However, I would still recommend a random pivot for ordinary implementations of Quick Sort. I also decided that lists of size 15 or smaller are best handled by Insertion Sort instead of Quick Sort. This threshold was experimentally derived; it appeared to give the best results.

This implementation performed rather well on my 100 million integers, taking on average 15 seconds to complete. Let's see how much better we can make this; more to come.

Download File: QuickSort.zip

## Wednesday, March 23, 2011

### Finding Median in (Almost) Linear Time

I sometimes find myself needing to find the median of a set of numbers. What I used to do was sort the list of numbers and then take the value at the middle index, something like this (in Java):

```java
public static int FindMedian(List<Integer> nums)
{
    Collections.sort(nums);
    if (nums.size() % 2 == 0)
    {
        // Even count: average the two middle values.
        return (nums.get(nums.size() / 2 - 1) + nums.get(nums.size() / 2)) / 2;
    }
    return nums.get(nums.size() / 2);
}
```

This solution is on the order of O(n lg n), where n is the number of elements. A faster method is to use a modified Quick Sort: after each partition we learn how many numbers are less than or equal to the pivot, so we can recurse only into the side of the list that contains the element we are looking for. Here is my implementation of QuickSelect in Java:

```java
import java.util.List;
import java.util.Random;

public class QuickSelect {

    // Returns the kth smallest element (k is 1-based); expected O(n).
    public static <T extends Comparable<T>> T quickSelect(List<T> values, int k)
    {
        int left = 0;
        int right = values.size() - 1;
        Random rand = new Random();
        while (true)
        {
            int partitionIndex = rand.nextInt(right - left + 1) + left;
            int newIndex = partition(values, left, right, partitionIndex);
            int q = newIndex - left + 1;   // rank of the pivot within [left, right]
            if (k == q)
            {
                return values.get(newIndex);
            }
            else if (k < q)
            {
                right = newIndex - 1;      // answer is left of the pivot
            }
            else
            {
                k -= q;                    // discard the pivot and its left side
                left = newIndex + 1;
            }
        }
    }

    private static <T extends Comparable<T>> int partition(List<T> values, int left, int right, int partitionIndex)
    {
        T partitionValue = values.get(partitionIndex);
        // Move the pivot out of the way, to the end of the range.
        T temp = values.get(partitionIndex);
        values.set(partitionIndex, values.get(right));
        values.set(right, temp);

        int newIndex = left;
        for (int i = left; i < right; i++)
        {
            if (values.get(i).compareTo(partitionValue) < 0)
            {
                temp = values.get(i);
                values.set(i, values.get(newIndex));
                values.set(newIndex, temp);
                newIndex++;
            }
        }
        // Put the pivot into its final position.
        temp = values.get(right);
        values.set(right, values.get(newIndex));
        values.set(newIndex, temp);
        return newIndex;
    }
}
```

From here, implementing FindMedian should be pretty straightforward. The nice thing about this algorithm is that it can find the kth smallest number in a set. On average it runs in O(n), with a worst case of O(n^2). Enjoy!
