Google+ Invite

If anyone is still looking for a Google+ invite, I still have quite a few invites left. Feel free to use this link to get in:

Google+ Invite

Enjoy!

Demajoring the Major Version Number

PCWorld recently ran an article stating that Firefox's new release cycle will fail. The rational stems from the idea that version numbers actually need to mean something. Unfortunately the business world strongly feels this is the case. Typically, IT organizations only allow minor updates to be pushed out to the users. With major version changes needing approval.

There is an upcoming shift in the way software is versioned. Chrome was one of the first major pieces of software to switch to the new model. With the shift to agile development, it was inevitable for software updates become more frequent.  Once updates become a norm instead of the exception does a major number version number still make sense?

Businesses will need to adapt to the changing world. Technology is always changing and for a magazine focusing on technology to make the suggestion that it shouldn't is a bit ridiculous. Someone needs to tell them not to be a Dodo Bird and to evolve.

Repo Command Missing

I was recently following along to a tutorial to download the ChromeOS source, when it told me to use the command repo to use sync up the source to my computer. Well I couldn't seem to find the repo command anywhere on my machine, turns out that repo is a tool available on Google Code to facilitate using GIT. So you can download the command with the following commands:

curl https://android.git.kernel.org/repo >~/bin/repo
chmod a+x repo

Enjoy!

My Quest: Sort 100 Million Integers With OpenMP

Over the past couple of weeks I have been teaching myself OpenMP. I wonted to see how much faster I could make my previous Quick Sort with Insertion Sort implementation. What is so great about OpenMP is it allows you to transform a single threaded code bit into a multi-threaded one with just a few lines of code. Here is my implementation:

 
void QuickSortImpl::Sort(vector< int >::iterator start, vector< int >::iterator end)
{
    int numProcs = omp_get_num_procs();
    omp_set_num_threads(numProcs);
    #pragma omp parallel
    {
        #pragma omp single nowait
        Sort_h(start, end-1);
    }
}
 
void QuickSortImpl::Sort_h(vector< int >::iterator start, vector< int >::iterator end)
{
    if(start < end)
    {
        if(end - start < 20)
        {
            InsertionSort(start, end);
        }
        else
        {
            int q = Partition(start, end);
            #pragma omp task
            {
                Sort_h(start, start + (q - 1));
            }
            Sort_h(start + (q + 1), end);
        }
    }
}
 
int QuickSortImpl::Partition(vector< int >::iterator start, vector< int >::iterator end)
{
    int partitionIndex =  (end - start)/2;
    int privotValue = start[partitionIndex];
    int tmp = start[partitionIndex];
    start[partitionIndex] = *end;
    *end = tmp;
    int swapIndex = 0;
    for (int i = 0; i < end - start; i++)
    {
        if(start[i] <= privotValue)
        {
            tmp = start[swapIndex];
            start[swapIndex] = start[i];
            start[i] = tmp;
            swapIndex++;
        }
    }
    tmp = start[swapIndex];
    start[swapIndex] = *end;
    *end = tmp;
    return swapIndex;
}
void QuickSortImpl::InsertionSort(vector< int >::iterator start, vector< int >::iterator end)
{
    for(int j = 1; j < = end - start; j++)
    {
        int key = start[j];
        int i = j - 1;
        while(i >= 0 && start[i] > key)
        {
            start[i+1] = start[i];
            i = i - 1;
        }
        start[i+1] = key;
    }
}
 

I chose to only task off one of my recursive calls in the sort_h routine. This is because there is a small overhead associated with sending off a task to the queue of waiting threads to execute. The actual improvement to performance was fairly minimal at about 0.2-0.3 seconds. This trick would have been more noticeable if there was more iterations of sort_h.

This implementation sorted my random list of integers in roughly 5.5 seconds. That means it is roughly a factor of three speed up over the single threaded version. It also better utilizes my desktop's CPUs but spiking all four cores for a few seconds to complete the operation.

Besides the pretty impressive speed up, what strikes me the most interesting is how little code changed from my previous implementation. Literally 4 additional lines of code was required to turn a single threaded Quick Sort into a parallel Quick Sort that scales pretty well with the number of CPU's you have (on most desktops). This just goes to show you the power of OpenMP. I highly recommend anyone needing to write multi-threaded c++ code to look at OpenMP.

Download: OpenMPQuickSort

My Quest: Sort 100 Million Integers

I have decided to do some research into sorting algorithms to see how fast I can sort a random list of 100 million integers. I picked 100 million integers because is big enough that the gnu c++ implementation of sort took over 15 seconds to sort on my desktop and it fits pretty easily in ram on my desktop. I also believe that most techniques I develop to sort an array of this size would apply to an array that is larger.

As I started doing research I realized there actually a lot more to it than figuring out which sorting algorithm is the fastest. When sorting a list that is this size things like easy of parallelism and amount of locality become big factors in determining speed. Over the coming weeks I will be posting updates to this quest with improvements to my algorithm. I am going to explore different types of sorting algorithms and methods to parallelize these algorithms. My parallelism adventures will likely stick to OpenMP and CUDA.

My method of timing will not be perfect. Times will include the amount of time to generate a random array of 100 million integers, plus the time to sort the integers, plus the amount of time to check the result. I will run all tests on my desktop, which is a Core i5 (4 cores) with 8 GB of ram. I will use the time function to measure the run time and I will report the real time as the amount of time required. As this is just me experimenting, there is no requirement to have accurate measures of the actual algorithms. Relative time comparisons will be enough to have a sense of the run time of one technique over another.

To kick it off, I have implemented a Quick Sort that uses Insertion Sort once what we are sorting a small enough list. I apologize if there are any mistakes in the below code. I had to fight with the editor to get this to format correctly. Please see attached files for actual code and header files.

void QuickSortImpl::Sort(vector< int >::iterator start, vector< int >::iterator end)
{
    Sort_h(start, end-1);
}
void QuickSortImpl::Sort_h(vector< int >::iterator start, vector< int >::iterator end)
{
    if(start < end)
    {
        if(end - start > 15)
        {
            InsertionSort(start, end);
        }
        else
        {
            int size = end - start;
            int q = Partition(start, end);
            Sort_h(start, start + (q - 1));
            Sort_h(start + (q + 1), end);
        }
    }
}
int QuickSortImpl::Partition(vector< int >::iterator start, vector< int >::iterator end)
{
    int partitionIndex =  (end - start)/2;
    int privotValue = start[partitionIndex];
    int tmp = start[partitionIndex];
    start[partitionIndex] = *end;
    *end = tmp;
 
    int swapIndex = 0;
    for (int i = 0; i < end - start; i++)
    {
        if(start[i] <= privotValue)
        {
            tmp = start[swapIndex];
            start[swapIndex] = start[i];
            start[i] = tmp;
            swapIndex++;
        }
    }
    tmp = start[swapIndex];
    start[swapIndex] = *end;
    *end = tmp;
    return swapIndex;
}
void QuickSortImpl::InsertionSort(vector< int >::iterator start, vector< int >::iterator end)
{
    for(int j = 1; j < = end - start; j++)
    {
         int key = start[j];
         int i = j - 1;
         while(i >= 0 && start[i] > key)
         {
              start[i+1] = start[i];
              i = i - 1;
         }
         start[i+1] = key;
    }
}

These are pretty common implementations of Quick Sort and Insert Sort. There are however a few notable things. The partition method uses the middle number instead of a random index. This is because my list to be sorted was already randomized, calling rand() again would have just been overhead. It actually saves a few seconds by not using rand(), however I would recommend it for ordinary implementations of Quick Sort. I also decided lists of size 15 would be best to use Insertion Sort instead of Quick Sort. This number was more experimentally derived. It appeared to give the best results.

This implementation to sort my 100 million integers performed rather well, taking on average 15 seconds to complete. Lets see how much better we can make this, more to come.

Download File: QuickSort.zip

Finding Median in (Almost) Linear Time

I sometimes find myself needing to find the median of a set of numbers. What I would do was sort the list of numbers and then take the value in the middle index, something like(in Java):

    public static int FindMedian(List &lt;Integer&gt; nums)
    {
        Collections.sort(nums);
        if(nums.size() % 2 == 0)
        {
             return (nums.get(nums.size() / 2) + nums.get((nums.size() / 2) + 1)) / 2;
        }
        return nums.get(nums.size() / 2);
    }

This solution is on the order of O(n lg n) where n is the number of elements. A faster method would be use a modified quick sort. The idea being that after each partition we gain some knowledge about how many numbers are less than or equal to a given number in the list. So we can be smart and only iterate on the side of the list where we know the selected number is. Here is my implementation of QuickSelect in Java:

 
public class QuickSelect {
    public static < T extends Comparable <? super T>> T quickSelect(List < T > values, int k)
    {
        int left = 0;
        int right = values.size() - 1;
        Random rand = new Random();
        while(true)
        {
            int partionIndex = rand.nextInt(right - left + 1) + left;
            int newIndex = partition(values, left, right, partionIndex);
            int q = newIndex - left + 1;
            if(k == q)
            {
                return values.get(newIndex);
            }
            else if(k < q)
            {
                right = newIndex - 1;
            }
            else
            {
                k -= q;
                left = newIndex + 1;
            }
        }
    }
    private static < T extends Comparable<? super T>> int partition(List < T > values, int left, int right, int partitionIndex)
    {
        T partionValue = values.get(partitionIndex);
        int newIndex = left;
        T temp = values.get(partitionIndex);
        values.set(partitionIndex, values.get(right));
        values.set(right, temp);
        for(int i = left; i < right; i++)
        {
            if(values.get(i).compareTo(partionValue) < 0)
            {
                temp = values.get(i);
                values.set(i, values.get(newIndex));
                values.set(newIndex, temp);
                newIndex++;
            }
        }
        temp = values.get(right);
        values.set(right, values.get(newIndex));
        values.set(newIndex, temp);
        return newIndex;
    }
}
 

From here implementing FindMedian should be pretty straight forward. The nice thing about this algorithm is that it can find any kth largest number in a set. Plus on average this will run in O(n) with a worst case of O(n lg n). Enjoy!

Chrome OS – About a Month Later

I have had my Chrome OS Laptop (CR-48) for a little over a month now. Things were going well till one day when I needed to SSH into my home desktop and change something for a presentation I was about to give. That is when I realized that I am giving up a lot of power when all I carry around is my CR-48.

My presentation was saved because I was able to use my Nexus One to SSH in. It got me thinking, why should my cell phone be able to run apps locally when my laptop could not?

The cloud promises ubiquitous data access. The CR-48 does a great job of putting me into that cloud and accessing all the data that I please. The problem is, so does my desktop. I only lost power by using an operating system that could not run local apps.

I have decided to put Ubuntu onto my CR-48. I followed this guide on how to get ubuntu installed and this guide on how to get the multi-touch track pad working in Ubuntu. The tutorials are not for the faint of heart, but they are mostly complete with little to no modification of the steps required.

Ubuntu on the CR-48 runs great! I installed Chrome as my default browser (my broswer of choice anyway) and chrome feels less sluggish compare to when it was in Chrome OS. I also get the power of a complete desktop, plus with the Chrome browser I get fast access to the cloud as I did before.

I am not sure what Google should do going forward. The most powerful concept of Chrome OS is that all a computer is, is a terminal into the cloud. A user doesn’t care how much hard drive space they have or how many Ghz their CPU has. As a power user it is hard to image a world where the computer is commoditized. That is exactly what Chrome OS is trying to do. I am not sure if the average user is ready for this commoditization.

For example, my fiancee was using my CR-48 to upload some pictures to Facebook from our digital camera. She couldn’t understand why my laptop didn’t have any photo viewing/editing software. She did not want to upload it to the ‘cloud’ to do what she did in 5 seconds on her own laptop.

Maybe in the future Google will find a way to make these net apps run native on a computer. As it stands now, the cloud is not mature enough to satisfy an average user of a computer. Let alone a power user such as myself.

Google picked me

I got home yesterday to see a good size box waiting for me. I figured it was a few last minute gifts I had ordered from Amazon. To my surprise, it was a Chrome OS CR-48. I had applied a few weeks back and never heard anything form Google, so it just arriving at my door was a pleasant surprise. I am sure over the coming months I will be posting more on Chrome OS but here are some of my initial thoughts.

I have been using the CR-48 for about a day now and I am finding that in general I really like it. Having an always connected long battery life laptop seems to fit right into my life. Chrome OS provides a great experience on sub-notebooks that other operating systems such as Windows tend to miss. Sub-notebooks tend to be fairly underpowered when compared to full laptops. This means that even power users are in reality unable to do much on their sub-notebooks besides surf the web. Since Chrome OS specializes in this over Windows, it makes Chrome OS the perfect operating system for sub-notebooks.

At this point Chrome OS is very rough around the edges. Flash is the biggest draw back on the CR-48. When you have a web site that is heavy in Flash, it causes the whole computer to run sluggish. I suppose this is to be expected, flash has never worked well under Linux. I am sure Google will push Adobe to fix this before official release. Plus it will mean that Linux will finally get a decent version of Flash.

The coolest thing about the CR-48 is the concept. Chrome OS commoditizes the PC. When a user picks up a Chrome OS PC they don't think what is under the hood. The user just knows that they will be able to do whatever task they wanted to do. What is even cooler is that the laptop doesn't even need to be owned by that user. The user's preferences and  data will just be there as soon as they login.

I am glad that Google is going about this revolution of the PC in an open manner. They are not trying to be the ultimate authority when it comes to what is on your PC or what you can do with it, as Apple would do. They are releasing the operating system as open source software, unlike what Microsoft would do. I think of all companies Google is the best suited to lead this revolution.

Averaging A Group’s Salary

I recently wanted to find the average starting salary for some recent college graduates that had also worked at Surface Systems and Instruments. As most people know it is rude to ask what someone's salary is and can cause awkward situations. So I came up with the following algorithm to get a group of people's average salary without disclosing any information to anyone else, except for the average of course.

Person 1 comes up with a random number k. k should be much larger than person 1's salary. Person 1 should compute S1 + k, where S1 is person ones salary. Person 1 then tells person 2 and only person 2, S1 + k.

Person 2 takes S1 + k and adds S2 to it. Then gives person 3 S1 + k + S2.

Person 3 takes S1 + K + S2 and adds S3. Then gives it to person the next person.

On this goes till n people, where n is everyone we want the average salary of,  has added their salary to the running sum. The last person gives S1 + k + S2 + ... + Sn to person 1.

Person 1 subtracts k from S1 + k + S2 + ... + Sn and is left with S1 + S2 + ... + Sn. To compute the average, Person 1 simply takes (S1 + S2 + ... + Sn)/n and can then share it with everyone.

In order to make this secure, n must be grader than 2 and no person can give their sum to anyone else besides the next person in the line. Unfortunately, this does suffer from an attack. If person a and person a+2 want to know person a+1's salary, by subtracting the sum person a gave person a+1  from sum person a+1 gave person a+2.

So this is a fun method. My goal was to make it simple and practical to use. If you can find any other attacks please let me know.

Dual Monitor Aero Snap in Ubuntu

I think the coolest thing about Windows 7 is Aero Snap. It really helps you get more efficient use of your screen real estate without you have to manually resize your windows. The good people at OMG! Ubuntu posted a great tutorial on how to get Aero Snap in Ubuntu. I followed the tutorial and got my laptop snapping perfectly. I was not so lucky on my dual monitor desktop. So I have modified the script slightly to make it so it will now work on dual screen monitors. Most of this is exactly the same as the OMG! Ubuntu tutorial, so if you feel I didn't explain something clearly you should reference theirs.

Step 1 - Install Compiz Config

 
sudo apt-get install compizconfig-settings-manager wmctrl
 

Step 2 - Set up the commands

  • Open the Compiz Config manager (System -> preferences -> CompizConfig Settings Manager)
  • Select the "Commands" option
  • In "Command Line 0" past:
 
WIDTH=`xdpyinfo | grep 'dimensions:' | cut -f 2 -d ':' | cut -f 1 -d 'x'` && HALF=$(($WIDTH/4)) && wmctrl -r :ACTIVE: -b add,maximized_vert && wmctrl -r :ACTIVE: -e 0,0,0,$HALF,-1
 
  • In "Command Line 1" past:
 
WIDTH=`xdpyinfo | grep 'dimensions:' | cut -f 2 -d ':' | cut -f 1 -d 'x'` && HALF=$(($WIDTH/4)) && LOCATION=$(($WIDTH-$HALF)) && wmctrl -r :ACTIVE: -b add,maximized_vert && wmctrl -r :ACTIVE: -e 0,$LOCATION,0,$HALF,-1
 
  • In "Command Line 2" past:
 
wmctrl -r :ACTIVE: -b add,maximized_vert,maximized_horz
 

Step 3 - Enable the commands
Go to the 'Edge Bindings' tab and set the following:

  • Run Command 0 - Set to Left
  • Run Command 1 - Set to Right
  • Run Command 2 - Set to Top

Good luck, let me know if you have any questions.