Training a Haar Cascade Model on Microsoft Azure

In this hands-on tutorial, we will learn how to train your own Haar cascade model on Microsoft Azure. To understand Haar cascades, I recommend reading the seminal research paper on face detection by Viola and Jones.

Rather than just using the prebuilt Haar cascade files, in this tutorial I will teach you how to train your own model on Microsoft Azure and create your own Haar cascade files for object detection.

Before we start, we need to create an Azure Virtual Machine, which is simple and quick.

How to create your Azure Virtual Machine.

First, log in to your Microsoft Azure account:

Select Virtual Machines from the left panel and then click on +Add.

Fill out the details. For the machine size, select:

Leave the disks and networking sections to defaults and create the virtual machine.

Go to the Dashboard and select the new virtual machine you created.
Click on Start, and your virtual machine will start up.

Click on connect to see your SSH login credentials:

Open a terminal on your PC and type:

Getting your Azure VM ready for training:

SSH in using your Azure virtual machine credentials. Voila! You are in. Your VM is now ready.

From here I will show you how to create your very own Haar Cascades, so you can track any object you want. Due to the nature and complexity of this task, this tutorial will be a bit longer than usual, but the reward is massive.

Once you are inside your VM’s terminal via SSH:

cd ~

sudo apt-get update

sudo apt-get upgrade

First, let’s make ourselves a nice workspace directory:

mkdir opencv_workspace

cd opencv_workspace

Now that we’re in here, let’s grab OpenCV:

sudo apt-get install git

git clone

We’ve cloned the latest version of OpenCV here. Now let’s get some essentials:

Compiler: sudo apt-get install build-essential

Libraries: sudo apt-get install cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev

Python bindings and such: sudo apt-get install python-dev python-numpy libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libjasper-dev libdc1394-22-dev

Finally, let’s grab the OpenCV development library:

sudo apt-get install libopencv-dev

To build a Haar Cascade, you need “positive” images and “negative” images. The “positive” images are images that contain the object you want to find. These can either be images that mostly show just the object, or images that contain the object together with a specified ROI (region of interest) marking where the object is. With these positives, we build a vector file that is basically all of these positives put together. One nice thing about the positives is that you can actually have just one image of the object you wish to detect, and then have a few thousand negative images. Yes, a few thousand. The negative images can be anything, as long as they do not contain your object.

From here, with your single positive image, you can use the opencv_createsamples command to actually create a bunch of positive examples, using your negative images. Your positive image will be superimposed on these negatives, and it will be angled and all sorts of things. It actually can work pretty well, especially if you are really just looking for one specific object. If you are looking to identify all screwdrivers, however, you will want to have thousands of unique images of screwdrivers, rather than using the opencv_createsamples to generate samples for you. We'll keep it simple and just use one positive image, and then create a bunch of samples with our negatives.
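Before opencv_createsamples can superimpose the positive on the negatives, it needs a plain-text background description file listing the negative images, one path per line (passed with its -bg option). Here is a minimal sketch of generating that file. The folder name neg, the file name bg.txt, and the helper name write_background_file are my own assumptions for illustration, not something the tool requires.

```python
from pathlib import Path

# File extensions treated as negative images (an assumption; adjust as needed).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def write_background_file(neg_dir, bg_file):
    """Write one negative-image path per line and return how many were listed.

    neg_dir and bg_file are hypothetical names chosen for this sketch.
    """
    neg_dir = Path(neg_dir)
    paths = sorted(
        p for p in neg_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    # Paths are written relative to how neg_dir was given, e.g. "neg/1.jpg".
    lines = [p.as_posix() for p in paths]
    Path(bg_file).write_text("".join(line + "\n" for line in lines))
    return len(lines)
```

Calling write_background_file("neg", "bg.txt") from inside the workspace directory would list every image in neg. Since the paths in the file are relative, run the OpenCV tools from that same directory.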

In my case, I am using the center of a Rs 100 note as the positive image, since that is what I want my object detector to find.

A slightly enlarged image:

OK, great, getting a positive image is no problem! There is just one problem: we need thousands of negative images. In the future, we may want thousands of positive images too. Where in the world can we find them? There’s quite a useful site, based on the concept of WordNet, called ImageNet. From here, you can find images of just about anything. So how might we get negatives? The whole point of ImageNet is image training, so their images are pretty specific. Thus, if we search for people, cars, boats, planes…whatever, chances are there will not be any Rs 100 notes. So, let’s find some bulk image URL links. I found the sports/athletics link to have a reported 1,888 images, but you will find a lot of these are totally broken. Let’s find one more: People.
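Because so many of these links are broken, it helps to check what actually landed on disk. A dead link often leaves behind an HTML error page or an empty file rather than a picture. The sketch below is one simple way to weed those out, assuming the negatives are JPEG or PNG files. It only looks at the first few bytes of each file, so it will not catch a valid "image not available" placeholder. The function names are hypothetical.

```python
from pathlib import Path

# Standard file signatures ("magic bytes") for JPEG and PNG.
JPEG_MAGIC = b"\xff\xd8\xff"
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def looks_like_image(data):
    """Return True if the bytes start with a JPEG or PNG signature."""
    return data.startswith(JPEG_MAGIC) or data.startswith(PNG_MAGIC)


def remove_broken_images(folder):
    """Delete files in folder that are not JPEG/PNG; return the removed names."""
    removed = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        # The longest signature checked is 8 bytes, so a short read is enough.
        with path.open("rb") as fh:
            header = fh.read(8)
        if not looks_like_image(header):
            path.unlink()
            removed.append(path.name)
    return removed
```

Running remove_broken_images on the folder of negatives before training keeps unreadable files out of the sample set.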

Scraping Images on Azure VM (It’s Damn Fast!)

Now we need to scrape a few thousand images from ImageNet onto our Azure Virtual Machine. We will write a Python script to do this and then run it on our VM.
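As a rough idea of what such a script could look like, here is a minimal sketch using only the Python standard library. It assumes you already have a list of image URLs (for example, the lines of a URL list saved to a text file). The function name download_images, the numbered .jpg file names, and the 10-second timeout are my own choices for illustration.

```python
import http.client
import urllib.request
from pathlib import Path


def download_images(urls, out_dir, timeout=10):
    """Download each URL into out_dir as 1.jpg, 2.jpg, ... and return the saved paths.

    Dead links, timeouts, and malformed URLs are skipped rather than raised.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        url = url.strip()
        if not url:
            continue  # ignore blank lines in the URL list
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                data = resp.read()
        except (OSError, ValueError, http.client.HTTPException):
            continue  # broken link or bad URL: move on to the next one
        if not data:
            continue
        # Assumption: files are saved with a .jpg name regardless of actual type.
        target = out_dir / f"{len(saved) + 1}.jpg"
        target.write_bytes(data)
        saved.append(target)
    return saved
```

For example, download_images(open("urls.txt").read().splitlines(), "neg") would fill a neg folder from a hypothetical urls.txt file. Skipping failures silently keeps the run going when many links are dead, at the cost of not telling you which ones failed.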

Use SCP from your terminal to send the Python file to a directory on your VM.

The best part about using Azure for training is the network speed. These few thousand images will be scraped in a couple of minutes, because your Azure VM has a very fast internet connection.

… To be continued.

Researcher and Graduate student at University at Buffalo. Software Engineer at Persistent Systems.