Artificial Vision -I : Getting comfortable with technology
This story goes back some time, probably to last year. I was delivering a lecture on open source and computer programming with ‘C’, and ended up explaining open source and computer vision to the students. This article narrates those events: how I explained computer vision to the students and how they can get started with it just by learning a new library, OpenCV; how a ‘C’ programming lecture turned into a discussion forum on computer vision technology; and how students can learn computer vision simply by experimenting. This is the first in a series of articles in which I will put readers at ease with computer vision algorithms.
Me: Have you ever clicked a photograph?
The students started shouting: Yes sir, many times…
Me: OK. So what is the difference between a human eye and a computer camera?
Ankur: They both do the same things.
Me: Good. Now tell me, how do you differentiate between a glass and a cup?
Ankur: Sir, it's our mind that analyses the images it receives and then tells us the difference.
Me: How does a one-year-old child do the same?
Ankur: Don't know, sir.
Me: We train the child and help him differentiate between things.
Me: How many of you know how a robot sees and analyses its surroundings?
Abhishek: It's the cameras installed in the robot.
Me: Then how does it distinguish between a human and a cup?
Abhishek: I think it compares what it sees with the pictures stored in the robot's memory, and the match tells it the difference.
Me: Wow!! You seem to have a bit of an understanding of it.
Then I started on the topic of computer vision. It is a field of science that extracts information from images and gives vision to machines.
Then I told the class that there would be 3 subsequent lectures, divided into 3 sections. First I will teach you about computer vision and image processing, and then I will show how you can carry out computer vision in the ‘C’ programming language using external libraries.
Getting a feel of artificial vision
So, students!! Let me give you the basic definition: computer vision is a field of science that deals with processing images and extracting meaningful information from them after capture.
Let's take an example. Think of a CCTV camera installed on a street. The camera is there to watch the traffic and take note of any untoward incident on that street. Such CCTV cameras are meant for manual surveillance. Suppose a speeding car rams into another car and the culprit drives away. We then have to wait for the incident to be reported to the police, who will act by analyzing the captured video or by patrolling the area and asking people about the incident. This consumes a lot of time.
One solution to such a problem would be to install a vision system that analyzes situations like this. When something of that order happens, the number plate of the car, a photograph of the car or anything else peculiar to the incident can be noted and an alert raised. An SMS can be sent to the police control room and also to the local police patrol vehicle. This will help the police catch the culprit easily, with less time and money.
This is one application of computer vision. There are numerous others, such as monitoring production in an industry, managing traffic and counting people for visual surveillance.
I hope you now have a fair idea of computer vision. Figure 1 shows an application of one vision algorithm, called template matching. Figure 1(a) shows the result of the template matching algorithm: the red rectangle drawn on the collage marks the match. Figure 1(b) shows the image that is to be found in figure 1(a).
Figure 1: (a) A collage of different images (b) Image to be found
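To give a feel for what template matching does under the hood, here is a minimal sketch in plain ‘C’ that uses no library at all. It slides the small image over the big one and keeps the position where the sum of squared pixel differences is lowest. The names Match, match_template_ssd, demo_image and demo_template are made up for this illustration, the pictures are tiny gray scale arrays rather than real photographs, and a real vision library may well use a different similarity measure and a much faster search.

```c
#include <limits.h>

/* Hypothetical result type: top-left corner of the best match and its score. */
typedef struct {
    int x;
    int y;
    long score; /* sum of squared differences, 0 means a perfect match */
} Match;

/* Hypothetical helper, not a library function: slides a tw x th template
   over a w x h gray scale image (row-major, one byte per pixel) and returns
   the position with the smallest sum of squared differences. */
Match match_template_ssd(const unsigned char *img, int w, int h,
                         const unsigned char *tpl, int tw, int th)
{
    Match best = { -1, -1, LONG_MAX };

    for (int y = 0; y + th <= h; y++) {
        for (int x = 0; x + tw <= w; x++) {
            long ssd = 0;
            /* Compare the template with the patch whose corner is (x, y). */
            for (int j = 0; j < th; j++) {
                for (int i = 0; i < tw; i++) {
                    long d = (long)img[(y + j) * w + (x + i)]
                           - (long)tpl[j * tw + i];
                    ssd += d * d;
                }
            }
            if (ssd < best.score) {
                best.x = x;
                best.y = y;
                best.score = ssd;
            }
        }
    }
    return best;
}

/* Made-up 4 x 4 "collage" and the 2 x 2 image to be found inside it. */
const unsigned char demo_image[4 * 4] = {
    10,  10,  10, 10,
    10, 200,  50, 10,
    10,  50, 200, 10,
    10,  10,  10, 10
};
const unsigned char demo_template[2 * 2] = {
    200,  50,
     50, 200
};
```

Calling match_template_ssd(demo_image, 4, 4, demo_template, 2, 2) reports the corner x = 1, y = 1 with a score of 0, which is where a rectangle like the red one in figure 1(a) would be drawn.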
Me: Students, are you now ready for the show!!
All jumped with joy!!
Me: I will now explain the vision library that we call “OpenCV”. OpenCV is a computer vision library that was started by Intel in 1999. It is an open source computer vision library for real time image processing. Its current release is 2.2 as of March 2011. The library includes basic image processing functions, segmentation functions, machine learning algorithms, camera calibration, gesture recognition, ego motion, face recognition, object identification and much more. While learning OpenCV, one must be comfortable in one of the programming languages C, C++ or Python. One must also have a basic understanding of the underlying data structures such as matrices and structures, and of function calls. If you want to learn more about OpenCV, you can start from http://opencv.willowgarage.com/wiki/ or http://sourceforge.net/projects/opencvlibrary/ and if you are interested in other resources, they are given in the references.
Let's get our hands dirty!!
Amit: On which operating systems can we run OpenCV?
Me: It runs on all major operating systems, such as Linux, Windows and Mac.
Amit: Will you tell us how to install it?
Me: Yes, certainly.
Students: Sir! Some of us know Linux and some of us know only Windows. Will you explain it for both operating systems?
Me: No. I will explain how to install it on Linux only. If you want to install it on Windows, you can read http://opencv.willowgarage.com/wiki/InstallGuide.
Let's first see how we can install it on Linux. I will use Ubuntu 10.10 (Maverick). An Internet connection is a prerequisite too. I will now demonstrate the steps one by one.
First click on System -> Administration -> Synaptic Package Manager. Supply your password when you are prompted for administrative access.
Type opencv in the quick search text box. It will then show packages as in figure 2. Click on the following packages and mark them for installation:
1. opencv-doc [optional]
After marking all of them, click Apply to install them. When the installation completes, you can run the command shown in figure 3 to verify your installation.
Figure 3: Terminal Command for verification
This will show you the directories where your include files are, which will help you locate all the header files.
OpenCV comprises a bundle of image processing algorithms that help the end user in programming. As programmers, we must understand the structure of OpenCV. Its components are described in figure 4. The structure contains four components: computer vision algorithms (cv), machine learning algorithms, highgui and cxcore. Some of the major functions that the cv component provides are image filtering, image transformations, feature detection, motion analysis, structural analysis, object detection and camera calibration, whereas the highgui component provides functions for creating user interfaces and for reading and writing images and videos. The cxcore component contains the basic data structures for images and video, drawing functions and much more.
We will now study the basic components of an image and how they can be altered with the help of OpenCV. Let's first understand an image and its components. An image is a collection of information stored in the form of pixels. A colored picture can be stored in formats such as RGB, CMYK or HSV. A channel is one color component of an image, and the number of channels is the number of such components stored for every pixel. RGB (red, green and blue) is a 3 channel image, where each channel typically takes an 8 bit value. CMYK (cyan, magenta, yellow, black) is a 4 channel image. Lastly, HSV (hue, saturation, value) is also a 3 channel image; it describes the same colors as RGB, but separates the color itself (hue) from its purity (saturation) and its brightness (value).
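The idea of pixels and channels can be sketched in plain ‘C’ without any library. The sketch assumes an interleaved, row-major layout with one byte per channel, which is one common way of storing an image in memory. The helper names pixel_offset and rgb_to_gray are hypothetical, and the gray scale weights are the widely used ITU-R BT.601 luma weights, picked here as an assumption rather than as what any particular library does.

```c
#include <stddef.h>

/* Hypothetical helper: index of the first byte of pixel (x, y) in an
   interleaved, row-major image that stores `channels` bytes per pixel.
   For RGB, the bytes at offset, offset + 1 and offset + 2 hold the three
   channel values of that pixel. */
size_t pixel_offset(int x, int y, int width, int channels)
{
    return ((size_t)y * (size_t)width + (size_t)x) * (size_t)channels;
}

/* Hypothetical helper: collapses one 3 channel, 8 bit RGB pixel into a
   single 8 bit gray value using the BT.601 weights 0.299, 0.587 and 0.114,
   done in integer arithmetic with rounding. */
unsigned char rgb_to_gray(unsigned char r, unsigned char g, unsigned char b)
{
    return (unsigned char)((299u * r + 587u * g + 114u * b + 500u) / 1000u);
}
```

For an RGB image that is 4 pixels wide, the pixel at x = 1, y = 2 starts at byte (2 × 4 + 1) × 3 = 27. A 1 channel gray scale image of the same size needs only one third of the memory, which is why the number of channels matters when we process images.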
I hope you now have a good understanding of computer vision, OpenCV and its components. We will now study our first OpenCV program and learn how to execute it.
My first program with opencv!!
Hey all, let's now begin with our first program in OpenCV. We have used gedit to write it.
int main(int argc, char** argv)
printf("Syntax: disp_image image-name\n");
Listing 1: Image display program
Listing 1 shows our first OpenCV program, which displays an image. We will explain it line by line. As discussed earlier, highgui.h is a header file that declares the functions used for reading and displaying images: cvLoadImage, cvNamedWindow, cvShowImage, cvWaitKey, cvReleaseImage and cvDestroyWindow. Let's understand these functions one by one.
1. IplImage* cvLoadImage(const char* filename, int iscolor = CV_LOAD_IMAGE_COLOR) takes two arguments, the filename and iscolor. iscolor has three possible values: CV_LOAD_IMAGE_COLOR, CV_LOAD_IMAGE_GRAYSCALE and CV_LOAD_IMAGE_UNCHANGED. It loads an image from the file and allocates memory for it.
2. int cvNamedWindow(const char* name, int flags) takes two arguments: the name of the window, which is user defined, and a flag, here CV_WINDOW_AUTOSIZE, which sets the width and height of the window according to the image size. It creates a window with the given name as its title.
3. void cvShowImage(const char* name, const CvArr* image) takes two arguments: the name of the window and the image structure. It displays the image in that window.
4. void cvReleaseImage(IplImage** image) takes the address of the image pointer as its argument. It deallocates the memory held by the image.
5. void cvDestroyWindow(const char* name) takes the name of the window to be destroyed as its argument.
6. int cvWaitKey(int delay=0) takes a delay in milliseconds. If we supply zero, the program waits until the user presses a key.
The program works like this. First we create a placeholder for the image, i.e. a pointer to an IplImage structure. Then we load the image without changing anything in it. We can load the image in gray scale instead by changing the iscolor option to CV_LOAD_IMAGE_GRAYSCALE. Next, we create a window titled Output and set its flag to autosize, so that the window is sized automatically to the image shown in it. Thereafter, we display the image and wait until the user presses a key to exit. Finally, we destroy the window and release the memory occupied by the image.
How to compile and execute the program?
We used the command shown in figure 5 to create a binary executable file.
The output of the program is shown in figure 6.
Figure 6: Output of Listing 1
Let's get back to our conversation:
Me: This example showed you how you can display a picture in just 7 lines.
Amit: Sir! Can we do some basic operations on an image, like smoothing it, converting it to gray scale, inverting it, etc.?
Me: Yes. We can do them very easily. I will demonstrate these things in my subsequent lectures. So, students, I hope you now have a good understanding of computer vision and OpenCV.