The operating system used for the project is Ubuntu 11.04, and the platform used is scientific Python (SciPy). SciPy (pronounced "Sigh Pie") is
open-source software for mathematics, science, and engineering. It is also the
name of a very popular conference on scientific programming with Python. The
SciPy library depends on NumPy, which provides convenient and fast N-dimensional
array manipulation. The SciPy library is built to work with NumPy arrays, and
provides many user-friendly and efficient numerical routines such as routines
for numerical integration and optimization. Together, they run on all popular
operating systems, are quick to install, and are free of charge. NumPy and
SciPy are easy to use, but powerful enough to be depended upon by some of the
world's leading scientists and engineers.
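N-dimensional arrays matter here because every frame grabbed from the webcam is handled as a NumPy array, so ordinary array operations double as image operations. The following minimal sketch is illustrative only; the tiny array shape and channel values are made up:

import numpy as np

# A stand-in for a tiny 2x3 "image" with 3 colour channels (height, width, channel).
frame = np.zeros((2, 3, 3), dtype=np.uint8)

# Slicing works per channel: set the red channel of every pixel to 255.
frame[:, :, 2] = 255          # OpenCV stores channels in BGR order

# Element-wise arithmetic and reductions are vectorised and fast.
mean_intensity = frame.mean()
print(frame.shape, mean_intensity)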
An additional library which we use is OpenCV. OpenCV (Open Source Computer Vision Library) is a library of programming functions mainly aimed at real-time computer vision, developed by Intel and now supported by Willow Garage. It is free for use under the open-source BSD license. It has C++, C, and Python interfaces, with Java support coming soon, and runs on Windows, Linux, Android, and Mac. The library has more than 2500 optimized algorithms. It is used around the world, with more than 2.5 million downloads and 40,000 people in the user group. Uses range from interactive art to mine inspection, stitching maps on the web, and advanced robotics. The library is cross-platform and focuses mainly on real-time image processing.
1.2.1. Operation Sequences:
1) Webcam interfacing:
Here we interface our program with the webcam. We usually prefer the inbuilt webcam of a laptop because of its convenient position. A web camera of at least 3 megapixels is preferred, as its clarity and resolution make the operation efficient and smooth.
2) Grab frame:
Here we grab frames from the video captured via the webcam, and these frames are imported into our program.
3) Segment out finger:
A blob is a region of adjacent pixels sharing the same intensity values. Using blob libraries we detect the blob in the grabbed frame, and thus the finger is segmented out.
4) Calculate finger position:
The exact position of the finger is calculated using OpenCV libraries.
5) Identify gesture:
We design different gestures for different operations and implement gesture recognition to carry out the following gestures:
Left click
Right click
Double click
Left click, hold and release
6) Apply decision to mouse pointer:
The identified gestures are interfaced to the computer in place of the hardware mouse, as sketched below.
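The sketch below ties these six steps together as a single capture-and-process loop. It is a minimal outline rather than the full implementation: the segmentation, position and gesture functions are hypothetical stubs, and it assumes the modern cv2 bindings (the project itself targeted the older OpenCV Python API).

import cv2

def segment_finger(frame):
    """Stub for step 3: return a binary mask of the coloured finger cap."""
    ...

def finger_position(mask):
    """Stub for step 4: return the (x, y) centre of the segmented blob."""
    ...

def identify_gesture(position, history):
    """Stub for step 5: map blob positions over time to a gesture."""
    ...

def apply_to_mouse(gesture, position):
    """Stub for step 6: move the pointer or issue clicks (e.g. via PyMouse)."""
    ...

capture = cv2.VideoCapture(0)            # step 1: interface the webcam
history = []
while True:
    grabbed, frame = capture.read()      # step 2: grab a frame
    if not grabbed:
        break
    mask = segment_finger(frame)         # step 3: segment out the finger
    position = finger_position(mask)     # step 4: calculate its position
    gesture = identify_gesture(position, history)   # step 5: identify the gesture
    apply_to_mouse(gesture, position)    # step 6: apply it to the mouse pointer
    cv2.imshow("gesture mouse", frame)
    if cv2.waitKey(10) == 27:            # Esc quits
        break
capture.release()
cv2.destroyAllWindows()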
IMPLEMENTATION
The project "Gesture Mouse" was a first step towards a new way of interacting with the digital world. In order to optimize our project we needed a reference, and that reference should be the best available information processing system. After our survey, we realized that the best information processing system in our current scenario that can recognize human gestures and act accordingly is the human information processing system. So we started our project by studying the way the brain takes in and processes information. Replacing a biological system with a digital hardware system while ensuring that both deliver the same response was a tedious task. We used a camera instead of the human eye. It was surprising that a camera's properties such as shutter speed, focal length and ISO matched those of the human eye, and some features of the camera were even better than the eye at capturing a well-defined image. A Python program was then implemented in place of the brain's processing, in order to process and recognize the gesture from the sequence of captured images. After that, according to the gestures, different operations such as left click, double click, right click and mouse pointer movement are performed.
The main part of the processing lies in the program written in Python with the help of the computer vision and PyMouse libraries. We divided the processing task into six stages. The first step was to interface the webcam with the program. The second stage was to grab the frame. Then, from this frame, the finger wearing a cap of red or blue colour is segmented out. After the segmentation stage, the position of the extracted blob is calculated, and gestures are then recognized based on that position. In the last stage, different operations are performed based on the gesture recognized in the fifth stage.
Each stage needed a different level of processing. The first step was initializing the interaction with the web camera and buffering the captured images; the buffered images are stored in an image variable. On each capture, the desired blob, i.e. the image of red- or blue-coloured objects, is extracted and segmented from the image. Extraction is done by first converting the image into HSV (hue, saturation and value) format and then setting the hue, saturation and value ranges for the red and blue colours. After extraction, a mask layer is applied so that only the red and blue blobs are segmented. To erase noise, the erode and dilate operations of the computer vision library are used. Once segmentation was done correctly, we filtered the blobs. Filtering was essential because there can be several red or blue objects in the captured image that get extracted and segmented. Filtering was done for red and blue separately, based on the size of the caps we used (which form the blobs in the image). The locations of the blobs were then calculated: if there were both a red and a blue blob, their relative positions and times of occurrence determined whether the gesture was a right click, left click or double click; if there was only a red blob, it indicated mouse movement, and we moved the mouse pointer to the coordinates obtained from the position of the red blob. This was made easier by the built-in functions of the PyMouse library.
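The sketch below fills in the segmentation, position and mouse stubs from the earlier skeleton for a single grabbed frame, using the modern cv2 and PyMouse APIs. The HSV ranges, the minimum blob area and the click decision are illustrative values only; they would have to be tuned to the actual caps, lighting and screen, and red in particular may need a second range because its hue wraps around in OpenCV.

import cv2
import numpy as np
from pymouse import PyMouse

mouse = PyMouse()

# Illustrative HSV ranges and blob size; real values must be tuned.
RED_LOWER,  RED_UPPER  = np.array([0, 120, 70]),   np.array([10, 255, 255])
BLUE_LOWER, BLUE_UPPER = np.array([100, 120, 70]), np.array([130, 255, 255])
MIN_AREA = 300

def largest_blob_centre(hsv, lower, upper):
    """Threshold one colour, clean the mask with erode/dilate, and return the
    centre of the largest blob above MIN_AREA, or None if there is none."""
    mask = cv2.inRange(hsv, lower, upper)
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.dilate(cv2.erode(mask, kernel), kernel)
    # [-2] keeps this working across OpenCV 2/3/4 findContours signatures.
    contours = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                cv2.CHAIN_APPROX_SIMPLE)[-2]
    contours = [c for c in contours if cv2.contourArea(c) >= MIN_AREA]
    if not contours:
        return None
    m = cv2.moments(max(contours, key=cv2.contourArea))
    return int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"])

def process_frame(frame):
    """Segment the red and blue caps in one grabbed frame and drive the mouse."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    red = largest_blob_centre(hsv, RED_LOWER, RED_UPPER)
    blue = largest_blob_centre(hsv, BLUE_LOWER, BLUE_UPPER)
    if red and blue:
        # Stand-in for the real gesture logic, which also looks at the blobs'
        # relative positions and timing to pick left/right/double click.
        mouse.click(red[0], red[1], 1)
    elif red:
        # A red blob alone drives pointer movement; a real implementation would
        # scale frame coordinates to screen coordinates first.
        mouse.move(red[0], red[1])

Eroding before dilating (a morphological opening) removes isolated noise pixels while keeping cap-sized blobs roughly intact, which is why it is applied before filtering the blobs by size.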
We took the image and did all the processing in HSV format. The same project could also be done on the RGB, YCrCb or HSL formats of the image. The reason for choosing HSV was that, of all these formats, only HSV is relatively independent of lighting variation, and making the processing independent of lighting effects is one of the main problems in digital image processing. So, to obtain a better result than the other options allowed, HSV became our choice. The RGB image captured by the web camera was therefore converted to HSV using the built-in function cvCvtColor, with arguments specifying the captured image as the source, another image variable as the destination, and CV_BGR2HSV as the conversion to apply, which converts the RGB (BGR-ordered) format to HSV.
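For reference, with the newer cv2 bindings the same conversion is a single call that returns the result, whereas the legacy cvCvtColor call named above took an explicit destination image. The file name and variable names here are ours:

import cv2

frame = cv2.imread("frame.png")               # or a frame from cv2.VideoCapture
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)  # BGR (OpenCV's default order) -> HSV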