Segmentation and Recognition of Fingers using Microsoft Kinect

Published in: Advances in Intelligent Systems and Computing, Vol. 508, Proceedings of International Conference on Communication and Networks, ISBN 978-981-10-2749-9, 417755_1_En (5) (Springer)

Technologies and languages used: C, C#, Processing, MATLAB

About

In my senior year as an undergraduate, I was fascinated by computer vision and its implications for UX. The rise of VR technologies and the Kinect created a need for novel UIs that feel intuitive to users. With that in mind, I delved into computer vision and gesture recognition, aiming to develop an algorithm that could identify a user's finger gestures. I believe finger gestures can convey far more information than body movements, which widens the range and type of commands that can be issued to a computer.

Note: When I submitted my paper, Microsoft had not yet launched version 2 of the Kinect. By the time the paper was published by Springer, Microsoft had released version 2 with an algorithm that could identify fingers. So Microsoft basically beat me to it.

Abstract

Hand gesture identification is an important part of HCI. In this paper, I present an efficient algorithm for finger segmentation. Using fingers as an input medium can make interaction with the computer easier. Microsoft Kinect, a depth sensor, captures the image used for finger segmentation. The background is removed from the captured image by keeping only the pixels that fall within a fixed depth range. The image is then pre-processed further, and the palm area is identified and removed to obtain separate fingers. Finally, a kNN classifier is used to identify the open fingers as a gesture. The proposed algorithm achieves more than 90% accuracy.
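
To make the background-removal step concrete, here is a minimal Python sketch of depth-range thresholding. It is not the code from the paper: the function name `segment_by_depth`, the millimetre units, the 500 to 800 mm window and the toy frame are all assumptions chosen for illustration.

```python
def segment_by_depth(depth_mm, near_mm, far_mm):
    """Return a binary mask: 1 where near_mm <= depth <= far_mm, else 0."""
    return [[1 if near_mm <= d <= far_mm else 0 for d in row]
            for row in depth_mm]


# Toy 4x5 depth frame in millimetres (made-up values).
# 0 stands for a pixel with no depth reading (an assumption of this sketch).
frame = [
    [2100, 2100,  650,  660, 2100],
    [2100,  640,  655,  670, 2100],
    [   0,  645,  650, 2100, 2100],
    [2100, 2100,  900, 2100,    0],
]

# Keep only pixels between 0.5 m and 0.8 m from the sensor (hypothetical window).
mask = segment_by_depth(frame, near_mm=500, far_mm=800)

for row in mask:
    print("".join("#" if v else "." for v in row))
```

Everything outside the window, including pixels with no reading, ends up as background, so the later steps only see the hand.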

Keywords: Microsoft Kinect, Gesture recognition, Finger Segmentation, centroid, Feature Extraction, kNN Classifier

Algorithm in a nutshell

The proposed algorithm involves a lot of math (if you love math, you can check out the full paper linked below). I will spare you the pain of looking at all the mathematical operations involved and instead show how the processing works in real time on a finger gesture, using an example:

Figure: (a) original gesture image, (b) hand boundary smoothed, (c) holes removed from the image, (d) image with the centroid of the hand, (e) image with a circle centered on the centroid whose diameter is the minor axis, (f) non-finger components removed from the image, (g) fingers cropped from the image
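
Steps (d) to (g) can be sketched roughly as follows in Python. This is an illustration under assumptions, not the paper's implementation: the toy 9x9 hand, the helper names and the hand-picked radius of 3.7 pixels are hypothetical. The paper derives the circle from the minor axis of the hand, whereas this sketch simply takes the radius as a parameter.

```python
from collections import deque


def centroid(mask):
    """Mean (row, col) of all foreground pixels."""
    pts = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    n = len(pts)
    return sum(r for r, _ in pts) / n, sum(c for _, c in pts) / n


def remove_disc(mask, center, radius):
    """Clear every pixel that lies within `radius` of `center`."""
    cr, cc = center
    return [[v if (r - cr) ** 2 + (c - cc) ** 2 > radius ** 2 else 0
             for c, v in enumerate(row)]
            for r, row in enumerate(mask)]


def count_components(mask):
    """Count 4-connected foreground regions with a breadth-first search."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                count += 1
                seen[r][c] = True
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count


# Toy hand: a 5x5 palm with two one-pixel-wide fingers above it.
hand = [[0] * 9 for _ in range(9)]
for r in range(4, 9):
    for c in range(2, 7):
        hand[r][c] = 1               # palm
for r in range(0, 4):
    hand[r][2] = hand[r][6] = 1      # two raised fingers

center = centroid(hand)
fingers = remove_disc(hand, center, radius=3.7)  # radius hand-picked for this toy
print(count_components(fingers))
```

On a real depth image the wrist and forearm would also survive the disc removal, which is presumably why the pipeline has a separate step for discarding non-finger components.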

Now that the fingers are cropped, the computer has to understand that the shape shown in (g) is a two and not a random sequence of 1s and 0s. For that, I used a method called profiling. I profiled the image using horizontal, vertical and diagonal profiling, and the accuracy was 91.20%. I also tried an SVM, but it only reached an accuracy of 65%.
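
As a rough illustration of profiling, the Python sketch below builds a feature vector from row sums (horizontal profile), column sums (vertical profile) and sums along one diagonal direction, then classifies it with a plain k-nearest-neighbour vote. It assumes the profiles are the features fed to the kNN classifier named in the abstract. The 4x4 toy images, the labels, `k=3`, the squared Euclidean distance and the single diagonal direction are all choices made for this sketch, not details taken from the paper.

```python
from collections import Counter


def to_img(rows):
    """Turn strings of '#' and '.' into a binary image."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]


def profile_features(img):
    """Concatenate horizontal, vertical and diagonal projection profiles."""
    rows, cols = len(img), len(img[0])
    horizontal = [sum(row) for row in img]
    vertical = [sum(img[r][c] for r in range(rows)) for c in range(cols)]
    diagonal = [0] * (rows + cols - 1)
    for r in range(rows):
        for c in range(cols):
            diagonal[r - c + cols - 1] += img[r][c]
    return horizontal + vertical + diagonal


def knn_predict(train, query, k=3):
    """Majority label among the k training vectors closest to `query`."""
    def sq_dist(item):
        return sum((a - b) ** 2 for a, b in zip(item[0], query))
    nearest = sorted(train, key=sq_dist)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical training set: three noisy samples per gesture.
samples = [
    ([".#..", ".#..", ".#..", ".#.."], "one"),
    (["....", ".#..", ".#..", ".#.."], "one"),
    ([".#..", ".#..", ".#..", "...."], "one"),
    (["#.#.", "#.#.", "#.#.", "#.#."], "two"),
    (["..#.", "#.#.", "#.#.", "#.#."], "two"),
    (["#.#.", "#.#.", "#.#.", "#..."], "two"),
]
train = [(profile_features(to_img(rows)), label) for rows, label in samples]

short_one = to_img(["....", "....", ".#..", ".#.."])
uneven_two = to_img(["#...", "#...", "#.#.", "#.#."])
print(knn_predict(train, profile_features(short_one)))
print(knn_predict(train, profile_features(uneven_two)))
```

Raw profiles depend on where the fingers sit in the frame, so a sketch like this only behaves sensibly if the cropped finger images are aligned and scaled consistently before profiling.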

Full paper: Finger Recognition
