Pattern Recognition using K-NN Algorithm

- Pentaho

I. Introduction

The K-Nearest Neighbors (K-NN) method of classification is one of the simplest methods in machine learning, it is essentially classification by finding the most similar data points in the training data, and making an educated guess based on their classifications. Although very simple to understand and implement, this method has seen wide application in many domains, such as in Recommendation Systems, Semantic Searching, and Anomaly Detection.

II. Concepts and Application of K-NN

K-Nearest Neighbor algorithm works on the basis of minimum distance from the query instance to the training samples to determine the K-nearest neighbors. In other words, if you are similar to your neighbors, then you are one of them. For example, if apple looks more similar to banana, orange, and melon (fruits) than donkey, dog and cat (animals), then most likely apple is a fruit.



Consider the image shown above, there are two classes in the data i.e, the blue square and red triangle. The green dot needs to be sorted out either in the mentioned two classes. K-NN sorts this issue by finding the nearest neighbors and classifying the data into the appropriate class. In this example, the green dot will mostly be classified as red triangle.


K-Nearest Neighbor techniques for Pattern Recognition are often used for theft prevention in the modern retail business. You’re accustomed to seeing CCTV cameras around almost every store you visit, you might imagine that there is someone in the back room monitoring these cameras for suspicious activity, and perhaps that is how things were done in the past. But today, a modern surveillance system is intelligent enough to analyze and interpret video data on its own, without a need for human assistance.

K-Nearest Neighbor is also used in Retail to detect patterns in credit card usage. Many new transaction-scrutinizing software applications use kNN algorithms to analyze register data and spot unusual patterns that indicate suspicious activity. For example, if register data indicates that a lot of customer information is being entered manually rather than through automated scanning and swiping, this could indicate that the employee who’s using that register is in fact stealing customer’s personal information. Or if register data indicates that a particular good is being returned or exchanged multiple times, this could indicate that employees are misusing the return policy or trying to make money from doing fake returns.

Other Application’s of K-NN:

  • Concept Searching
  • Recommender Systems

III. Conclusion

The K-Nearest Neighbors (KNN) algorithm is a simple, supervised machine learning algorithm that can be used to solve classification problems. It’s easy to implement and understand, but has a major drawback of becoming significantly slow as the size of the data grows.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>