3: K-Nearest Neighbors (KNN)


3.1: K-Nearest Neighbors

Assume we are given a dataset where \(X\) is a matrix of features for the observations and \(Y\) is the corresponding class label. We will use this notation throughout this article. k-nearest neighbors is a method of classification that estimates the conditional distribution of \(Y\) given \(X\) and classifies an observation to the class with the highest estimated probability. Given a positive integer \(k\), k-nearest neighbors looks at the \(k\) observations closest to a test observation \(x_0\) and estimates the conditional probability that it belongs to class \(j\) using the formula

\[\Pr(Y = j \mid X = x_0) = \frac{1}{k}\sum_{i \in N_0} I(y_i = j)\]

where \(N_0\) is the set of the \(k\) nearest observations and \(I(y_i = j)\) is an indicator variable that equals 1 if a given observation \((x_i, y_i)\) in \(N_0\) is a member of class \(j\), and 0 otherwise. After estimating these probabilities, k-nearest neighbors assigns the observation \(x_0\) to the class for which this probability is greatest. The following plot illustrates how the algorithm works:

(Figure knn1.png: training observations from Class A and Class B, with a test observation marked by a red star.)

  • If we choose \(K=3\), then we have two observations in Class B and one observation in Class A, so we classify the red star to Class B.
  • If we choose \(K=6\), then we have two observations in Class B but four observations in Class A, so we classify the red star to Class A.
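A minimal sketch of this classification rule in Python with NumPy is shown below. The training data, the labels, the test point, and the function name knn_classify are hypothetical; they only illustrate the mechanics of estimating the class proportions among the \(k\) nearest neighbors.

import numpy as np
from collections import Counter

def knn_classify(X, y, x0, k):
    """Estimate Pr(Y = j | X = x0) from the k nearest training points
    and assign x0 to the class with the highest estimated probability."""
    # Euclidean distance from x0 to every training observation
    distances = np.sqrt(((X - x0) ** 2).sum(axis=1))
    # Indices of the k closest observations (the set N_0)
    nearest = np.argsort(distances)[:k]
    # Proportion of the k neighbors that fall in each class
    counts = Counter(y[i] for i in nearest)
    probs = {label: count / k for label, count in counts.items()}
    return max(probs, key=probs.get), probs

# Hypothetical two-class training data and labels
X = np.array([[1.0, 2.0], [1.5, 1.8], [2.0, 2.2],   # Class A
              [5.0, 5.5], [5.2, 4.8], [6.0, 5.0]])  # Class B
y = np.array(["A", "A", "A", "B", "B", "B"])

label, probs = knn_classify(X, y, x0=np.array([2.5, 2.5]), k=3)
print(label, probs)   # the three nearest points here are all in Class A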

3.2: The Calculation of Distance

Since the k-NN algorithm needs the \(k\) nearest points, the first step is to calculate the distance between the input data point and every point in the training data. Suppose \(x\) is a point with coordinates \((x_1, x_2, \ldots, x_p)\) and \(y\) is a point with coordinates \((y_1, y_2, \ldots, y_p)\); then the distance between these two points is
\[d(x, y) = \sqrt{\sum_{i=1}^{p} (x_i - y_i)^2}\]
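As a quick numerical check, this Euclidean distance can be computed directly with NumPy; the two points below are made up for illustration.

import numpy as np

x = np.array([1.0, 3.0, 5.0])
y = np.array([2.0, 1.0, 4.0])

# d(x, y) = square root of the sum of squared coordinate-wise differences
d = np.sqrt(np.sum((x - y) ** 2))
print(d)  # sqrt(1 + 4 + 1) = sqrt(6) ≈ 2.449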

3.3: The Effect of K

As with most statistical learning models, if \(K\) is small, then we use a smaller region of the data to learn, which may cause over-fitting. If \(K\) is large, then we use a larger region of the data to learn, which may cause under-fitting. The following plot shows the effect of \(K\): as \(K\) gets larger, the decision boundary becomes smoother and appears more linear.

(Figure knn2.png: KNN decision boundaries for different values of K.)
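One way to see this effect numerically is to fit the classifier for several values of K and compare training and test accuracy. The sketch below assumes scikit-learn and a synthetic dataset; the particular values of K and the data are illustrative, not from the text.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical two-feature, two-class dataset
X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for k in (1, 5, 25, 75):
    model = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    print(k, model.score(X_train, y_train), model.score(X_test, y_test))

# Small K tends to give near-perfect training accuracy (over-fitting),
# while very large K smooths the decision boundary and can under-fit.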


