Entry 7 of 13
ML Fundamentals Series

K-Nearest Neighbors Has No Training Phase: And That's the Whole Point

Every algorithm I've studied so far learns during training: it adjusts weights, builds trees, or finds hyperplanes. KNN (K-Nearest Neighbors) doesn't. There is no training phase. The entire model is just this: store all the data, and when a new point comes in, find its $K$ nearest neighbors and let them vote.

KNN is called a lazy learner because it defers all computation to prediction time. When you ask it to classify a new point, it measures the distance from that point to every training example, finds the $K$ closest ones, and returns whichever class appears most often among them.
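The procedure above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the names `euclidean` and `knn_predict` and the toy data are my own assumptions, it uses Euclidean distance, and it does a brute-force scan over every stored example.

```python
import math
from collections import Counter

def euclidean(p, q):
    """Straight-line distance between two equal-length points."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest stored examples."""
    # There is nothing to "fit": the stored data is the model.
    # All the work happens here, at prediction time.
    by_distance = sorted(
        zip(train_X, train_y),
        key=lambda pair: euclidean(pair[0], query),
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    # On a tied vote, Counter returns the label it encountered first,
    # i.e. the one belonging to the closer neighbor.
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical toy data: two well-separated clusters.
train_X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.5), (6.5, 8.0)]
train_y = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_X, train_y, (2.0, 2.0), k=3))  # a
print(knn_predict(train_X, train_y, (6.0, 7.0), k=3))  # b
```

Every prediction sorts the whole training set, so the cost grows with the amount of stored data. That is the price of skipping the training phase.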

Distance is usually Euclidean:

$$d(\mathbf{p}, \mathbf{q}) = \sqrt{\sum_i (p_i - q_i)^2}$$

But you can use other metrics depending on the data type (Manhattan distance, cosine distance for text, etc.).
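To make the alternatives concrete, here is a small sketch of two of these metrics. The function names are my own, and `cosine_distance` is defined here as one minus cosine similarity, assuming neither vector is all zeros.

```python
import math

def manhattan(p, q):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(pi - qi) for pi, qi in zip(p, q))

def cosine_distance(p, q):
    """1 - cosine similarity. Assumes neither vector is all zeros."""
    dot = sum(pi * qi for pi, qi in zip(p, q))
    norm_p = math.sqrt(sum(pi * pi for pi in p))
    norm_q = math.sqrt(sum(qi * qi for qi in q))
    return 1.0 - dot / (norm_p * norm_q)

print(manhattan((1, 2), (4, 6)))        # 7
print(cosine_distance((1, 0), (0, 1)))  # 1.0, orthogonal vectors
print(cosine_distance((1, 2), (2, 4)))  # approximately 0, same direction
```

The last line shows why cosine distance suits text: it ignores vector length, so a long document and a short one with the same word proportions count as close.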

The key hyperparameter is $K$. $K = 1$ means the new point just copies the label of its closest neighbor, which overfits badly because a single noisy example can flip the prediction. Large $K$ means you're averaging over many neighbors, which can blur important distinctions (underfitting). The right $K$ is found via cross-validation or the elbow method: plot the validation error rate against $K$ and pick the point where the error stops dropping sharply.
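As a rough sketch of the cross-validation route, the snippet below scores a few candidate values of $K$ with leave-one-out cross-validation: each point is held out in turn and classified from the rest. The helper names, the toy data, and the candidate list `(1, 3, 5, 7)` are all assumptions for illustration, and `math.dist` requires Python 3.8 or later.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k):
    """Majority vote among the k stored points closest to `query`."""
    nearest = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def loo_error(X, y, k):
    """Leave-one-out error rate: fraction of held-out points misclassified."""
    mistakes = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        if knn_predict(rest_X, rest_y, X[i], k) != y[i]:
            mistakes += 1
    return mistakes / len(X)

# Hypothetical toy data: two clusters plus one deliberately noisy label.
X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.3), (1.5, 1.1), (1.1, 1.6),
     (5.0, 5.0), (5.2, 4.8), (4.9, 5.3), (5.5, 5.1), (1.3, 1.2)]
y = ["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"]

errors = {k: loo_error(X, y, k) for k in (1, 3, 5, 7)}
best_k = min(errors, key=errors.get)  # on ties, the smallest candidate wins

for k, err in errors.items():
    print(f"K={k}: error rate {err:.2f}")
print("chosen K:", best_k)
```

Printing or plotting `errors` against $K$ gives the elbow curve described above. On a real dataset you would more likely use k-fold splits, since leave-one-out repeats the full neighbor search once per training point.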