Building A Handwritten Digits Classifier - Why does 1 neighbour give the best results?

Hello,

Could someone please explain why the KNN algorithm with ONLY ONE neighbour gives the best results of all? It does better than KNN with more neighbours and better than ANY neural network model I tried.

Could it be that the digits within each class are so similar to each other that taking the single NEAREST neighbour will always give us the best prediction?

Thank you
Regards
Daniel

The 2nd article lets you adjust k on a slider to see how predictions change.


http://scott.fortmann-roe.com/docs/BiasVariance.html
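
If you would rather poke at the effect of k in code than on a slider, here is a minimal sketch. It is not the project's digits data: the points are a made-up toy set with two tight, well-separated clusters (one small, one larger), and the helper names `knn_predict` and `leave_one_out_accuracy` are just ones I picked for illustration. It only shows how the vote changes with k; it says nothing about the neural network results.

```python
from collections import Counter


def knn_predict(train, query, k):
    """Return the majority label among the k training points closest to query."""
    def squared_distance(point):
        return sum((a - b) ** 2 for a, b in zip(point, query))

    # Sort training items by distance to the query and keep the k closest.
    nearest = sorted(train, key=lambda item: squared_distance(item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(data, k):
    """Hold out each point in turn, predict it from the rest, return the hit rate."""
    correct = 0
    for i, (point, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if knn_predict(rest, point, k) == label:
            correct += 1
    return correct / len(data)


# Made-up toy data: class "a" is a small tight cluster, class "b" a larger one.
data = [
    ((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((1.0, 0.0), "a"),
    ((10.0, 10.0), "b"), ((10.0, 11.0), "b"), ((11.0, 10.0), "b"),
    ((11.0, 11.0), "b"), ((10.5, 10.5), "b"), ((12.0, 10.0), "b"),
]

for k in (1, 3, 5):
    print(f"k={k}: leave-one-out accuracy = {leave_one_out_accuracy(data, k):.2f}")
```

On this toy set, k=1 and k=3 get every held-out point right, because each point's closest neighbours sit in its own cluster. At k=5 the three "a" points are outvoted by "b" points that are far away, so they are all misclassified. That is the same tendency the article describes: when the classes form tight groups, a small k can be hard to beat, and a larger k starts pulling in votes from points that are not really similar.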