244-5 Guided Project: Building A Handwritten Digits Classifier (Not seeing the "vast improvement")

I just wanted to throw this out here, in case I’m missing something.

I’m working the “Guided project” Here

It suggests doing a “K-Nearest Neighbors” classifier as a kind of baseline, then a single-layer neural network, and eventually multi-layer neural networks. What’s throwing me off is this: after doing the KNN and the one-layer neural network, the author states…

Adding more neurons to a single hidden layer vastly improved the classification accuracy.

Well, it didn’t actually :smiley: In fact, the neural networks were only slightly more accurate than the KNN :smiley: And while adding more neurons to the layer DID improve the accuracy score, the improvement wasn’t “dramatic”, and it also (almost immediately) overfit the model. The BEST model (after the KNN) was actually the neural net with only 8 nodes in the single layer.
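For reference, here’s a minimal sketch of the comparison I’m describing. It uses scikit-learn’s bundled digits dataset rather than the course’s data files, and the specific neuron counts and hyperparameters are just illustrative, not the exact ones from my notebook:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# KNN as the baseline classifier
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
print(f"KNN test accuracy: {knn.score(X_test, y_test):.3f}")

# Single hidden layer, increasing the number of neurons
for n_neurons in (8, 16, 64, 128):
    mlp = MLPClassifier(hidden_layer_sizes=(n_neurons,),
                        max_iter=1000, random_state=0)
    mlp.fit(X_train, y_train)
    print(f"{n_neurons:>3} neurons: "
          f"train {mlp.score(X_train, y_train):.3f}, "
          f"test {mlp.score(X_test, y_test):.3f}")
```

Printing train and test accuracy side by side is what surfaced the gap I’m calling overfitting.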

So, I’m just curious if I’ve done something wrong.

I’ve attached the notebook…
GuidedProject_BuildingAHandwrittenDigitsClassifier.ipynb (1.5 MB)

Hey, Mike. Very well-put question; this makes it so much easier to answer.

In that sentence, the author means to compare neural networks with fewer nodes to neural networks with more nodes.

I disagree. Let’s take a look at your notes:

The accuracy on the test set improved by roughly 6.5% with both activation functions. This is huge at this level of accuracy. Anecdotally, we can look at a Kaggle competition and see just how close the first twenty submissions are:

Source: https://www.kaggle.com/c/zillow-prize-1/leaderboard

This is a legitimate concern. There is evidence of overfitting, but there is also evidence that overfitting isn’t a serious issue. I modified one of the cells in your notebook to include the mean of the cross validated scores and reran the whole thing:
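The modification was along these lines: a sketch of taking the mean of the cross-validated scores, assuming a scikit-learn-style notebook (the model, fold count, and dataset here are placeholders, not the exact cell contents):

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
model = MLPClassifier(hidden_layer_sizes=(64,), max_iter=1000, random_state=0)

# Accuracy on each fold, plus the mean across folds
scores = cross_val_score(model, X, y, cv=4)
print("fold accuracies:", np.round(scores, 3))
print("mean accuracy:  ", scores.mean().round(3))
```

The mean over folds is a more stable estimate than any single train/test split, which is why it matters that the improvement survives it.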

We still see the accuracy vastly improving, and the fact that this held up under cross-validation supports the idea that overfitting isn’t a serious issue.

So there’s contradictory information. It’s very common for there to be some subjectivity around these things. Data science is not mathematics. In my opinion you raised some legitimate concerns, and I’m leaning towards overfitting not being a problem here.



Thanks for the perspective. Good to know the scale for accuracy.

So, I had been using the difference between the test and train accuracies to judge the fit quality of the model. That is, the closer the test accuracy is to the train accuracy, the better the fit.

AND I was calling anything over 99% accuracy on the training set de facto “overfit”.
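In code form, the two heuristics I’ve been applying look roughly like this (the 0.99 cap and the gap cutoff are just the illustrative thresholds I was using, not values from the course):

```python
def fit_quality(train_acc, test_acc, gap_cutoff=0.02, train_cap=0.99):
    """Flag a model as possibly overfit if the train/test accuracy gap
    is large, or if train accuracy exceeds an arbitrary cap."""
    gap = train_acc - test_acc
    if train_acc > train_cap or gap > gap_cutoff:
        return "possibly overfit"
    return "reasonable fit"

print(fit_quality(0.995, 0.93))  # large gap AND >99% train accuracy
print(fit_quality(0.96, 0.95))   # small gap, train accuracy under the cap
```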

Is either of these a good method?

What you did is a good method, but it’s not the only one. Overfitting doesn’t have a crisp definition; it’s like trying to pinpoint where, in the image below, the color is definitely green and definitely not blue.

Source: https://en.wikipedia.org/wiki/Color_gradient

And I do think there’s some amount of overfitting happening, as is evidenced by what you noted. I wasn’t clear at all in what I meant.

What I should have said is that the presence of overfitting doesn’t mean the improvement isn’t massive. The improvement is still massive, as we can see from the change in accuracy on the test sets, even if the train sets perform much better. Thank you for questioning further; this made it clearer for me as well.

Also, despite the model overfitting, it’s still pretty good, I think (this will always depend on how much error you can live with).