So... I was trying to build a neural network model from scratch

Hi guys,

In this section, we are building a digit classifier. I have taken the famous ML course from Andrew Ng on Coursera, so I thought, why not build a neural network model for classifying digits from scratch?

It took me days to build the feedforward and backpropagation functions that compute the cost and gradient. Just when I thought everything was done, I plugged my functions into scipy.optimize.fmin_cg, and it didn’t work as I expected. I think my cost function is fine, because when I don’t pass the cost gradient function as the fprime parameter of fmin_cg(), in which case the gradient is approximated numerically, it does work. I set maxiter = 100 in this notebook because otherwise it takes too long in the DQ environment. As you can see, the accuracy is pretty low.

If I pass the cost gradient function as fprime, it only iterates once and the cost function is not optimized. I would love to start a discussion with anyone who has used the scipy.optimize library.
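One way to narrow this kind of problem down is to compare the analytic gradient against a numerical one before handing it to the optimizer. Below is a minimal sketch, not the notebook's code: `cost` and `cost_grad` are hypothetical stand-ins (a small least-squares model) for the network's cost and gradient functions. It uses `scipy.optimize.check_grad` to measure the mismatch and then passes the gradient to `fmin_cg` through `fprime`. One common cause of an early stop, though not necessarily the one here, is a gradient that is wrong or that is not a 1-D array of the same length as the parameter vector.

```python
import numpy as np
from scipy.optimize import check_grad, fmin_cg


def cost(theta, X, y):
    """Hypothetical stand-in cost: half the mean squared error of a linear model."""
    residual = X @ theta - y
    return 0.5 * np.mean(residual ** 2)


def cost_grad(theta, X, y):
    """Analytic gradient of `cost`, returned as a 1-D array shaped like theta."""
    residual = X @ theta - y
    return (X.T @ residual) / len(y)


# Synthetic data, made up for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_theta = np.array([1.0, -2.0, 0.5])
y = X @ true_theta

theta0 = np.zeros(3)

# Norm of the difference between the analytic and finite-difference gradients.
# A value that is not tiny suggests a bug in the gradient function.
grad_error = check_grad(cost, cost_grad, theta0, X, y)
print("gradient mismatch:", grad_error)

# With a consistent gradient, fmin_cg should run for several iterations.
theta_opt = fmin_cg(cost, theta0, fprime=cost_grad, args=(X, y),
                    maxiter=100, disp=False)
print("optimized parameters:", theta_opt)
```

If the mismatch is large for the real network, the backpropagation code is the first place to look; if it is small, the shape and dtype of the returned gradient are worth checking next.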

This project is not successful, but I still want to share it, and maybe get some feedback on how to fix it. That aside, I’ve learned a lot of quirky stuff about numpy with this project and gained a much better understanding of neural networks.

It’s unfinished business, but for now, I just need to take a long break. :slightly_smiling_face:

Building a neural network model from scratch.ipynb (73.1 KB)

Hi @veratsien

I am currently taking a look. There is a problem with your LaTeX formatting in the “Randomly initialize the parameters for symmetry breaking” section.

I am not familiar with the scipy.optimize library, so I don’t think I can help much. But maybe you are already close to the solution when you say that fmin_cg stops working once you pass the cost gradient function as fprime? Since it should iterate more than once and you have identified where the problem is, I am sure you will find some help on StackOverflow or elsewhere from people who faced a similar issue.

I found this about scipy.optimize.minimize which seems related to the topic: How to return cost, grad as tuple for scipy’s fmin_cg function
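As far as I understand it, `scipy.optimize.minimize` accepts `jac=True`, which tells it that the objective returns a `(cost, gradient)` tuple, so both values can come from a single function. Here is a rough sketch under that assumption. The function `cost_and_grad` is a hypothetical example (regularized logistic regression on made-up data), not the notebook's network.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit  # numerically stable sigmoid


def cost_and_grad(theta, X, y, lam):
    """Return (cost, gradient) for L2-regularized logistic regression."""
    m = len(y)
    z = X @ theta
    # log(1 + exp(z)) - y * z is the cross-entropy loss in a stable form.
    cost = np.mean(np.logaddexp(0.0, z) - y * z) + 0.5 * lam * theta @ theta
    grad = X.T @ (expit(z) - y) / m + lam * theta
    return cost, grad


# Synthetic two-feature data with a bias column, for illustration only.
rng = np.random.default_rng(1)
features = rng.normal(size=(100, 2))
X = np.hstack([np.ones((100, 1)), features])
noise = 0.5 * rng.normal(size=100)
y = (features[:, 0] + features[:, 1] + noise > 0).astype(float)

theta0 = np.zeros(3)
initial_cost, _ = cost_and_grad(theta0, X, y, 0.1)

# jac=True: the objective itself supplies the gradient as its second return value.
result = minimize(cost_and_grad, theta0, args=(X, y, 0.1),
                  method="CG", jac=True, options={"maxiter": 100})

print("initial cost:", initial_cost)
print("final cost:", result.fun)
print("iterations:", result.nit)
```

The same pattern should carry over to a network whose weights are flattened into one 1-D vector, as long as the returned gradient is flattened in the same order.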

Best
W.

@WilfriedF Thank you so much for the StackOverflow link! I believe it could help. I’m gonna give it a try and will let you know if it works. I think the problem lies in how my cost and gradient functions are layered and in how fmin_cg calls them under the hood. I was tempted to recreate the cost and gradient functions separately, using parts of the feedforward() and backpropagation() functions.
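A minimal sketch of what that split could look like, assuming a single hidden layer, sigmoid activations, a cross-entropy cost without regularization, and weights flattened into one 1-D vector. All names here (`unpack`, `forward`, `nn_cost`, `nn_grad`) are hypothetical and not the ones in the notebook. The cost and the gradient each take the flat vector and share the forward pass, and `check_grad` confirms that they agree.

```python
import numpy as np
from scipy.optimize import check_grad


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def unpack(theta, n_in, n_hidden, n_out):
    """Reshape the flat vector into two weight matrices (bias column first)."""
    split = n_hidden * (n_in + 1)
    W1 = theta[:split].reshape(n_hidden, n_in + 1)
    W2 = theta[split:].reshape(n_out, n_hidden + 1)
    return W1, W2


def forward(W1, W2, X):
    """Forward pass; returns the activations of each layer."""
    m = X.shape[0]
    a1 = np.hstack([np.ones((m, 1)), X])
    a2 = np.hstack([np.ones((m, 1)), sigmoid(a1 @ W1.T)])
    a3 = sigmoid(a2 @ W2.T)
    return a1, a2, a3


def nn_cost(theta, X, Y, shape):
    """Cross-entropy cost as a scalar."""
    W1, W2 = unpack(theta, *shape)
    _, _, a3 = forward(W1, W2, X)
    return -np.sum(Y * np.log(a3) + (1 - Y) * np.log(1 - a3)) / X.shape[0]


def nn_grad(theta, X, Y, shape):
    """Backpropagated gradient, flattened in the same order as theta."""
    W1, W2 = unpack(theta, *shape)
    a1, a2, a3 = forward(W1, W2, X)
    m = X.shape[0]
    d3 = a3 - Y                                          # output-layer error
    d2 = (d3 @ W2[:, 1:]) * a2[:, 1:] * (1 - a2[:, 1:])  # hidden-layer error
    G1 = d2.T @ a1 / m
    G2 = d3.T @ a2 / m
    return np.concatenate([G1.ravel(), G2.ravel()])


# Tiny made-up problem: 4 inputs, 3 hidden units, 2 classes, 10 examples.
shape = (4, 3, 2)
rng = np.random.default_rng(2)
X = rng.normal(size=(10, 4))
Y = np.eye(2)[rng.integers(0, 2, size=10)]               # one-hot labels
n_params = 3 * (4 + 1) + 2 * (3 + 1)
theta0 = rng.uniform(-0.12, 0.12, size=n_params)         # small random init

mismatch = check_grad(nn_cost, nn_grad, theta0, X, Y, shape)
print("gradient mismatch:", mismatch)
```

With this layout, `nn_cost` and `nn_grad` can be passed to fmin_cg as `f` and `fprime` with `args=(X, Y, shape)`.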

And thanks for spotting the LaTeX formatting. I will fix that.

Again, thank you so much for taking a look at this project. :star_struck:
