In one exercise we were asked to evaluate the difference between a standard logistic regression and a NN with a single hidden layer, a single neuron in that layer, and a sigmoid activation function. I am a beginner with neural nets, so I cannot understand why the logistic regression performs much better (accuracy of 88%) than this NN (40%). I thought this NN was a more complex variant of logistic regression and would therefore perform better.
They try to explain it by saying: “This network architecture doesn’t give the model much ability to capture nonlinearity in the data unfortunately, which is why logistic regression performed much better.”
But it is not really clear to me.
Thank you in advance for your help!
I haven’t done that lesson.
According to this diagram, a neural network that looks like this should produce the same output as logistic regression.
Does the lesson have exactly the same architecture?
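To make the diagram's point concrete, here is a small sketch (the weights `w, b, v, c` are made up for illustration) showing that a single sigmoid hidden neuron feeding a sigmoid output is still a monotone function of `w*x + b`, so thresholding it gives the same linear-cutoff decision boundary that logistic regression produces:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights for a 1-hidden-neuron network:
#   hidden = sigmoid(w * x + b), output = sigmoid(v * hidden + c)
w, b, v, c = 2.0, -1.0, 3.0, -1.5

xs = np.linspace(-5, 5, 11)
outputs = sigmoid(v * sigmoid(w * xs + b) + c)

# The composition of two sigmoids (with positive weights) is monotone
# in x, so the 0.5 threshold corresponds to a single linear cut-off,
# just like logistic regression's decision boundary.
assert np.all(np.diff(outputs) > 0)
```

So in principle the two models can represent the same decision boundaries; any difference in accuracy would have to come from somewhere other than expressive power.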
I guess this is a little bit different because in the description of the exercise we were asked to:
- Train two different models using scikit-learn on the training set:
- A standard logistic regression model
- A neural network with:
- A single hidden layer
- A single neuron in the hidden layer
- The sigmoid activation function
And since the relative code was the following:
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

nn = MLPClassifier(hidden_layer_sizes=(1,), activation='logistic')
nn.fit(train_features, train_labels)  # fit on the training set (variable names assumed)
nn_predictions = nn.predict(test_features)
nn_accuracy = accuracy_score(test_labels, nn_predictions)
I guess that in the exercise, both the hidden layer and the output node used a sigmoid activation function, while in the example you forwarded the hidden layer uses an identity function and only the output uses a sigmoid, right?
I would also like to know the answer to @jessica.lanini's question. Is the lower accuracy of the NN caused by the double use of a sigmoid activation function?
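In case it helps, here is a minimal experiment you could run to probe this yourself. It compares the two models on synthetic data (all choices here, like `make_classification` and `max_iter=2000`, are my own and not from the lesson); with enough iterations for the tiny network to converge, the gap between the two models tends to shrink, which suggests optimization rather than the double sigmoid itself:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic binary classification data, split into train/test sets.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Standard logistic regression.
lr = LogisticRegression().fit(X_train, y_train)
lr_acc = accuracy_score(y_test, lr.predict(X_test))

# One hidden layer with one sigmoid neuron. max_iter is raised because
# the default (200) can stop before this network has converged.
nn = MLPClassifier(hidden_layer_sizes=(1,), activation='logistic',
                   max_iter=2000, random_state=0).fit(X_train, y_train)
nn_acc = accuracy_score(y_test, nn.predict(X_test))

print(f"logistic regression: {lr_acc:.3f}")
print(f"1-neuron sigmoid NN: {nn_acc:.3f}")
```

Varying `max_iter` and `random_state` makes it easy to see how sensitive the tiny network is to its optimization settings, which the 40% result in the lesson might reflect.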