A data scientist is developing a model for the prediction of cancer treatment effectiveness from a number of physiological and clinical factors, including BMI, age, weight, and various histological measurements. Treatment of a subject in the training dataset is recorded as effective if complete remission of the patient was observed, and ineffective otherwise. Justify your answers to each of the following questions:
(i) Is this a regression or a classification problem?
(ii) Can the data scientist use Mean Squared Prediction Error as a criterion to evaluate model performance?
(iii) Is this a supervised or unsupervised learning scenario?
I’m new to ML so don’t rely on this too much:
- classification - you either succeed or fail in cancer treatment so it’s 1 or 0 (though one could argue that in your question data scientist is pursuing the most effective treatment, so pursuing the most effective values is a different story, that’d be regression IMHO, but like I said I’m new to this )
- not sure about classification error measurement
- supervised - we’re training a model for predicting
1 Like