Hi, everybody. I’ve just finished the project and would like to get your feedback. To be honest, I was a bit disappointed with the choice of dataset, as it doesn’t make it clear how important feature management is.
I’ve read the solution provided by the DQ team and several projects from other students, and found just one interesting idea for improving the model’s performance: Adam’s approach to handling outliers. But it seems we can’t drop outliers just to decrease RMSE, because we would lose some important experiments (Adam mentioned this in his conclusion).
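To show what I mean about the trade-off, here is a rough sketch with made-up toy numbers (not the project data, and not Adam’s actual code). It assumes a simple 1.5 × IQR rule for flagging outliers, which is my own choice for illustration; the helper names `rmse` and `iqr_outlier_mask` are hypothetical.

```python
import math
import statistics


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def iqr_outlier_mask(values, k=1.5):
    """Return a list of booleans, True where a value lies outside
    [Q1 - k*IQR, Q3 + k*IQR]. The 1.5 multiplier is an assumption."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


# Toy target values and toy predictions, invented for illustration only.
actual = [10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 10.0, 50.0]
predicted = [11.0, 11.0, 12.0, 12.0, 11.0, 12.0, 11.0, 14.0]

# Flag outliers instead of deleting them, so we can see what we would lose.
mask = iqr_outlier_mask(actual)
kept = [(a, p) for a, p, is_out in zip(actual, predicted, mask) if not is_out]

rmse_all = rmse(actual, predicted)
rmse_kept = rmse([a for a, _ in kept], [p for _, p in kept])
dropped = sum(mask)

print(f"RMSE on all rows:        {rmse_all:.2f}")
print(f"RMSE without outliers:   {rmse_kept:.2f}")
print(f"Rows we would throw away: {dropped}")
```

The RMSE looks much better without the flagged row, but only because the model is no longer being scored on the case it handles worst.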
Another thing I’d like to mention here: I don’t understand why we are asked to create a function for data cleaning. As I understand it, the point of using functions is to avoid repeating the same code several times, but data cleaning is a process we do only once.