I’m working on the guided project “Exploring Ebay Car Sales Data” in the introductory pandas and NumPy course.
I’m at the step of removing outlier data from the columns “Price” and “Odometer”.
When “removing” the data, should I actually drop those entire rows from the dataframe, or should I only convert the values in the relevant columns to NaN? I’m afraid that dropping thousands of rows because of a single faulty data point each might skew other aspects of the analysis.
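To make the two options concrete, here is a rough sketch of what I mean. The data is made up, and the dataframe name `autos`, the column names `price` and `odometer_km`, and the price bounds are all just assumptions for illustration, not values from the project instructions.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the project data; names and values are assumptions.
autos = pd.DataFrame({
    "price": [0, 1500, 8000, 99999999, 3200],
    "odometer_km": [150000, 125000, 5000, 150000, 90000],
})

# Hypothetical "plausible" price range, chosen only for illustration.
# Series.between is inclusive on both ends by default.
price_ok = autos["price"].between(1, 350000)

# Option 1: drop the whole row when the price is implausible.
dropped = autos[price_ok]

# Option 2: keep every row, but replace only the bad price with NaN.
# The odometer values in those rows stay available for other analysis.
masked = autos.copy()
masked["price"] = masked["price"].where(price_ok, np.nan)

print(len(dropped), len(masked))     # row counts under each option
print(masked["price"].isna().sum())  # number of prices turned into NaN
```

With option 2 the row count is unchanged and the `price` column becomes a float column, since NaN cannot be stored in an integer column.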
I feel like the instructions aren’t totally clear at this point (Screen #4 of the mission). Any thoughts?