I’m trying to fill in NaN values using interpolation, since this is time series data. I thought it would make sense to interpolate the missing values in specific columns country by country; however, none of my NaN values got filled in. Whereas when I just applied interpolation to the whole column, all my NaNs got filled in. My questions are: is the second method correct? And how so? What did I do wrong with my first method?
It looks like, in the first pic, you applied the interpolation to a single data point. The interpolate function should be applied to a Series, because the other values along the series are used to determine what the missing values ought to be. When you use loc on a DataFrame with both a single row label and a single column label, you only get back a single value, so there is nothing to interpolate from.

When you instead did:
df5['column'].interpolate(inplace = True),

you were interpolating a Series object, so the missing values in that series had neighboring non-missing values to interpolate from.
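
To make the difference concrete, here is a minimal sketch. The DataFrame and its column names ("country", "value") are made up for illustration and are not taken from your data. It shows that loc with one row label and one column label hands back a lone scalar, while selecting the column gives a Series that interpolate can work with. I assign the result back rather than using inplace=True on a column selection, since that is the safer pattern.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not the original poster's DataFrame
df = pd.DataFrame({
    "country": ["A", "A", "A"],
    "value": [1.0, np.nan, 3.0],
})

# One row label + one column label -> a single scalar (here NaN),
# with no neighbors to interpolate from
cell = df.loc[1, "value"]
is_scalar = np.isscalar(cell)

# Selecting the whole column -> a Series, so the neighbors 1.0 and 3.0
# are available to fill the gap linearly
filled = df["value"].interpolate()

# Assign the result back instead of relying on inplace=True
df["value"] = filled
```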

That’s my take on it from what you posted. You could verify that the filled-in values are sensible by checking a few of them manually.
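
On the per-country idea: interpolating the whole column at once can fill a gap using values from two different countries when their rows sit next to each other. If that matters for your data, one possible approach is to group by country first. This is only a sketch under assumptions: the column names ("country", "year", "value") and the numbers are hypothetical, and it assumes the rows are sorted by time within each country.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: country B starts with a missing value
df = pd.DataFrame({
    "country": ["A", "A", "A", "B", "B", "B"],
    "year": [2000, 2001, 2002, 2000, 2001, 2002],
    "value": [1.0, np.nan, 3.0, np.nan, 20.0, 30.0],
})
df = df.sort_values(["country", "year"]).reset_index(drop=True)

# Whole-column interpolation: B's first gap is filled between A's last
# value (3.0) and B's next value (20.0), mixing the two countries
whole = df["value"].interpolate()

# Per-country interpolation: each country's values are interpolated
# separately, so B's leading gap has no earlier B value and stays NaN
per_country = df.groupby("country")["value"].transform(
    lambda s: s.interpolate()
)
```

Whether a leading gap like the one in country B should stay missing or be filled some other way depends on your data, so it is worth checking a few rows by hand either way.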

Thank you for explaining it so clearly! I did manual checking and you were correct :slight_smile:

