Interpolation question

I’m trying to fill in NaN values using interpolation, since this is time series data. I thought it would make sense to interpolate the missing values in specific columns country by country; however, none of my NaN values got filled in. When I applied interpolation to the whole column instead, all of my NaNs got filled in. My questions are: is the second method correct, and if so, why? What did I do wrong with my first method?
Thank you for your help!

It looks like in the first pic, you applied the interpolation to a single data point. The interpolate function should be applied to a Series, because the other values along the Series are used to determine what the missing values ought to be. When you use loc on a DataFrame with both a row label and a column label specified, you only get back a single value, not a Series.
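To make the difference concrete, here is a minimal sketch. The DataFrame and the column name value are made up for illustration (I can't see your actual data), so treat it as a demonstration of the idea rather than a fix for your exact code.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column name "value" is an assumption.
df = pd.DataFrame({"value": [1.0, np.nan, 3.0]})

# Row label AND column label: loc returns one scalar (here NaN).
# A lone scalar has no neighbours, so there is nothing to interpolate from.
cell = df.loc[1, "value"]

# Column label only: loc returns a Series, which interpolate() can work on.
column = df.loc[:, "value"]

# Linear interpolation (the default) fills the gap from its neighbours.
filled = column.interpolate()
print(type(cell), type(column))
print(filled)
```

The key point is the type you get back: a scalar in the first case, a Series in the second.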

When you instead did:
df5['column'].interpolate(inplace=True)

you were interpolating a Series object, so the missing values in that Series had non-missing neighbours to reference.
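If you still want to interpolate country by country, one common way is to group by the country column and interpolate within each group. This is only a sketch under assumptions: the column names country and value and the numbers are hypothetical, and it assumes the rows are already sorted in time order within each country.

```python
import numpy as np
import pandas as pd

# Hypothetical data; column names and values are made up for illustration.
df = pd.DataFrame({
    "country": ["A", "A", "A", "B", "B", "B"],
    "value": [1.0, 2.0, np.nan, 100.0, np.nan, 300.0],
})

# Whole-column interpolation: the gap at the end of country A is filled
# using country B's first value, so it lands halfway between 2.0 and 100.0.
whole = df["value"].interpolate()

# Per-country interpolation: each group is interpolated as its own Series,
# so values from one country never leak into another.
per_country = df.groupby("country")["value"].transform(
    lambda s: s.interpolate()
)

print(pd.DataFrame({"whole": whole, "per_country": per_country}))
```

Whether the whole-column result is acceptable depends on your data: if the countries are stacked one after another, gaps near the boundaries can pick up values from the neighbouring country, as in the whole column above.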

That’s my take on it from what you posted. You can verify that the filled-in values are reasonable by checking a few of them manually.

Thank you for explaining it so clearly! I checked manually and you were correct :)

A nice question and a beautiful answer.