Star Wars Survey-Overview: 201-2

Hi all,

I’m working on the Star Wars survey overview. In the step that removes rows where the RespondentID column is NaN, my results looked odd. I checked the dataframe with info() before cleaning and again afterwards to compare, and it looked as if nothing had changed. The RespondentID column showed the same 1186 non-null values before and after cleaning.
However, according to the introduction, the survey received a total of 835 responses.
I was using the code below:

> #checking the dataframe info
> print(star_wars.info())
> 
> #removing any rows where RespondentID is NaN
> star_wars = star_wars[star_wars['RespondentID'].notnull()]
> print('\n')
> #checking the dataframe after removal
> print(star_wars.info())

This is my notebook file:
Star_wars_survey.ipynb (42.8 KB)

Could anyone explain what is going on here?
Thank you in advance for your time.

Best regards,


> The RespondentID column showed the same 1186 non-null values before and after cleaning.

Before you removed the Null values you had 1187 rows -

RangeIndex: 1187 entries, 0 to 1186

After you remove the Null values you have 1186 rows -

Int64Index: 1186 entries, 1 to 1186

There is only 1 null value in that column, so only 1 row was removed.

If you print out

star_wars["RespondentID"].isnull().sum()

before you remove the Null values, you will see that there is only 1 null value.
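To make the point concrete, here is a minimal sketch using a small made-up dataframe rather than the real survey file (the `seen_any` column and the ID values are placeholders I invented for illustration). The non-null count that info() reports for RespondentID is the same before and after filtering, because the dropped row was never counted as non-null in the first place. What changes is the total number of rows.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the survey data (hypothetical values):
# the first row has no RespondentID, like the extra row in the real file.
star_wars = pd.DataFrame({
    "RespondentID": [np.nan, 3292879998.0, 3292879538.0, 3292765271.0],
    "seen_any": ["Response", "Yes", "No", "Yes"],
})

# Counts before cleaning
rows_before = len(star_wars)
null_ids = star_wars["RespondentID"].isnull().sum()
non_null_before = star_wars["RespondentID"].notnull().sum()

# Keep only rows that have a RespondentID
star_wars = star_wars[star_wars["RespondentID"].notnull()]

# Counts after cleaning
rows_after = len(star_wars)
non_null_after = star_wars["RespondentID"].notnull().sum()

print("rows before:", rows_before, "rows after:", rows_after)
print("null IDs removed:", null_ids)
print("non-null IDs before:", non_null_before, "after:", non_null_after)
```

Comparing `len(star_wars)` (or the "entries" line at the top of info()) before and after is a more direct way to confirm that the filter worked than comparing the non-null counts.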

Hi @the_doctor,

Thank you for your response. I didn’t realize that I should check the null count the way you showed.
Do you have any idea why the number of entries (1187) is larger than the total number of responses (835) the survey is said to have received?

Best regards,