Jupyter Notebook Question


My Code:
combined_updated['dissatisfied'] = combined_updated["dissatisfied"].fillna(False)
combined_updated.head(45)
combined_updated['dissatisfied'].value_counts()


What actually happened:
I am trying to replace the missing values in the dissatisfied column with the most frequent value in that column, which in this case is False. However, all of the values in the dissatisfied column ended up being replaced with False.


Also, with combined_updated['dissatisfied'].value_counts() I got this:

I am not sure why all of the values ended up changing to False. Do I need to change up stuff with the fillna() method or is it possible that I may have made some mistakes in the previous steps?
Thank you
-Salem

Hello, Salem.

Could be. Can you share the output for the dissatisfied column before you fillna? A value_counts would be useful too.
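For reference, a minimal sketch of that check. The Series below is made-up stand-in data, not the project's column; passing dropna=False makes value_counts report the missing values as their own row, so you can see what fillna will actually touch.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for combined_updated['dissatisfied']
dissatisfied = pd.Series([True, False, np.nan, False, np.nan, False], dtype=object)

# Count every distinct value, including NaN
counts = dissatisfied.value_counts(dropna=False)
print(counts)

# Number of missing entries before any filling
n_missing = dissatisfied.isna().sum()
print(n_missing)
```

Running the same two calls before and after fillna shows whether only the missing entries changed.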


Hi @salemabdulkerim

If you'll allow me, let's see whether the following helps:

  • First, to confirm what you suspect, I would call .unique() on the column.

  • Second, if all the values really have become False, I would read the fillna documentation:

Further down there is a parameter called method, which lets you choose how the filling is done:

**method** {‘backfill’, ‘bfill’, ‘pad’, ‘ffill’, None}, default None

Method to use for filling holes in reindexed Series.
pad / ffill: propagate last valid observation forward to next valid.
backfill / bfill: use next valid observation to fill gap.

Further down the page, the examples show the effect of each option.
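As a small illustration of those options, here is a sketch on invented data. Recent pandas versions deprecate passing method= to fillna in favor of the dedicated ffill() and bfill() methods, so the sketch uses those.

```python
import numpy as np
import pandas as pd

# Made-up Series with a two-element gap in the middle
s = pd.Series([1.0, np.nan, np.nan, 4.0])

forward = s.ffill()       # carry the last valid value forward
backward = s.bfill()      # pull the next valid value backward
constant = s.fillna(0.0)  # fill every gap with one fixed value

print(forward.tolist())   # [1.0, 1.0, 1.0, 4.0]
print(backward.tolist())  # [1.0, 4.0, 4.0, 4.0]
print(constant.tolist())  # [1.0, 0.0, 0.0, 4.0]
```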

If it helps, here is the strategy I followed, as practice, to avoid what happened to you:


my github: cell 128

How many NaN and None values did we have?

[129] combined_updated['age'].isna().sum()

87

Final step to fill in the elements:

  • We know how many times to iterate: 87.

  • On each iteration we draw a random value from the list set_age_for_randomize.

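[130] import numpy as np
# One random draw per missing age, so the gaps do not all receive the same value.
missing_idx = combined_updated.index[combined_updated['age'].isna()]
for i in missing_idx:
    m = np.random.choice(set_age_for_randomize)
    combined_updated.loc[i, 'age'] = m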

[131] combined_updated['age'] = combined_updated['age'].astype(float)

[132] combined_updated['age'].head(5)

0    38.0
1    43.0
2    33.0
3    48.0
4    33.0
Name: age, dtype: float64

[133] combined_updated['age'].isna().sum()
0
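The same idea can probably be written without a loop. This is only a sketch on made-up data: the ages Series and the candidates list are hypothetical stand-ins for combined_updated['age'] and set_age_for_randomize, and the seeded generator is there just to make the run repeatable.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the real column and the list of candidate ages
ages = pd.Series([38.0, np.nan, 33.0, np.nan, 48.0])
candidates = [23.0, 33.0, 43.0]

# Boolean mask of the missing positions
mask = ages.isna()

# Draw one value per missing position and assign them all at once
ages_filled = ages.copy()
ages_filled[mask] = rng.choice(candidates, size=mask.sum())

print(ages_filled.isna().sum())
```

Because size equals the number of gaps, each missing entry gets its own draw instead of one value being spread across all of them.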


You can do the same with Boolean values (it is even easier), choosing the parameters so that the gaps are filled the way you intend rather than having a value propagated into them.
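For the Boolean case, a minimal sketch of filling gaps with the most frequent value, again on invented data rather than the project's column. It assumes the column holds only True, False, and NaN.

```python
import numpy as np
import pandas as pd

# Made-up Boolean column with two gaps (object dtype because of the NaN)
col = pd.Series([False, True, np.nan, False, np.nan], dtype=object)

# mode() ignores NaN by default; take the first (most frequent) value
most_common = col.mode()[0]

# Fill only the missing entries, then make the dtype explicitly bool
filled = col.fillna(most_common).astype(bool)

print(filled.tolist())  # [False, True, False, False, False]
```

The existing True at position 1 is left alone; only the two NaN entries change.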

In sharing my example I don't claim to be anything more than someone who ran into the same problem as you and realized there was a lot to work through in the documentation.

Regards, I hope I have been able to help you.

A&E.


Yep. Here it is:


-Salem

I think I see where you are coming from. I thought you could use the fillna method in just one line without having to use a for loop. Also, I see that the DataFrame.fillna() method only fills values that are NaN. I think the challenge is to replace the - with False.
-Salem
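A rough sketch of that two-step idea, under the assumption that '-' marks "not selected" and any other non-missing text marks a selected reason. The sample strings and the to_flag helper are hypothetical, not taken from the project.

```python
import numpy as np
import pandas as pd

# Made-up raw column: '-' placeholder, a text answer, and a true missing value
raw = pd.Series(['-', 'Dissatisfaction with the job', np.nan, '-'])

def to_flag(value):
    """Map '-' to False, keep NaN as NaN, treat anything else as True."""
    if pd.isna(value):
        return np.nan
    if value == '-':
        return False
    return True

flags = raw.map(to_flag)

# fillna only touches the remaining NaN; the '-' entries are already False
filled = flags.fillna(False).astype(bool)

print(filled.tolist())  # [False, True, False, False]
```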


I think I figured it out. Here is a screenshot of what I did:



Now I have no missing values for the dissatisfied column.


Nice, I’m glad that you were able to solve it.

Yep, you’re correct. This part of the project did mention replacing the :

Most of the time you only need one line, unless you want to fillna for multiple columns with different values for each column. With that said, you can also pass a dictionary to value and that should also allow you to update multiple columns with different values for each.
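As a quick sketch of the dictionary form, on a small made-up frame (the column names mirror the thread, the values are invented): each key is a column name and each value is the fill for that column.

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame with one gap in each column
df = pd.DataFrame({
    'dissatisfied': [True, np.nan, False],
    'age': [38.0, np.nan, 43.0],
})

# One call, a different fill value per column
filled = df.fillna(value={'dissatisfied': False, 'age': df['age'].median()})

print(filled)
```

Columns not named in the dictionary are left untouched.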

The examples section in the documentation below can be useful:

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.fillna.html

Cheers.
