Generating Random Data

Screen Link: Working With Missing Data | Dataquest

My question is about the code below, which is given as an example. I have tried reading about it, but this version seems complicated. I would like to understand:

  1. [1.0, np.nan] - I understand that this parameter is an array. However, I do not understand the syntax.
  2. p=[0.3, 0.7] - again, I understand that these are probabilities. Are we not biasing the dataframe with these probabilities? Also, could you help me understand the syntax: is 0.3 for the x-dimension and 0.7 for the y-dimension?


What specifically is confusing you about the syntax here?

The dataframe with the randomly generated values is created only to demonstrate the use of isnull() and sum(). There is no real need to discuss bias or why that particular p was selected.
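For illustration, here is a minimal sketch of that idea. The original example is not reproduced in this thread, so the shape, the column names, the seed, and the use of NumPy's Generator.choice are all assumptions on my part rather than the screen's actual code.

```python
import numpy as np
import pandas as pd

# Assumption: a seeded generator, used only so the sketch is reproducible.
rng = np.random.default_rng(0)

# Each cell is drawn independently: 1.0 with probability 0.3, NaN with probability 0.7.
values = rng.choice([1.0, np.nan], size=(6, 3), p=[0.3, 0.7])
df = pd.DataFrame(values, columns=["a", "b", "c"])  # hypothetical column names

# isnull() marks every missing cell as True; sum() then counts the True values per column.
missing_per_column = df.isnull().sum()
print(missing_per_column)
```

The random values only exist to give isnull() and sum() some missing cells to count; any dataframe containing NaN values would serve the same purpose.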

I would suggest checking out the documentation for the function.
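On the syntax itself, assuming the function is NumPy's random choice (the call is not shown in this thread, so that is a guess): the first argument is the list of values to draw from, and p gives one probability per value, in the same order. So 0.3 belongs to 1.0 and 0.7 belongs to np.nan; neither number refers to a dimension. The shape of the result comes from a separate size argument. A rough sketch, with a seed and sample size chosen only for illustration:

```python
import numpy as np

# Assumption: seed and sample size are arbitrary illustrative choices.
rng = np.random.default_rng(42)

options = [1.0, np.nan]      # the values that can be drawn
probabilities = [0.3, 0.7]   # probabilities[i] is the chance of drawing options[i]

# size controls how many values are drawn (and the shape); p does not.
sample = rng.choice(options, size=100_000, p=probabilities)

# The fraction of NaN values should land close to 0.7.
nan_fraction = np.isnan(sample).mean()
print(nan_fraction)
```

The uneven weights do skew the draw toward NaN, but that is presumably deliberate: it guarantees plenty of missing values to work with.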
