Help with Slide #12 in Advanced Data Cleaning in R

I’m trying to work through the “Missing Data” mission for Advanced Data Cleaning in R. The challenge on Slide #12 gives this instruction:

The logical vector should represent whether the values in column in the mvc has a NA value or not. Where the logical vector is true, the value should be replaced with the equivalent value in sup_data.

The answer doesn’t make sense to me:

for (col in location_cols) {
  mvc[is.na(mvc[col]), col] <- sup_data[is.na(mvc[col]), col]
}

I don’t understand the syntax here. I was expecting an if_else. How is this accomplishing what it’s supposed to accomplish? And why is there another “col” after the is.na(mvc[col])?

Thanks.

Hey @AyBee2019. Great questions. This is a challenging exercise and base R provides concise syntax to accomplish the exercise objective. Let me break down what’s going on here.

To your comment about expecting an if_else: yes, you could use the if_else function to reach the same result. The answer just uses base R subsetting instead.
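As a rough sketch of that alternative, here is the same replacement written with base R's ifelse, which takes the condition, the value to use when it is TRUE, and the value to use when it is FALSE. dplyr's if_else takes the same three arguments in the same order but is stricter about types. The small data frames below are made-up stand-ins, not the course data.

```r
# Toy stand-ins for mvc and sup_data (values are invented for illustration)
mvc <- data.frame(
  borough  = c("BRONX", NA, "QUEENS"),
  location = c(NA, "(40.7, -73.9)", NA),
  stringsAsFactors = FALSE
)
sup_data <- data.frame(
  borough  = c("BRONX", "BROOKLYN", "QUEENS"),
  location = c("(40.8, -73.8)", "(40.7, -73.9)", "(40.6, -73.7)"),
  stringsAsFactors = FALSE
)
location_cols <- c("borough", "location")

for (col in location_cols) {
  # Where mvc is NA take the sup_data value, otherwise keep the mvc value
  mvc[[col]] <- ifelse(is.na(mvc[[col]]), sup_data[[col]], mvc[[col]])
}

print(mvc)
```

Note that this version rewrites the whole column on every pass, while the course answer only touches the rows that are missing.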

To your first question, about how this accomplishes what it’s supposed to accomplish: the instructions state that the logical vector should represent whether each value in a column of mvc is NA or not. We can use is.na here because it returns one logical value (TRUE or FALSE) for each value in its argument, mvc[col], so the result lines up row by row with mvc. Used as the row index, it selects only the rows where the value is missing.
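To see what is.na produces on its own, here is a minimal sketch on a made-up one-column data frame (the values are invented, not the course data). Strictly speaking, because mvc[col] is a one-column data frame, is.na returns a one-column logical matrix rather than a bare vector, but it still holds one TRUE or FALSE per row.

```r
# Toy stand-in for mvc with a single missing value
mvc <- data.frame(borough = c("BRONX", NA, "QUEENS"), stringsAsFactors = FALSE)
col <- "borough"

# One TRUE/FALSE per row: TRUE where the value is missing
missing_rows <- is.na(mvc[col])
print(missing_rows)

# TRUE counts as 1, so sum() gives the number of NA values in the column
print(sum(missing_rows))
```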

To your second question, about why there is another col: it is the column currently being iterated over. In this example, it may be useful to think of the assignment operator <- as “gets the value of”. Writing out what’s going on, we can say: for the current column (col), every row of mvc whose value in that column is NA “gets the value of” the same row and column of sup_data.

The col after the comma appears on both sides of the assignment. On the left it names the mvc column to write into; on the right it names the sup_data column to pull the value from when is.na(mvc[col]) is TRUE for that observation. In base R subsetting with [ ], the value to the left of the comma selects the observations (rows) and the value to the right selects the column.
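Putting the pieces together, here is a small sketch of the loop on made-up data, with the row selector stored in a variable so the [rows, column] pattern is easier to see. The name rows_to_fill and the data values are my own for illustration. It assumes, as the exercise does, that mvc and sup_data have the same rows in the same order.

```r
# Toy stand-ins for mvc and sup_data (values are invented for illustration)
mvc <- data.frame(
  borough  = c("BRONX", NA, "QUEENS"),
  location = c(NA, "(40.7, -73.9)", NA),
  stringsAsFactors = FALSE
)
sup_data <- data.frame(
  borough  = c("BRONX", "BROOKLYN", "QUEENS"),
  location = c("(40.8, -73.8)", "(40.7, -73.9)", "(40.6, -73.7)"),
  stringsAsFactors = FALSE
)
location_cols <- c("borough", "location")

for (col in location_cols) {
  # Rows: where mvc is missing a value in the current column
  rows_to_fill <- is.na(mvc[col])
  # Left side:  [rows, column] picks the mvc cells to overwrite
  # Right side: the same [rows, column] picks the sup_data cells to copy
  mvc[rows_to_fill, col] <- sup_data[rows_to_fill, col]
}

print(mvc)
```

Rows that were not missing in mvc are never touched, which is why no if_else is needed.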

I hope this helps. Let me know if you have any other questions!
