Filling Unknown Values with a Placeholder

Screen Link:
https://app.dataquest.io/m/402/working-with-missing-data/10/filling-unknown-values-with-a-placeholder

My Code:

#DQ Solution:

for (x in 1:5 ){
    v_col <- paste('vehicle', x,  sep = "_" )
    c_col <- paste('cause_vehicle', x,  sep = "_" )
    
    # create a logical vector for each column
    v_missing_logical  <-  is.na(mvc[v_col]) & !is.na(mvc[c_col])
    c_missing_logical  <-  !is.na(mvc[v_col]) & is.na(mvc[c_col])
    
    # replace the values matching the logical vector for each column;
    # inside the function, x holds the column's values, which are kept
    # wherever the condition is FALSE
    mvc <- mvc %>%
        mutate_at(c(v_col), function(x) if_else(v_missing_logical, "Unspecified", x))
    
    mvc <- mvc %>%
        mutate_at(c(c_col), function(x) if_else(c_missing_logical, "Unspecified", x))
}


#My Code

for (x in 1:5 ){
   v_col <- paste('vehicle', x,  sep = "_" )
   c_col <- paste('cause_vehicle', x,  sep = "_" )
   v_missing_logical <- is.na(mvc[v_col]) & !is.na(mvc[c_col])
   c_missing_logical <- !is.na(mvc[v_col]) & is.na(mvc[c_col])
    
    mvc <- mvc %>% 
    mutate(c(v_col) = if_else(v_missing_logical,"Unspecified", v_col )) %>%
    mutate(c(c_col) =  if_else(c_missing_logical,"Unspecified", c_col ))

    }

The DQ mission that covered mutate_at() said it is meant to apply one function across multiple columns; the example applied as.numeric() to several columns at once. Why does the solution above use mutate_at() even though it is inside a loop that processes only one column at a time? Is c() used around v_col and c_col in order to treat the strings as vectors?

For my answer, I was able to get the correct output without mutate_at() and without piping, using mvc[v_col] = if_else… and mvc[c_col] = if_else…, but piping seems to be strongly encouraged throughout the DQ missions. My code, also posted above, attempts to use mutate() instead of mutate_at(), but I do not understand why it is not working. Is it possible to complete this problem using mutate()?

This explains what mutate vs mutate_at does (it also covers mutate_if, but that's not what you are asking about here).

From that link

  1. mutate Creates new columns based on existing ones
  2. mutate_at Edits specific columns with a character vector or vars()

So in this case DQ wanted to show how you can apply a function to multiple columns

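While you can use mutate for this problem, the column names here are stored as strings in v_col and c_col, and mutate_at accepts a character vector of names directly, so it is the easier fit. As for c(): it combines its arguments into a vector. A single string is already a character vector of length one, so c(v_col) is the same as v_col and the wrapper is optional here.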
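A quick way to check what c() does to a single string is the short sketch below. It only uses base R; the names is_same, n_wrapped and both_cols are made up for the illustration.

```r
v_col <- paste("vehicle", 1, sep = "_")

# A single string is already a character vector of length 1,
# so wrapping it in c() changes nothing.
is_same   <- identical(c(v_col), v_col)
n_wrapped <- length(c(v_col))

# c() only makes a difference when it combines several names into one vector.
both_cols <- c(v_col, "cause_vehicle_1")

print(is_same)
print(both_cols)
```

So c(v_col) in the solution is harmless, and it becomes useful as soon as you want to pass more than one column name in the same call.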

Also, for the mutate call you can use the across function, but that hasn't been covered here on DQ yet:

 mvc <- mvc %>% 
    mutate(across(all_of(v_col), ~ if_else(v_missing_logical, "Unspecified", .x))) %>%
    mutate(across(all_of(c_col), ~ if_else(c_missing_logical, "Unspecified", .x)))
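To try the across approach without the full dataset, here is a minimal sketch on a made-up table. mvc_toy and its values are hypothetical stand-ins for mvc, only two column pairs are used, and it assumes dplyr 1.0.0 or later (where across is available).

```r
library(dplyr)

# Hypothetical two-pair stand-in for the mvc data frame.
mvc_toy <- tibble(
  vehicle_1       = c("Sedan", NA, NA),
  cause_vehicle_1 = c("Speeding", "Fatigue", NA),
  vehicle_2       = c(NA, "Taxi", "Bus"),
  cause_vehicle_2 = c("Glare", NA, "Unsafe Speed")
)

for (i in 1:2) {
  v_col <- paste("vehicle", i, sep = "_")
  c_col <- paste("cause_vehicle", i, sep = "_")

  # [[ ]] extracts each column as a plain vector, so these are logical vectors.
  v_missing <- is.na(mvc_toy[[v_col]]) & !is.na(mvc_toy[[c_col]])
  c_missing <- !is.na(mvc_toy[[v_col]]) & is.na(mvc_toy[[c_col]])

  # all_of() selects the column whose name is stored in the string;
  # .x is that column's current values.
  mvc_toy <- mvc_toy %>%
    mutate(across(all_of(v_col), ~ if_else(v_missing, "Unspecified", .x))) %>%
    mutate(across(all_of(c_col), ~ if_else(c_missing, "Unspecified", .x)))
}

print(mvc_toy)
```

Rows where both the vehicle and the cause are missing stay NA, because neither condition is TRUE for them.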

Also please look at this for how it can be done in mutate (I found this and the other link by searching Google for "mutate vs mutate_at").
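One way to do it in plain mutate, which may not be the same approach as the linked page, is to unquote the string with !! and assign with := so that mutate treats the contents of v_col as the column name. The sketch below uses a hypothetical one-pair table (mvc_toy) and also shows what goes wrong with a plain =.

```r
library(dplyr)

# Hypothetical one-pair stand-in for the mvc data frame.
mvc_toy <- tibble(
  vehicle_1       = c("Sedan", NA, NA),
  cause_vehicle_1 = c("Speeding", "Fatigue", NA)
)

v_col <- "vehicle_1"
c_col <- "cause_vehicle_1"
v_missing <- is.na(mvc_toy[[v_col]]) & !is.na(mvc_toy[[c_col]])

# With a plain `=`, mutate creates a NEW column literally named "v_col".
literal <- mvc_toy %>%
  mutate(v_col = if_else(v_missing, "Unspecified", .data[[v_col]]))

# With `!!` and `:=`, the string stored in v_col is used as the column name,
# so the existing vehicle_1 column is overwritten.
fixed <- mvc_toy %>%
  mutate(!!v_col := if_else(v_missing, "Unspecified", .data[[v_col]]))

print(names(literal))
print(fixed)
```

Note that mutate(c(v_col) = ...) is not valid R at all, because the left side of = inside a function call has to be a name, not a call to c().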


@manandwa Thank you so much for your detailed response. This helped a lot. Your explanation that you use mutate_at() to edit existing columns and mutate() to create new ones makes sense to me, but I am not sure I fully understand how for loops work in general.

I was thinking that for (x in 1:5) in our code processes each pair of columns (e.g. vehicle_1, cause_vehicle_1) individually and then moves on to the next (vehicle_2, cause_vehicle_2) until it ends with the 5th iteration. So I figured mutate() would work for vehicle_1 and cause_vehicle_1 first, and then it would repeat for 2, 3, 4, 5. Now, from both DQ's code and your explanation, it seems like 1:5 passes through all at once, therefore requiring across() to be used alongside mutate(). Is this the correct way to think about this?


@darrylhua That is what this community is here for, and I am happy to help.

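All the for (x in 1:5) is doing is running the loop body once for each value of x from 1 through 5, so each pass handles one pair of columns (vehicle_1 and cause_vehicle_1 first, then vehicle_2 and cause_vehicle_2, and so on), just as you first thought. across is needed with mutate for a different reason: the column name is stored as a string in v_col, and writing mutate(v_col = ...) would create a column literally named v_col. across lets mutate apply the function to the column with that name, and mutate_at takes the string directly, which is why it is easier to use here.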
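If it helps, you can watch the loop work one step at a time with a small base R sketch like this one. The seen vector is only there for the illustration.

```r
seen <- character(0)

for (x in 1:5) {
  v_col <- paste("vehicle", x, sep = "_")
  c_col <- paste("cause_vehicle", x, sep = "_")

  # Each pass sees exactly one pair of column names.
  cat("iteration", x, ":", v_col, "and", c_col, "\n")
  seen <- c(seen, v_col)
}

# After the loop, v_col still holds the name from the last pass.
print(v_col)
```

Inside any single pass, v_col and c_col each hold just one string, so whatever you put in the loop body is run on one pair of columns and then repeated for the next.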
