What's the difference between my R code and the model answer?

Screen Link:
https://app.dataquest.io/m/323/data-cleaning-with-r/7/class-size-data-creating-a-key-using-string-manipulation

My Code:

class_size<- class_size%>% mutate(DBN=str_c("0",CSD, `SCHOOL CODE`))

What actually happened:
It wasn’t what the system expected. The answer is:

class_size <- class_size %>%
  mutate(DBN = str_c(CSD, `SCHOOL CODE`, sep = "")) %>%
  mutate(DBN = str_pad(DBN, width = 6, side = 'left', pad = "0"))

I just don’t understand why str_pad is needed. My code seemed more convenient than using str_pad. Also, why do we need sep="" in str_c?

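This is because every DBN needs to have the same number of characters (6), and str_pad pads a string up to a given width. Always prepending "0" only gives the right result when CSD has a single digit; if CSD has two digits, the extra zero makes the result seven characters long.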

DataQuest Says…


DBN values all have six digits – there is a zero at the beginning of the number.

So we need to ensure that every DBN value has length 6, and if one is shorter, we add 0 on the left.
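
Here is a small sketch comparing the two approaches. The CSD and school code values are made up for illustration (they are not taken from the course data), and it assumes the stringr package is installed:

```r
library(stringr)

# Hypothetical sample values, not taken from the course data set
csd <- c(1, 9, 10, 32)
school_code <- c("M015", "K200", "X123", "K400")

# Always prepending "0": too long once CSD has two digits
prepended <- str_c("0", csd, school_code)

# Padding to width 6: a zero is added only where one is missing
padded <- str_pad(str_c(csd, school_code), width = 6, side = "left", pad = "0")

print(prepended)  # "01M015"  "09K200"  "010X123" "032K400"
print(padded)     # "01M015"  "09K200"  "10X123"  "32K400"
```

The two versions agree for single-digit CSD values and differ for the two-digit ones, which would explain why the first version is not accepted.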

As for sep: it is the separator inserted between the strings being concatenated (you can refer to the str_c documentation). By default sep="", so it is optional.
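
A quick way to check this yourself, assuming stringr is installed (the input strings and the "-" separator are just illustrative values):

```r
library(stringr)

# Omitting sep and passing sep = "" give the same result
without_sep <- str_c("1", "M015")
with_empty_sep <- str_c("1", "M015", sep = "")

# A non-empty separator is placed between the pieces
with_dash <- str_c("1", "M015", sep = "-")

print(without_sep)     # "1M015"
print(with_empty_sep)  # "1M015"
print(with_dash)       # "1-M015"
```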