Data Analyst in R - Select vs Group_by: What's the difference?

Hi, everyone!

Could anyone explain to me the difference between the select() function and the group_by() function in R? They seem to do the same thing. Not sure when I should use one or the other.

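select() subsets the data frame so that it only includes certain columns. group_by(), on the other hand, doesn't remove or rearrange anything by itself: it marks the rows as belonging to groups based on the values in the given column(s), so that later steps such as summarise() are computed once per group.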
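
Here's a rough sketch of that difference, assuming dplyr is installed. The scores data frame and its column names are made up purely for illustration.

```r
library(dplyr)

# Hypothetical toy data, invented for this example
scores <- tibble(
  student = c("Ana", "Ben", "Cal", "Dee"),
  class   = c("A", "B", "A", "B"),
  score   = c(90, 80, 70, 60)
)

# select() keeps only the named columns; every row is still there
only_scores <- select(scores, student, score)
names(only_scores)

# group_by() keeps every row and every column, in the same order;
# it only records which column defines the groups
grouped <- group_by(scores, class)
dim(grouped)
group_vars(grouped)
```

So select() changes which columns you have, while group_by() changes how later verbs treat the rows.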



Both are dplyr functions, but they do different jobs.

The select() function picks columns of a data frame, while the group_by() function groups rows by the values in one or more columns.
With select(), the first argument is the data frame, and the subsequent arguments are the columns to keep.
With group_by(), you split the data into groups. This is the split-apply-combine idea: split the data into groups, apply some analysis to each group, then combine the results. On its own, group_by() doesn't change the values you see; it takes effect once you follow it with something like summarise() or mutate().
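
To make split-apply-combine concrete, here is a minimal sketch, again assuming dplyr is installed. The scores data and the mean_score column name are hypothetical, chosen just for this example.

```r
library(dplyr)

# Hypothetical toy data, invented for this example
scores <- tibble(
  student = c("Ana", "Ben", "Cal", "Dee"),
  class   = c("A", "B", "A", "B"),
  score   = c(90, 80, 70, 60)
)

class_means <- scores %>%
  group_by(class) %>%                   # split: one group per class
  summarise(mean_score = mean(score))   # apply mean() per group, combine into one row each

class_means
# class A has a mean_score of 80, class B has a mean_score of 70
```

Without the group_by() step, summarise() would return a single row with the overall mean instead of one row per class.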
