R Guided Project: Covid 19 trends problem

In Part 5 of the project, the top 10 countries by number of tests should be selected. My code does not give the correct answer: instead of a summary grouped by country, it returns a single row that summarizes all the data in each column. I cannot find my mistake:

library(readr)
covid_df <- read_csv("/Users/rakosicsilla/projects/covid19/covid19.csv")
covid_df_all_states <- covid_df %>% filter(Province_State == "All States")

covid_df_all_states_daily <- covid_df_all_states %>% select(Date, Country_Region, active, hospitalizedCurr, daily_tested, daily_positive)

covid_df_all_states_daily_sum <- covid_df_all_states_daily %>% group_by(Country_Region) %>% summarize(tested = sum(daily_tested), positive = sum(daily_positive), active = sum(active), hospitalized = sum(hospitalizedCurr)) %>% arrange(desc(tested))
covid_top_10 <- head(covid_df_all_states_daily_sum, 10)

Hello @rakosics

Could you please provide here the output you have?

I ask because your code seems correct to me: it worked when I ran it. The only difference I see is library(dplyr), but I think you just forgot to include it.

library(readr)

library(dplyr) # I added this line to your code

covid_df <- read_csv("/Users/jaoga/DQ/content/505/covid19.csv")

covid_df_all_states <- covid_df %>% 
  filter(Province_State == "All States")

covid_df_all_states_daily <- covid_df_all_states %>% 
  select(Date, Country_Region, active, hospitalizedCurr, daily_tested, daily_positive)

covid_df_all_states_daily_sum <- covid_df_all_states_daily %>% 
  group_by(Country_Region) %>% 
  summarize(tested = sum(daily_tested), positive = sum(daily_positive), active = sum(active), hospitalized = sum(hospitalizedCurr)) %>% 
  arrange(desc(tested))

covid_top_10 <- head(covid_df_all_states_daily_sum, 10)

Here is the result (and it is the correct one)

Hi, Johnaoga, thank you very much for your answer. At first I thought it wouldn’t help, because I had already loaded dplyr in RStudio by ticking it in the bottom-right pane. Earlier, I had checked whether the grouping worked by deleting the summarizing part from the code in question, and from this I concluded that the problem must be with the ‘summarize’ command. And voilà, an idea came to my strained mind: what if there is more than one ‘summarize’? I checked and found that this is the case: there is a ‘summarize’ in dplyr and another one in plyr, and I had ticked plyr along with many other packages, which activated it. So I unticked plyr and got the same result as you! Woohoo!
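
For anyone who wants to keep both packages loaded, one possible workaround is to call the function with its package prefix, so that R cannot pick the plyr version by accident. This is only a minimal sketch: the small data frame `toy` and its values are made up for illustration and merely stand in for covid_df_all_states_daily.

```r
library(dplyr)

# Hypothetical toy data (not the real covid19.csv)
toy <- data.frame(
  Country_Region = c("A", "A", "B"),
  daily_tested   = c(10, 20, 5)
)

# dplyr::summarize is namespace-qualified, so the dplyr version is used
# even if plyr was attached later and masks the unqualified name
toy_sum <- toy %>%
  group_by(Country_Region) %>%
  dplyr::summarize(tested = sum(daily_tested)) %>%
  arrange(desc(tested))

print(toy_sum)
```

With the dplyr version, the result has one row per country instead of a single row for the whole column.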

[email protected], my code is the same as yours, but the console shows the message "summarise() ungrouping output (override with .groups argument)". Did you see this message in your console too? My results are the same as yours, so I have no idea what the problem is. Thanks.

Hi, wwamity1314, in my case the problem was that there are two ‘summarize’ functions in R, and if the other one is active, it produces an incorrect result. Check whether the package ‘plyr’ is active. If it is, deactivate it: go to the bottom-right pane, choose Packages, and find ‘plyr’ in the list. There should be NO check mark in the little white box in front of it. I hope this helps.
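
The same check can also be sketched in code instead of the Packages pane. The example below assumes dplyr 1.0.0 or later (for the `.groups` argument) and again uses a made-up data frame `toy`. As far as I can tell, the "ungrouping output" message is informational rather than an error, and stating `.groups` explicitly should silence it.

```r
library(dplyr)

# TRUE if plyr is currently attached to the search path
plyr_attached <- "package:plyr" %in% search()

# Detach plyr if present, so that summarize() resolves to dplyr again
if (plyr_attached) {
  detach("package:plyr")
}

# Name of the package whose summarize() is currently in use
which_pkg <- environmentName(environment(summarize))

# Hypothetical toy data (not the real covid19.csv)
toy <- data.frame(
  Country_Region = c("A", "A", "B"),
  daily_tested   = c(10, 20, 5)
)

# .groups = "drop" makes the result explicitly ungrouped
toy_sum <- toy %>%
  group_by(Country_Region) %>%
  summarize(tested = sum(daily_tested), .groups = "drop")
```

If `which_pkg` is not dplyr after this, some other attached package is still masking the function.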