Analyzing Forest Fire Data, re-order days bar charts help


https://app.dataquest.io/m/277/guided-project%3A-analyzing-forest-fire-data/4/working-with-factor-data

fires <- fires %>%
  mutate(day=factor(day,levels=c("sun","mon","tue","wed","thur","fri","sat")))
fires_by_day <- fires %>% group_by(day) %>% 
  summarize(total_fires=n())
ggplot(data=fires_by_day)+
  aes(x=day,y=total_fires)+
  geom_bar(stat="identity")+
  theme(panel.background = element_rect(fill="white"),
        axis.line=element_line(size=0.25,colour="black"))


What I expected to happen:
To be shown a bar chart with all of the days of the week in order.

What actually happened:
The bars are in order from sun to wed. Then it’s fri and sat. The last bar is NA.

![image|565x426](upload://2waZmgcfpiz3b2wwq2cB6oElo2b.png) 

I’ve also tried moving the code that converts the days of the week to a factor. When I put it after the code for the graphs, every bar is labeled with a day of the week, but the bars are out of order. When I wrote the code for the bar chart showing how often fires happen per month, I put the factor step before the plotting code, and the months were all in order.


Hi,

I have tried this mission and everything works as expected on my side. It seems your data set has been changed in some way: for example, there are no NA values when I count by day.

forest <- read_csv("forestfires.csv")
forest %>% 
  mutate(day = factor(day, levels = c("mon", "tue", "wed", "thu", "fri", "sat", "sun"))) %>% 
  count(day) %>% 
  ggplot(aes(day, n)) + 
  geom_col()

Another thing I’ve just noticed is this part in your code:
ggplot(data=fires_by_day)+
aes(x=day,y=total_fires)

Why is aes() outside of ggplot() / geom_*()? And why + aes()?

That was an error on my part. Thanks for pointing that out.

Below is my current code. The month graph is organized correctly. The day graph now includes ‘thu’ but the days are completely out of order. Thank you for the help you’ve given me.

library(readr)
library(ggplot2)
library(dplyr)
library(purrr)
fires <- read.csv("forestfires.csv")
View(fires)
fires <- fires %>% 
  mutate(month=factor(month,levels=c("jan","feb","mar","apr","may","jun",
                                     "jul","aug","sep","oct","nov","dec")))
fires_by_month <- fires %>% group_by(month) %>% summarize(total_fires_month=n())
ggplot(data =fires_by_month,
  aes(x=month,y=total_fires_month))+
  geom_bar(stat="identity")+
  theme(panel.background=element_rect(fill="white"),
        axis.line=element_line(size=0.25,
                               colour="black"))

fires_by_day <- fires %>% mutate(day=factor(day,levels=c("sun","mon","tue","wed","thur","fri","sat")))%>%
  count(day)

fires_by_day <- fires %>% group_by(day) %>% 
  summarize(total_fires_day=n())

ggplot(data=fires_by_day,
  aes(x=day,y=total_fires_day))+
  geom_bar(stat="identity")+
  theme(panel.background = element_rect(fill="white"),
        axis.line=element_line(size=0.25,colour="black"))

Hey,

  • You repeat the aggregation twice in your code:
    count(day) is the same as group_by(day) followed by summarize(n()).
    You set the factor levels and call count(day), then immediately run group_by() and summarize() again on the original fires data. That second assignment overwrites fires_by_day using the unfactored day column, which is why the days come out in the wrong order.

  • The reason NA appears on your graph is that you used ‘thur’ as the level for the day, and it should be ‘thu’.
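To illustrate the second point, here is a minimal sketch in base R. The `day` vector is a made-up stand-in for the column in forestfires.csv, and `wanted_levels` and `unmatched` are names chosen only for this example. It shows that any value not listed in `levels` is turned into NA, and one way to spot such a mismatch before plotting.

```r
# Toy vector standing in for the `day` column (values invented for illustration)
day <- c("sun", "mon", "thu", "thu", "fri")

# "thur" matches nothing in the data, so every "thu" value becomes NA
bad  <- factor(day, levels = c("sun", "mon", "tue", "wed", "thur", "fri", "sat"))
good <- factor(day, levels = c("sun", "mon", "tue", "wed", "thu", "fri", "sat"))

sum(is.na(bad))   # values lost to the misspelled level
sum(is.na(good))  # no values lost once the spelling matches

# Quick check before plotting: data values that are missing from the levels
wanted_levels <- c("sun", "mon", "tue", "wed", "thur", "fri", "sat")
unmatched <- setdiff(unique(day), wanted_levels)
unmatched
```

If `unmatched` is not empty, those are the values that would end up in the NA bar.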

Here is your corrected code:

library(readr)
library(dplyr)
library(ggplot2)

fires <- read_csv("forestfires.csv")

# Order the months chronologically instead of alphabetically
fires <- fires %>% 
  mutate(month=factor(month,levels=c("jan","feb","mar","apr","may","jun",
                                     "jul","aug","sep","oct","nov","dec")))
fires_by_month <- fires %>% group_by(month) %>% summarize(total_fires_month=n())
ggplot(data =fires_by_month,
  aes(x=month,y=total_fires_month))+
  geom_bar(stat="identity")+
  theme(panel.background=element_rect(fill="white"),
        axis.line=element_line(size=0.25,
                               colour="black"))

# Set the day levels ("thu", not "thur") and aggregate once; count() names its result column n
fires_by_day <- fires %>% mutate(day=factor(day,levels=c("sun","mon","tue","wed","thu","fri","sat")))%>%
  count(day)

ggplot(data=fires_by_day,
  aes(x=day,y=n))+
  geom_bar(stat="identity")+
  theme(panel.background = element_rect(fill="white"),
        axis.line=element_line(size=0.25,colour="black"))
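
As a side note on why the bars were out of order when the aggregation ran on the unfactored column, here is a small base R sketch. The `day` vector is invented for illustration, and `table()` is used only as a stand-in for the counting step. Character values are grouped in alphabetical order, while a factor keeps the order of its levels.

```r
# Toy vector standing in for the `day` column (values invented for illustration)
day <- c("sat", "mon", "sun", "mon", "fri")

# Character values: the groups come out in alphabetical order
names(table(day))

# Factor with explicit levels: the groups follow the level order
# (table() also keeps levels that have no rows, such as "tue")
day_f <- factor(day, levels = c("sun", "mon", "tue", "wed", "thu", "fri", "sat"))
names(table(day_f))
```

The same idea applies to the plot: a discrete x axis follows the factor levels when the column is a factor and falls back to alphabetical order when it is plain text.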

Thank you so much for your help. My R script is running how it should.
