October 17, 2019

R ggplot2 geom_bar order

In R, to plot a pie chart using a dataframe as the data source, one would use geom_bar first to plot a bar graph, and then use coord_polar to convert it to a pie chart. It works, but there is a catch:
  • geom_bar orders the fill values alphabetically and always puts the lowest value at the top of the bar. For example, “apple” would sit on top of “pear”. This becomes a problem if you intend to label the pie chart, because label positions computed in the data frame's row order no longer line up with the slices.
  • The solution is to sort the data frame in decreasing order by the fill variable. The rows then follow the stack from the bottom up, so the cumulative sums used for the label positions match the slices.
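To see why the sort matters, here is a minimal sketch in base R with the same values as the example data. It assumes the stack is built from y = 0 starting with the alphabetically last value, as described above; the helper name label_midpoints is made up for illustration.

```r
# Hypothetical helper: midpoint of each slice along the stacked y axis,
# assuming slices are stacked in the row order of the data frame.
label_midpoints <- function(d) cumsum(d$Freq) - d$Freq / 2

df <- data.frame(val = c("z", "b", "c"), Freq = c(5, 3, 4),
                 stringsAsFactors = FALSE)

# Unsorted rows: midpoints follow the row order z, b, c
unsorted <- setNames(label_midpoints(df), df$val)

# Sorted in decreasing order of val (z, c, b): assumed to match the stack from y = 0 up
sorted_df <- df[order(df$val, decreasing = TRUE), ]
sorted <- setNames(label_midpoints(sorted_df), sorted_df$val)

print(unsorted)  # z = 2.5, b = 6.5, c = 10
print(sorted)    # z = 2.5, c = 7, b = 10.5
```

Under that assumption the slice for "c" spans 5 to 9 and the slice for "b" spans 9 to 12, so the unsorted positions would put the "b" label (6.5) inside the "c" slice and the "c" label (10) inside the "b" slice.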

Example R code with correct labels.

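library(ggplot2)

df <- data.frame(val = c("z", "b", "c"), c2 = c("b", "d", "f"), Freq = c(5, 3, 4))
# Placeholder values; set the plot title and legend title to whatever you need
title <- "Pie chart"
legend_title <- "val"

# Sort by the fill variable in decreasing order so the rows match the stacking order
df <- df[order(df$val, decreasing = TRUE), ]
# Label text: value plus rounded percentage, e.g. "z-42%"
df$Label <- paste(df$val, paste0(round(df$Freq / sum(df$Freq) * 100), "%"), sep = "-")
# Stacked bar at a single x position
p <- ggplot(df, aes(x = 1, y = Freq, fill = val)) + geom_bar(stat = "identity", color = "black")
# Convert to a pie chart and hide the radial axis
p1 <- p + coord_polar(theta = "y") + theme(axis.ticks = element_blank(),
                                axis.text.y = element_blank(),
                                axis.text.x = element_text(colour = "black"),
                                axis.title = element_blank())
# Place each label at the midpoint of its slice
p2 <- p1 + scale_y_continuous(labels = df$Label, breaks = cumsum(df$Freq) - df$Freq / 2)
p3 <- p2 + labs(title = title, fill = legend_title) + theme(plot.title = element_text(hjust = 0.5))
print(p3)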

Update on 10/23/2019

This article explains the order issue very clearly:

https://sebastiansauer.github.io/ordering-bars/

In short, the rule for the sort order is:
  • if factor, the order of factor levels is used
  • if character, an alphabetical order is used
So one can change the factor order to change the plot order.
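
As a rough sketch of that last point, the levels of the fill variable can be set explicitly with factor(), and the rows then sorted by level in decreasing order before the label positions are computed. This assumes, as above, that the first level ends up on top of the stack; the level order and the mid column name are arbitrary choices for illustration.

```r
df <- data.frame(val = c("z", "b", "c"), Freq = c(5, 3, 4),
                 stringsAsFactors = FALSE)

# Set the level order explicitly instead of relying on alphabetical order
df$val <- factor(df$val, levels = c("z", "b", "c"))

# Sort rows by level in decreasing order: last level first
df <- df[order(as.integer(df$val), decreasing = TRUE), ]

# Midpoints for the labels, computed in the same row order
df$mid <- cumsum(df$Freq) - df$Freq / 2

print(levels(df$val))
print(df[, c("val", "Freq", "mid")])
```

Passing this df to the ggplot call from the example, with breaks = df$mid, should then draw the slices in the chosen level order with the labels aligned.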
