In R, to plot a pie chart with ggplot2 using a dataframe as the data source, one would first use geom_bar to plot a stacked bar graph, and then use coord_polar to convert it to a pie chart. It works, but there is a catch: geom_bar sorts the fill values alphabetically and always puts the lowest value at the top of the bar. For example, “apple” would be on top of “pear”. This becomes a problem if you intend to label the pie chart, because label positions computed in the dataframe's original row order will not line up with the slices.
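The alphabetical ordering can be seen without drawing anything. The sketch below uses base R's factor() as a stand-in for how a character fill column gets ordered; the fruits vector is made up for illustration.

```r
# Hypothetical fill values, in arbitrary input order
fruits <- c("pear", "apple", "pear", "banana")

# factor() sorts the distinct values to build its levels,
# which is the same order the fill groups end up in
lv <- levels(factor(fruits))
print(lv)
```

The first value in this order ("apple") is the one drawn at the top of the stacked bar.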
The solution is to sort the dataframe in decreasing order by the fill variable, so that the rows follow the same order as the stacked slices. This way the cumulative sums used to position the labels will be right.
Example R code with correct labels.
library(ggplot2)
# Chart title and legend title (example values)
title <- "Share by value"
legend_title <- "val"
df <- data.frame(val=c("z","b","c"), c2=c("b","d","f"), Freq=c(5,3,4))
# Sort rows in decreasing order of the fill variable to match the stacking order
df <- df[order(df$val, decreasing=TRUE),]
# Label text, e.g. "z-42%"
df$Label <- paste(df$val, paste(round((df$Freq/sum(df$Freq))*100,0),"%",sep=""), sep="-")
p <- ggplot(df, aes(x=1, y=Freq, fill=val)) + geom_bar(stat="identity", color="black")
p1 <- p + coord_polar(theta='y') + theme(axis.ticks=element_blank(),
  axis.text.y=element_blank(),
  axis.text.x=element_text(colour='black'),
  axis.title=element_blank())
# Place each label at the midpoint of its slice
p2 <- p1 + scale_y_continuous(labels=df$Label, breaks=cumsum(df$Freq) - df$Freq/2)
p3 <- p2 + labs(title=title, fill=legend_title) + theme(plot.title=element_text(hjust=0.5))
print(p3)
Update on 10/23/2019
This article explains the order issue very clearly:
https://sebastiansauer.github.io/ordering-bars/
In short, the rule for the sort order is:
- if factor, the order of factor levels is used
- if character, an alphabetical order is used
So one can change the order of the factor levels to change the plot order.
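As a rough sketch of that approach, the fill column can be converted to a factor with explicit levels, and the label positions computed from the same level order. The level order c("z", "b", "c") and the Mid column are arbitrary choices for illustration, and the plot is only drawn if ggplot2 is installed.

```r
# Same values as the example above; the level order below is an arbitrary choice
df <- data.frame(val = c("z", "b", "c"), Freq = c(5, 3, 4))

# Explicit levels replace the default alphabetical order
df$val <- factor(df$val, levels = c("z", "b", "c"))

# The first level is stacked on top, so the slice starting at y = 0
# belongs to the last level; sort rows by decreasing level to match
df <- df[order(df$val, decreasing = TRUE), ]
df$Label <- paste0(df$val, "-", round(df$Freq / sum(df$Freq) * 100), "%")
df$Mid <- cumsum(df$Freq) - df$Freq / 2  # midpoint of each slice

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = 1, y = Freq, fill = val)) +
    geom_bar(stat = "identity", color = "black") +
    coord_polar(theta = "y") +
    scale_y_continuous(labels = df$Label, breaks = df$Mid)
  print(p)
}
```

With this setup the legend follows the chosen level order (z, b, c), and no alphabetical sort of the dataframe is needed.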