A pie chart is a circle divided into sectors that each represent a proportion of the whole. It is often used to show percentages, where the sum of the sectors equals 100%.
The problem is that humans are pretty bad at reading angles. In the adjacent pie chart, try to figure out which group is the biggest and to order the groups by value. You will probably struggle to do so, and this is why pie charts must be avoided.
# Libraries
library(tidyverse)
library(hrbrthemes)
library(viridis)
library(patchwork)
# Create 3 data frames:
data1 <- data.frame( name=letters[1:5], value=c(17,18,20,22,24) )
data2 <- data.frame( name=letters[1:5], value=c(20,18,21,20,20) )
data3 <- data.frame( name=letters[1:5], value=c(24,23,21,19,18) )
# Plot
plot_pie <- function(data, vec){
  # vec gives the position of each label along the stacked bar
  # x is a constant, so all values stack into a single bar that coord_polar() bends into a pie
  ggplot(data, aes(x="name", y=value, fill=name)) +
    geom_bar(width = 1, stat = "identity") +
    coord_polar("y", start=0, direction = -1) +
    scale_fill_viridis(discrete = TRUE, direction=-1) +
    # size is a fixed setting, so it goes outside aes()
    geom_text(aes(y = vec, label = rev(name), color=c( "white", rep("black", 4))), size=4) +
    scale_color_manual(values=c("black", "white")) +
    theme_ipsum() +
    theme(
      legend.position="none",
      plot.title = element_text(size=14),
      panel.grid = element_blank(),
      axis.text = element_blank(),
      # recent ggplot2 versions expect a margin() object here
      legend.margin=margin(0, 0, 0, 0)
    ) +
    xlab("") +
    ylab("")
}
plot_pie(data1, c(10,35,55,75,93))
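To put numbers on how subtle these differences are, the values of `data1` can be converted into slice angles. This is a small base R sketch; the object names `values` and `angles` are introduced here for illustration only.

```r
# Values of data1, copied from the data frame defined above
values <- c(a = 17, b = 18, c = 20, d = 22, e = 24)

# Each slice spans an angle proportional to its share of the total
angles <- values / sum(values) * 360

# Angle of each slice, in degrees
round(angles, 1)

# Difference between consecutive slices, in degrees
round(diff(angles), 1)
```

Neighbouring slices differ by only a few degrees, which is hard to judge by eye on a circle.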
If you’re still not convinced, let’s try to compare several pie plots. Once more, try to work out which group has the highest value in each of these 3 graphics. Also, try to figure out how the values evolve from one group to the next.
a <- plot_pie(data1, c(10,35,55,75,93))
b <- plot_pie(data2, c(10,35,53,75,93))
c <- plot_pie(data3, c(10,29,50,75,93))
a + b + c
Now, let’s represent exactly the same data using a barplot:
# A function to make barplots
plot_bar <- function(data){
  ggplot(data, aes(x=name, y=value, fill=name)) +
    geom_bar( stat = "identity") +
    scale_fill_viridis(discrete = TRUE, direction=-1) +
    theme_ipsum() +
    theme(
      legend.position="none",
      plot.title = element_text(size=14),
      panel.grid = element_blank()
    ) +
    # Same y axis for every plot so that bars are comparable across panels
    ylim(0,25) +
    xlab("") +
    ylab("")
}
# Make 3 barplots
a <- plot_bar(data1)
b <- plot_bar(data2)
c <- plot_bar(data3)
# Put them together with patchwork
a + b + c
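As these barplots show, the three datasets are very different: values increase from a to e in the first one, are almost flat in the second, and decrease in the third. The pie plots hid this pattern, and it is one that you definitely don’t want to miss when you tell your story.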
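The two questions asked above (which group is the largest, and how values change across groups) can also be checked numerically. A minimal base R sketch, where the names `groups`, `datasets` and `largest` are introduced only for this example:

```r
# Values copied from data1, data2 and data3 above
groups <- letters[1:5]
datasets <- list(
  data1 = c(17, 18, 20, 22, 24),
  data2 = c(20, 18, 21, 20, 20),
  data3 = c(24, 23, 21, 19, 18)
)

# Name of the group with the highest value in each dataset
largest <- sapply(datasets, function(v) groups[which.max(v)])
largest

# Change from one group to the next: positive means the value increases
lapply(datasets, diff)
```

The largest group is e in the first dataset, c in the second and a in the third, which is much easier to read from the bars than from the slices.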
Even if pie charts are bad by definition, it is still possible to make them even worse by adding other bad features.
The barplot is the best alternative to pie plots. If you have many values to display, you can also consider a lollipop plot, which is a bit more elegant in my opinion. Here is an example based on the quantity of weapons sold by a few countries in the world:
# Load dataset from github
data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/7_OneCatOneNum.csv", header=TRUE, sep=",")
# Plot
data %>%
  filter(!is.na(Value)) %>%
  arrange(Value) %>%
  mutate(Country=factor(Country, Country)) %>%
  ggplot( aes(x=Country, y=Value) ) +
    geom_segment( aes(x=Country ,xend=Country, y=0, yend=Value), color="grey") +
    geom_point(size=3, color="#69b3a2") +
    coord_flip() +
    theme_ipsum() +
    theme(
      panel.grid.minor.y = element_blank(),
      panel.grid.major.y = element_blank(),
      legend.position="none"
    ) +
    xlab("")
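The same recipe should apply to any table with one categorical and one numeric column. As a minimal sketch, here it is applied to the values of `data1` from the beginning of this page. The object names `lollipop_data` and `p_lollipop` are introduced only for this example, and only `ggplot2` is assumed to be installed.

```r
library(ggplot2)

# Same values as data1 above
lollipop_data <- data.frame(name = letters[1:5], value = c(17, 18, 20, 22, 24))

# Order the groups by value so that the lollipops appear sorted
lollipop_data$name <- factor(
  lollipop_data$name,
  levels = lollipop_data$name[order(lollipop_data$value)]
)

p_lollipop <- ggplot(lollipop_data, aes(x = name, y = value)) +
  # Stem: a segment from 0 up to the value
  geom_segment(aes(xend = name, y = 0, yend = value), color = "grey") +
  # Head: a point at the value
  geom_point(size = 3, color = "#69b3a2") +
  coord_flip() +
  xlab("")

p_lollipop
```

Because the position of each point is read along a common axis, the ranking of the five groups is immediate.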
Another possibility would be to create a treemap if your goal is to describe what the whole is composed of.
# Package
library(treemap)
# Plot
treemap(data,
  # data
  index="Country",
  vSize="Value",
  type="index",
  # Main
  title="",
  palette="Dark2",
  # Borders:
  border.col=c("black"),
  border.lwds=1,
  # Labels
  fontsize.labels=0.5,
  fontcolor.labels="white",
  fontface.labels=1,
  bg.labels=c("transparent"),
  align.labels=c("left", "top"),
  overlap.labels=0.5,
  inflate.labels=TRUE # If TRUE, labels are bigger when rectangle is bigger.
)
A work by Yan Holtz for data-to-viz.com