# Split violin plots

#### R-bloggers 2013-06-25

(This article was first published on Ecology in silico, and kindly contributed to R-bloggers)

Violin plots are useful for comparing distributions. When data are grouped by a factor with two levels (e.g. males and females), you can split the violins in half to see the difference between groups. Consider a 2 x 2 factorial experiment: treatments A and B are crossed with groups 1 and 2, with N=1000.

`123456789`
``# Simulate datan.each <- 1000A1 <- rnorm(n.each, 2, 1)A2 <- rnorm(n.each, 1.5, 2)B1 <- rnorm(n.each, 4, 1.5)B2 <- rnorm(n.each, 0, 1)values <- c(A1, A2, B1, B2)treatment <- rep(c("A", "B"), each=n.each*2)group <- rep(c(1, 2, 1, 2), each=n.each)``

Boxplots are often used:

`12`
``par(bty="n")boxplot(values ~ group*treatment, main="Box plot", col=rep(c("purple", "lightblue"), 2))`` This gives us a rough comparison of the distribution in each group, but sometimes it’s nice to visualize the kernel density estimates instead.

I recently ran into this issue and tweaked the vioplot() function from the vioplot package by Daniel Adler to make split violin plots. With vioplot2(), the `side` argument specifies whether to plot the density on “both”, the “left”, or the “right” side.

`123456789101112131415161718192021`
``require(vioplot)require(devtools)require(digest)source_gist("https://gist.github.com/mbjoseph/5852613")plot(x=NULL, y=NULL,     xlim = c(0.5, 2.5), ylim=c(min(values), max(values)),     type="n", ann=FALSE, axes=F)axis(1, at=c(1, 2),  labels=c("A", "B"))axis(2)for (i in unique(treatment)) {  for (j in unique(group)){    vioplot2(values[which(treatment == i & group == j)],             at = ifelse(i == "A", 1, 2),             side = ifelse(j == 1, "left", "right"),             col = ifelse(j == 1, "purple", "lightblue"),             add = T)  }}title("Violin plot", xlab="Treatment")legend("bottomright", fill = c("purple", "lightblue"),       legend = c("Group 1", "Group 2"), box.lty=0)`` Last but not least, Peter Kampstra’s beanplot package uses beanplot() to make split density plots, but 1) plots a rug rather than a quantile box, 2) includes a line for the overall mean or median, and 3) makes it easier to change the kernel function.

`123456789`
``require(beanplot)beanplot(values ~ group*treatment, ll = 0.04,         main = "Bean plot", side = "both", xlab="Treatment",         col = list("purple", c("lightblue", "black")),         axes=F)axis(1, at=c(1, 2),  labels=c("A", "B"))axis(2)legend("bottomright", fill = c("purple", "lightblue"),       legend = c("Group 1", "Group 2"), box.lty=0)`` There are more ways than one to skin a cat, and what one uses will probably come to personal preference. 