# Split violin plots

#### R-bloggers 2013-06-25

**Ecology in silico**, and kindly contributed to R-bloggers)

Violin plots are useful for comparing distributions. When data are grouped by a factor with two levels (e.g. males and females), you can split the violins in half to see the difference between groups. Consider a 2 x 2 factorial experiment: treatments A and B are crossed with groups 1 and 2, with N=1000.

123456789

`# Simulate datan.each <- 1000A1 <- rnorm(n.each, 2, 1)A2 <- rnorm(n.each, 1.5, 2)B1 <- rnorm(n.each, 4, 1.5)B2 <- rnorm(n.each, 0, 1)values <- c(A1, A2, B1, B2)treatment <- rep(c("A", "B"), each=n.each*2)group <- rep(c(1, 2, 1, 2), each=n.each)`

Boxplots are often used:

12

`par(bty="n")boxplot(values ~ group*treatment, main="Box plot", col=rep(c("purple", "lightblue"), 2))`

This gives us a rough comparison of the distribution in each group, but sometimes it’s nice to visualize the kernel density estimates instead.

I recently ran into this issue and tweaked the vioplot() function from the vioplot package by Daniel Adler to make split violin plots. With vioplot2(), the `side`

argument specifies whether to plot the density on “both”, the “left”, or the “right” side.

123456789101112131415161718192021

`require(vioplot)require(devtools)require(digest)source_gist("https://gist.github.com/mbjoseph/5852613")plot(x=NULL, y=NULL, xlim = c(0.5, 2.5), ylim=c(min(values), max(values)), type="n", ann=FALSE, axes=F)axis(1, at=c(1, 2), labels=c("A", "B"))axis(2)for (i in unique(treatment)) { for (j in unique(group)){ vioplot2(values[which(treatment == i & group == j)], at = ifelse(i == "A", 1, 2), side = ifelse(j == 1, "left", "right"), col = ifelse(j == 1, "purple", "lightblue"), add = T) }}title("Violin plot", xlab="Treatment")legend("bottomright", fill = c("purple", "lightblue"), legend = c("Group 1", "Group 2"), box.lty=0)`

Last but not least, Peter Kampstra’s beanplot package uses beanplot() to make split density plots, but 1) plots a rug rather than a quantile box, 2) includes a line for the overall mean or median, and 3) makes it easier to change the kernel function.

123456789

`require(beanplot)beanplot(values ~ group*treatment, ll = 0.04, main = "Bean plot", side = "both", xlab="Treatment", col = list("purple", c("lightblue", "black")), axes=F)axis(1, at=c(1, 2), labels=c("A", "B"))axis(2)legend("bottomright", fill = c("purple", "lightblue"), legend = c("Group 1", "Group 2"), box.lty=0)`

There are more ways than one to skin a cat, and what one uses will probably come to personal preference.

**leave a comment**for the author, please follow the link and comment on his blog:

**Ecology in silico**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...