Assessing the Price of Solid State Hard Drives

ggplot2 2013-03-15

Summary:

During a staff meeting at work, the topic of solid state hard drive prices came up (what are they, do they scale non-linearly with size, etc.). I decided to sample 120 solid state hard drives from newegg.com and recorded their size (in GB) and price (in USD) as well as their class (SATA II or SATA III). Note that the sampling was semi-random, in that I had no particular agenda but did not go to great lengths to sample randomly. To look at this, I used ggplot2.

    ssd <- read.csv("http://joshuawiley.com/files/ssd.csv")
    ssd$class <- factor(ssd$class)
    require(ggplot2)

    ## first pass
    p <- ggplot(ssd, aes(x = price, y = size, colour = class)) +
      geom_point()
    print(p)

Scatter plot of Size and Price of SSDs
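As a quick numeric companion to the scatter plot, the "what are the prices" question can also be summarized as price per GB within each class. The following is only a sketch: `ssd_demo` is a small synthetic stand-in with made-up values (it is not the newegg.com sample), and the `usd_per_gb` column is a name introduced here for illustration. The same two lines at the end should work on the real `ssd` data frame, assuming it has `size`, `price`, and `class` columns as described above.

```r
## Synthetic stand-in for the real data (values are made up for illustration)
set.seed(1)
ssd_demo <- data.frame(
  size  = rep(c(64, 128, 256, 512), times = 2),
  class = factor(rep(c("II", "III"), each = 4))
)
## assumption: price roughly proportional to size, plus a little noise
ssd_demo$price <- ssd_demo$size * 1.1 + rnorm(nrow(ssd_demo), sd = 5)

## price per GB for each drive, then the median within each class
ssd_demo$usd_per_gb <- ssd_demo$price / ssd_demo$size
per_class <- tapply(ssd_demo$usd_per_gb, ssd_demo$class, median)
print(per_class)
```

The median is used rather than the mean so that a few unusually expensive drives do not dominate the summary.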

Not too bad, but the data are sparser at higher sizes and prices, so we can use a log-log scale to make it a little easier to see, and add locally weighted regression (loess) lines to assess linearity (or lack thereof).

    ## add smooths and log to make clearer
    p <- p +
      stat_smooth(se = FALSE) +
      scale_x_log10(breaks = seq(0, 1000, 100)) +
      scale_y_log10(breaks = seq(0, 600, 100))

Scatter plot of Size and Price of SSDs in log10 scale with loess smooth lines
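One way to read a log-log plot: a straight line on these axes corresponds to a power law, size = a * price^b, and the slope of the line is the exponent b. A slope near 1 means size grows roughly in proportion to price; a slope above or below 1 means the relationship bends on the raw scale. The sketch below checks this with a plain `lm()` on the log10 scale. It uses synthetic data generated with b = 1 (an assumption made purely for illustration, not a result from the SSD sample), and the variable names are hypothetical.

```r
## Synthetic data following size = 0.9 * price^1 with multiplicative noise
## (made-up values, not the SSD sample)
set.seed(42)
price <- exp(runif(100, log(50), log(1000)))
size  <- 0.9 * price * exp(rnorm(100, sd = 0.1))

## linear regression on the log10 scale
fit <- lm(log10(size) ~ log10(price))
b <- unname(coef(fit)[2])      # slope on the log scale = power-law exponent
a <- 10^unname(coef(fit)[1])   # back-transformed intercept = multiplier
print(round(c(a = a, b = b), 2))
```

On the real data, the same model could be fit per class to put a number on what the loess lines show visually.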

Okay, that is nice. Lastly, let’s add better labels, make the x-axis text not overlap, and include the intercept and slope parameters for the linear lines of best fit for each class of hard drive.

    ## fit separate intercept and slope model;
    ## class + class:price gives one intercept and one slope per class
    m <- lm(size ~ 0 + class + class:price, data = ssd)
    est <- round(coef(m), 2)
    size2 <- paste0("II Size = ", est[1], " + ", est[3], " * price")
    size3 <- paste0("III Size = ", est[2], " + ", est[4], " * price")

    ## finalize: annotations, axis labels, title, rotated x-axis text
    p <- p +
      annotate("text", x = 100, y = 600, label = size2) +
      annotate("text", x = 100, y = 500, label = size3) +
      labs(x = "Price in USD", y = "Size in GB") +
      ggtitle("Log-Log Plot of SSD Size and Price") +
      theme(axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1))
    print(p)

Fancy Scatter plot of Size and Price of SSDs in log10 with loess smooth lines
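A note on reading the coefficients of a per-class model. The formulas `size ~ 0 + class * price` and `size ~ 0 + class + class:price` fit the same two lines, but they report the slopes differently: the first gives the slope for the reference class plus a difference for the other class, while the second gives one slope per class directly. The sketch below illustrates this on synthetic data with known slopes (the values and the level names "II" and "III" are assumptions for illustration, not taken from the SSD sample).

```r
## Synthetic data with known per-class lines (made-up values)
set.seed(7)
d <- data.frame(
  class = factor(rep(c("II", "III"), each = 50)),
  price = runif(100, 50, 1000)
)
true_slope <- c(II = 0.8, III = 1.1)
d$size <- 10 + unname(true_slope[as.character(d$class)]) * d$price +
  rnorm(100, sd = 5)

## same fitted lines, two parameterizations
m_inter <- lm(size ~ 0 + class * price, data = d)        # slope + difference
m_sep   <- lm(size ~ 0 + class + class:price, data = d)  # one slope per class

print(round(coef(m_inter), 2))  # classII, classIII, price, classIII:price
print(round(coef(m_sep), 2))    # classII, classIII, classII:price, classIII:price

## class III slope recovered from the interaction parameterization
slope_III <- unname(coef(m_inter)["price"] + coef(m_inter)["classIII:price"])
print(all.equal(slope_III, unname(coef(m_sep)["classIII:price"])))
```

Either form is fine for fitting; the second is simply more convenient when the coefficients are pasted straight into plot labels.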

(guest post by Joshua Wiley)

Link:

http://blog.ggplot2.org/post/25356897293

From feeds:

Statistics and Visualization » ggplot2

Authors:

baptiste-auguie

Date tagged:

03/15/2013, 20:07

Date published:

06/18/2012, 06:53