Plotting lm and glm models with ggplot #rstats

R-bloggers 2013-03-22

(This article was first published on Strenge Jacke! » R, and kindly contributed to R-bloggers)

Summary

In this post I show how to plot the results of linear and logistic regression models (lm and glm) with ggplot. As in my previous posts on ggplot, the main idea is to have a highly customizable function for representing data. You can download all my scripts from my script page.

The inspiration source

The following two functions are based on an idea I saw at the Sustainable Research Blog. It was in fact my starting point for getting into R and learning more about its data visualization facilities. After playing around with ggplot for some time, I built my own function based on the script posted at Sustainable Research.

Plotting odds ratios

Odds ratios can be plotted in two main display styles: bars or dots. First, let me show you the dot style. Assuming you have a glm object (in my examples, it’s called logreg) and have loaded the function sjPlotOdds.R (see my script page for downloads), you can plot the results like this:

plotOdds(logreg,
         oddsLabels=lab,
         axisLimits=c(0.4, 4.0),
         gridBreaksAt=0.4)

Odds ratios as dots, with confidence intervals, “positive” effects (> 1) in blue.
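
For readers who want to see what such a dot plot is built from, here is a minimal sketch. It is not the code of sjPlotOdds.R. It assumes the built-in infert data set as a stand-in for the model behind logreg, uses Wald intervals from confint.default() (the script may compute its intervals differently), and the names logreg_demo and or_table are made up for illustration. The ggplot2 part only runs if the package is installed.

```r
# Fit a logistic regression on a built-in data set (stand-in for `logreg`).
logreg_demo <- glm(case ~ spontaneous + induced,
                   data = infert, family = binomial())

# Odds ratios are the exponentiated coefficients; the Wald confidence
# bounds are exponentiated the same way. The intercept is dropped.
est <- coef(logreg_demo)[-1]
ci  <- confint.default(logreg_demo)[-1, , drop = FALSE]

or_table <- data.frame(
  term  = names(est),
  or    = exp(est),
  lower = exp(ci[, 1]),
  upper = exp(ci[, 2])
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(or_table, aes(x = term, y = or, colour = or > 1)) +
    geom_hline(yintercept = 1, linetype = "dashed") +  # OR = 1 means no effect
    geom_errorbar(aes(ymin = lower, ymax = upper), width = 0.2) +
    geom_point(size = 3) +
    coord_flip() +                                     # predictors on the left axis
    labs(x = NULL, y = "Odds ratio (95% Wald CI)", colour = "OR > 1")
  print(p)
}
```

Mapping the colour to or > 1 is one simple way to set "positive" effects apart from the rest, similar in spirit to the blue dots in the figure.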

As you can see in the example above, I have specified the axis limits and the grid breaks. If you do not specify axis limits, the boundaries are calculated from the lowest and highest confidence interval bounds, so that the diagram is fitted to the highest possible “zoom”. The next example demonstrates this with bar charts:

plotOdds(logreg,
         oddsLabels=lab,
         type="bars",
         gridBreaksAt=0.5)

Odds ratios with confidence intervals, fitting the axes to maximum “zoom”.
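
One way to derive such automatic limits is sketched below. This is an assumption about the general idea, not the logic used in sjPlotOdds.R: the helper axis_limits_from_ci and the confidence bounds are invented for illustration. The range of the confidence intervals is widened outwards to the nearest grid line.

```r
# Hypothetical helper: widen the CI range outwards to the nearest grid line.
axis_limits_from_ci <- function(lower, upper, grid_step) {
  c(floor(min(lower) / grid_step) * grid_step,
    ceiling(max(upper) / grid_step) * grid_step)
}

# Made-up confidence bounds for three odds ratios.
lower <- c(0.62, 1.10, 1.45)
upper <- c(1.90, 2.40, 3.30)

limits <- axis_limits_from_ci(lower, upper, grid_step = 0.5)
breaks <- seq(limits[1], limits[2], by = 0.5)
```

With these numbers the limits come out as 0.5 and 3.5. In ggplot2, limits and breaks of this kind could be passed to scale_y_continuous(limits = limits, breaks = breaks).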

Both diagrams contain model summaries in the lower right corner. You can change many visual parameters, for instance hiding the summary, changing bar, border or background colors, or adjusting line and bar size.

Plotting betas and standardized betas of linear regressions

My function sjPlotLinreg.R is quite similar and visualizes the results of linear regressions. It therefore requires an lm object.

plotBetas(x=linreg,
          axisLimits=c(-0.5, 0.9),
          xAxisLabel="beta (blue) and std. beta (red)",
          sort="std",
          predictorLabels=lab,
          predictorLabelSize=1,
          breakLabelsAt=30)

Linear regression, with beta-values and confidence intervals (in blue) as well as standardized beta values (in red)

As you can see, I have used predictorLabelSize=1 and breakLabelsAt=30 because of the long variable labels. By default, each label on the left axis would break across more lines, making it narrower and harder to read. I also used sort="std" to sort the predictors by their standardized beta values (the default is to order them by their beta values).
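
For reference, standardized betas for numeric predictors can be obtained by rescaling each coefficient with the standard deviations of its predictor and of the response. The sketch below illustrates that relationship on the built-in mtcars data. It is not taken from sjPlotLinreg.R, whose exact method is not shown here, and the names linreg_demo and coef_table are made up.

```r
# Stand-in for `linreg`: a linear model on a built-in data set.
linreg_demo <- lm(mpg ~ wt + hp, data = mtcars)

# Unstandardized betas (intercept dropped).
beta <- coef(linreg_demo)[-1]

# Standardized beta = beta * sd(predictor) / sd(response).
sd_x     <- sapply(mtcars[names(beta)], sd)
std_beta <- beta * sd_x / sd(mtcars$mpg)

# Cross-check: refitting on z-scored variables gives the same values.
scaled     <- as.data.frame(scale(mtcars[, c("mpg", "wt", "hp")]))
beta_check <- coef(lm(mpg ~ wt + hp, data = scaled))[-1]

# Order the predictors by standardized beta, similar to sort="std".
coef_table <- data.frame(term = names(beta), beta = beta, std_beta = std_beta)
coef_table <- coef_table[order(coef_table$std_beta), ]
```

Because standardized betas are unit-free, sorting by them ranks predictors by effect size on a common scale, which ordering by the raw betas cannot do when predictors are measured in different units.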

plotBetas(x=linreg,
          predictorLabels=lab,
          predictorLabelSize=1,
          breakLabelsAt=30,
          showStandardBeta=FALSE)

Linear regression, only beta values shown

Setting showStandardBeta=FALSE makes the red dots (standardized beta values) and their connecting line disappear.

plotBetas(x=linreg,
          predictorLabels=lab,
          predictorLabelSize=1,
          breakLabelsAt=30,
          showValues=FALSE,
          showPValues=FALSE)

Linear regression, beta and standardized beta values are shown, value labels hidden.

This last example shows how to hide the value labels inside the diagram, so you only have the dots for beta and standardized beta coefficients.
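
All three plotBetas calls above pass breakLabelsAt=30 to keep long predictor labels readable. How the script wraps labels internally is not shown in this post. As an assumption, the effect can be approximated in base R with strwrap(), using a hypothetical helper and a made-up label:

```r
# Hypothetical helper: wrap a label so that every line stays below `width` characters.
wrap_label <- function(label, width = 30) {
  paste(strwrap(label, width = width), collapse = "\n")
}

long_label <- "Number of hours spent on caring for a dependent relative per week"
wrapped <- wrap_label(long_label, width = 30)
cat(wrapped, "\n")
```

A larger width yields fewer, longer lines per label, which is the trade-off described above for long variable labels.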

Last remark

In the meantime I have also updated my other scripts. For instance, the sjPlotGroupFrequencies.R function can now also plot box plots or violin plots (see the examples at the end of that posting). So make sure you have the latest version from my script page.

