Defining a new transformation for ggplot2/scales - Part II
ggplot2 2013-03-15
Summary:
In my previous blog post, I explored what was needed to create a new transformation for the scales package and gave an example of a mathematical transformation. In this post, I want to show an additional example related to the other use case mentioned there (mapping a continuous-like variable with a specific structure and formatting) and extend the example by creating new scale functions that integrate into ggplot2 even more directly.
Time
Dates and times are tricky to work with because they have detailed external constraints and conventions. Within the R ecosystem, several packages exist solely to deal with dates and times (chron, lubridate, date, mondate, timeDate, TimeWarp, etc.), and an article has appeared in R News on the topic (Brian D. Ripley and Kurt Hornik. Date-time classes. R News, 1(2):8-11, June 2001.).
There is already support for dates (using the Date class, via date_trans in scales and scale_*_date in ggplot2) and datetimes (using the POSIXt class, via time_trans in scales and scale_*_datetime in ggplot2). The piece that is missing is time, separate from any date; “clock time”, if you will.
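As a quick illustration of the existing support, the sketch below calls the two built-in transformations directly. It assumes an installed scales version that still exports date_trans and time_trans under those names; the example values and variable names are made up for illustration.

```r
library(scales)

# Built-in transformation for Date objects
dt <- date_trans()
d <- as.Date(c("2013-03-15", "2013-03-16"))
d_num <- dt$transform(d)      # plain numeric: days since 1970-01-01
d_back <- dt$inverse(d_num)   # back to Date

# Built-in transformation for date-times (POSIXct);
# the time zone is fixed to UTC here to keep the example reproducible
tt <- time_trans(tz = "UTC")
p <- as.POSIXct("2013-03-15 18:37:11", tz = "UTC")
p_num <- tt$transform(p)      # plain numeric: seconds since the epoch
p_back <- tt$inverse(p_num)   # back to POSIXct
```

Both transformations need a full date (or date and time) to work with, which is why neither covers a bare time of day.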
Existing solutions
Exercising the first of the three great virtues of a programmer, laziness, it is worth seeing what has already been done (classes and functions) to deal with clock time.
The chron package has a class times which can specify times of day, independent of a date. Additionally, there are many supporting functions for this class:
> methods(class="times")
[1] [.times* [[.times* [<-.times*
[4] as.character.times* as.data.frame.times* axis.times*
[7] Axis.times* c.times* diff.times*
[10] format.times* hist.times* identify.times*
[13] is.na.times* lines.times* Math.times*
[16] mean.times* Ops.times* plot.times*
[19] points.times* pretty.times* print.times*
[22] quantile.times* summary.times* Summary.times*
[25] trunc.times* unique.times* xtfrm.times*
Non-visible functions are asterisked
Following the pattern of the previous post, each of the parts of the transformation can be determined.
transform and inverse
When dealing with a variable that has a class, transform must take the specific representation and convert it to a simple numeric representation (map to [part of] the real line in mathematical terms); inverse does the opposite mapping. Generally, this requires delving into the structure of the class to see how it is really put together. To do that, let’s create some data. The times documentation says it can convert a character vector (by default in 24-hour, minute, second format, separated by colons) to times.
Time <- times(c("18:37:11", "16:51:34", "15:05:57", "13:20:20",
"11:34:43", "09:49:06", "08:03:29", "06:17:52",
"04:32:15", "02:46:38", "01:01:01"))
which, if printed, gives
> Time
[1] 18:37:11 16:51:34 15:05:57 13:20:20 11:34:43 09:49:06
[7] 08:03:29 06:17:52 04:32:15 02:46:38 01:01:01
So far, so good. But what does this object/class really look like?
> str(Time)
Class 'times' atomic [1:11] 0.776 0.702 0.629 0.556 0.482 ...
..- attr(*, "format")= chr "h:m:s"
> dput(Time)
structure(c(0.775821759259259, 0.702476851851852, 0.629131944444444,
0.555787037037037, 0.48244212962963, 0.409097222222222, 0.335752314814815,
0.262407407407407, 0.1890625, 0.115717592592593, 0.0423726851851852
), format = "h:m:s", class = "times")
times are just vectors with an attribute and a class. A little more digging and testing can show that the numeric part is just the fraction of a day that the time represents.
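The fraction-of-a-day reading can be checked by hand for the first value in Time, and it suggests what the two mapping functions could look like. This is only a minimal sketch: the helper names times_to_numeric and numeric_to_times are hypothetical, and it assumes that times() accepts a numeric fraction of a day as input.

```r
library(chron)

# Same first value as in the Time vector above
tm <- times("18:37:11")

# Seconds since midnight divided by the number of seconds in a day
by_hand <- (18 * 3600 + 37 * 60 + 11) / 86400

# as.numeric() drops the class and the format attribute
same <- all.equal(as.numeric(tm), by_hand)

# Hypothetical helpers for the two directions of the mapping
times_to_numeric <- function(x) as.numeric(x)   # candidate transform
numeric_to_times <- function(x) times(x)        # candidate inverse

# Round trip: times -> numeric -> times
back <- numeric_to_times(times_to_numeric(tm))
format(back)
```

Because the stored value is already a plain number between 0 and 1, the mapping in both directions needs no arithmetic at all, only adding or removing the class.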
> str(times(c("00:00:00","6:00:00","12:00:00","23:59:59")))
Class 'times' atomic [1:4] 0 0.25 0.5 1
..- attr(*, "format")= chr "h:m:s"
> dput(times(c("00:00:00","6:00:00","12:00:00","23:59:59")))
structure(c(0, 0.25, 0.5, 0.999988425925926), format = "h