Managing memory in a list of lists data structure
R-bloggers 2013-04-03
First, a confession: instead of using classes and defining methods for them, I build a lot of ad hoc data structures out of lists and then write one-off functions that operate on those lists of lists. I think this is a Perl-ism that has transferred into my R code. I might eventually learn how to do classes, but this hack has been working well enough.
One issue I ran into today is that it was getting tedious to find out which objects stored in the list of lists were taking up the most memory. I ended up writing this rather silly recursive function that may be of use to you if you have also been scarred by Perl.
get.size <- function( obj.to.size, units='Kb') {
  # Check if the object we were passed is a list
  # N.B. Since is(list()) returns c('list', 'vector') we need a
  #      multiple value comparison like all.equal
  # N.B. Since all.equal will either return TRUE or a vector of
  #      differences wrapping it in is.logical is the same as
  #      checking if it returned TRUE.
  if ( is.logical( all.equal( is(obj.to.size) , is(list())))) {
    # Iterate over each element of the list
    lapply(
      obj.to.size ,
      function(xx){
        # Calculate the size of the current element of the list
        # N.B. object.size always returns bytes, but its print
        #      allows different units. Using capture.output allows
        #      us to do the conversion with the print method
        the.size <- capture.output(print(object.size(xx), units=units))
        # This object may itself be a list...
        if( is.logical( all.equal( is(xx), is(list())))) {
          # if so, recurse
          the.rest <- get.size( xx , units)
          return( list(the.size, the.rest) )
        } else {
          # if not, return its size
          return( the.size)
        }
      })
  } else {
    # If the object wasn't a list, return an error.
    stop("The object passed to this function was not a list.")
  }
}
The output looks something like this:
$models
$models[[1]]
[1] "2487.7 Kb"

$models[[2]]
$models[[2]]$naive.model
[1] "871 Kb"

$models[[2]]$clustered.model
[1] "664.5 Kb"

$models[[2]]$gls.model
[1] "951.9 Kb"



$V
[1] "4628.2 Kb"

$fixed.formula
[1] "1.2 Kb"

$random.formula
[1] "2.6 Kb"
where, for any element that is itself a list, the first entry is the total size of everything below it in the hierarchy and the second entry breaks that total down. Therefore, the whole “models” list is 2487.7 Kb, and “models$naive.model” accounts for only 871 Kb of that total.
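If all you really want is a ranking of the biggest items, the nested output of get.size can be a bit awkward to scan. Here is a rough sketch of a flattened variant. To be clear, rank.sizes, is.plain.list and the toy list x are names I made up for illustration, and this is a different approach from get.size: it keeps sizes as raw bytes (so they sort numerically) and uses is.list() plus is.object() instead of the all.equal trick to decide what counts as a plain list.

```r
# Hypothetical helper: TRUE only for bare lists, so classed objects
# such as data frames or fitted models are treated as single items.
is.plain.list <- function(x) is.list(x) && !is.object(x)

# Hypothetical helper: walk a nested list and return one row per
# element (path and size in bytes), sorted from largest to smallest.
rank.sizes <- function(obj, prefix = "") {
  stopifnot(is.list(obj))
  if (length(obj) == 0) {
    return(data.frame(path = character(0), bytes = numeric(0),
                      stringsAsFactors = FALSE))
  }
  nms <- names(obj)
  if (is.null(nms)) nms <- rep("", length(obj))
  rows <- lapply(seq_along(obj), function(i) {
    # Build a path such as models$big, or models[[2]] for unnamed items
    path <- if (nzchar(nms[i])) {
      if (nzchar(prefix)) paste0(prefix, "$", nms[i]) else nms[i]
    } else {
      paste0(prefix, "[[", i, "]]")
    }
    here <- data.frame(path = path,
                       bytes = as.numeric(object.size(obj[[i]])),
                       stringsAsFactors = FALSE)
    # Recurse into plain lists; everything else is a leaf
    if (is.plain.list(obj[[i]])) {
      rbind(here, rank.sizes(obj[[i]], path))
    } else {
      here
    }
  })
  out <- do.call(rbind, rows)
  out <- out[order(out$bytes, decreasing = TRUE), , drop = FALSE]
  rownames(out) <- NULL
  out
}

# Toy data, invented for this example
x <- list(models = list(small = 1:10, big = numeric(1e5)),
          V = matrix(0, 200, 200),
          note = "hello")
sizes <- rank.sizes(x)
print(sizes)
```

As with get.size, a parent list and its children both appear in the result, so the bytes column should not be summed. Exact byte counts will vary with the R version and platform.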