column-store R or: how i learned to stop worrying and love monetdb

R-bloggers 2013-03-18

Summary:

(This article was first published on asdfree by anthony damico, and kindly contributed to R-bloggers)

"Combining R's sophisticated calculations and MonetDB's excellent data access performance is a no-brainer. One gets the best of two (open source) worlds with minimal hassle." - Dr. Hannes Mühleisen

"oh wow that was fast like a cheetah with a jetpack or something" - anthony damico

why try monetdb + r

a speed test of four analysis commands on sixty-seven million physician visit records using my personal laptop --

# calculate the sum of a single variable..
system.time( print( sum( carrier08$car_hcpcs_pmt_amt ) ) )
[1] 3477564780
   user  system elapsed
   0.00    0.00    0.04 seconds

# ..or calculate the sum, mean, median, and count of a single variable with sql
system.time( dbGetQuery( db , 'select sum( car_hcpcs_pmt_amt ), avg( car_hcpcs_pmt_amt ), median( car_hcpcs_pmt_amt ), count(*) from carrier08' ) )
   user  system elapsed
   0.01    0.00   16.86 seconds

# calculate the same statistics, broken down by six age and two gender categories
system.time( dbGetQuery( db , 'select bene_sex_ident_cd, bene_age_cat_cd, sum( car_hcpcs_pmt_amt ), avg( car_hcpcs_pmt_amt ), median( car_hcpcs_pmt_amt ), count(*) from carrier08 group by bene_sex_ident_cd, bene_age_cat_cd' ) )
   user  system elapsed
   0.00    0.01   36.03 seconds

# calculate the same statistics, broken down by six age, two gender,
# [...]

Link:

http://feedproxy.google.com/~r/RBloggers/~3/HX7Htew4CQA/

From feeds:

Statistics and Visualization » R-bloggers

Tags:

Authors:

Anthony Damico

Date tagged:

03/18/2013, 13:02

Date published:

03/18/2013, 02:00