Nested Forecasting with Spark: Blockchain ETF Trends

R-bloggers 2025-10-15

[This article was first published on DataGeeek, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Bitcoin hit an all-time high of $125,664 on October 5. This increase was fueled by a historic net inflow of $3.24 billion into spot Bitcoin ETFs and rising public demand.

In this article, we will predict the trend of two blockchain ETFs using nested forecasting with the Spark backend.

I couldn’t use lagged and smoothing values in recipes because NA caused a problem in the nested data structure.

library(modeltime)library(timetk)library(tidymodels)library(dplyr)library(tidyquant)library(sparklyr)#Connectionsc <- spark_connect(master = "local")#Setup the Spark Backendparallel_start(sc, .method = "spark")#Invesco CoinShares Global Blockchain UCITS ETF (BCHN.L)df_bchn <-   tq_get("BCHN.L") %>%   select(date, 'Invesco CoinShares Global Blockchain' = close) #iShares Blockchain and Tech ETF (IBLC)df_iblc <-   tq_get("IBLC") %>%   select(date, 'iShares Blockchain and Tech' = close) #Creating the survey datadf_survey <-   df_bchn %>%   left_join(df_iblc) %>%   pivot_longer(-date,               names_to = "id",               values_to = "value") %>%   filter(date >= last(date) - months(6)) %>%   drop_na()#Nested Datanested_data_tbl <-   df_survey %>%  dplyr::select(id,                 date = date,                 value = value) %>%  extend_timeseries(    .id_var        = id,    .date_var      = date,    .length_future = 15  ) %>%  nest_timeseries(    .id_var        = id,    .length_future = 15  ) %>%    split_nested_timeseries(    .length_test = 15  )#Modeling  #XGBoostrec_xgb <-   recipe(value ~ ., extract_nested_train_split(nested_data_tbl)) %>%  step_timeseries_signature(date) %>%  step_rm(date) %>%  step_dummy(all_nominal_predictors(), one_hot = TRUE) %>%   step_zv(all_predictors()) %>%  step_impute_linear(all_numeric_predictors())wflw_xgb <-   workflow() %>%  add_model(boost_tree("regression") %>%             set_engine("xgboost")) %>%  add_recipe(rec_xgb)#Prophetrec_prophet <-   recipe(value ~ date, extract_nested_train_split(nested_data_tbl)) %>%   step_date(date, features = c("dow", "month", "year", "doy")) %>%   step_dummy(all_nominal_predictors(), one_hot = TRUE) %>%   step_zv(all_predictors()) %>%  step_impute_linear(all_numeric_predictors())wflw_prophet <-   workflow() %>%  add_model(    prophet_reg("regression") %>%       set_engine("prophet",                 seasonality_yearly = FALSE,                 seasonality_weekly = TRUE,                 seasonality_daily = TRUE)) %>%  add_recipe(rec_prophet)#Nested Forecasting with Sparknested_modeltime_tbl <-   nested_data_tbl %>%  modeltime_nested_fit(    wflw_xgb,    wflw_prophet,    control = control_nested_fit(allow_par = TRUE, verbose = TRUE)  )#Model Test Accuracynested_modeltime_tbl %>%  extract_nested_test_accuracy() %>%  table_modeltime_accuracy(.interactive = T)#Extract Nested Test Accuracybest_nested_modeltime_tbl <-   nested_modeltime_tbl %>%  modeltime_nested_select_best(    metric                = "mape",     minimize              = TRUE,     filter_test_forecasts = TRUE  )#Extract Nested Best Model Reportbest_nested_modeltime_tbl %>%  extract_nested_best_model_report()#Extract Nested Best Test Forecastsbest_nested_modeltime_tbl %>%  extract_nested_test_forecast() %>%  group_by(id) %>%  plot_modeltime_forecast(    .facet_ncol  = 1,    .interactive = FALSE,    .line_size = 1  ) +  labs(title = "Nested Forecasting",        subtitle = "<span style = 'color:dimgrey;'>Predictive Intervals</span> of <span style = 'color:red;'>Prophet</span> Model",        y = "", x = "") +   facet_wrap(~ id,              ncol = 1,              scales = "free_y") +   scale_y_continuous(labels = scales::label_currency()) +  scale_x_date(labels = scales::label_date("%b'%Y"),               date_breaks = "30 days") +  theme_tq(base_family = "Roboto Slab", base_size = 16) +  theme(plot.subtitle = ggtext::element_markdown(face = "bold"),        plot.title = element_text(face = "bold"),        strip.text = element_text(face = "bold"),        axis.text = element_text(face = "bold"),        axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1),        legend.position = "none")
To leave a comment for the author, please follow the link and comment on their blog: DataGeeek.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Continue reading: Nested Forecasting with Spark: Blockchain ETF Trends