simaerep release 1.0.0

R-bloggers 2025-11-05

[This article was first published on R on datistics, and kindly contributed to R-bloggers].

Simulate patient-related events in clinical trials with the goal of detecting over- and under-reporting sites.

Monitoring reporting rates of patient-related events such as adverse events (AE) in clinical trials is important for patient safety. We use bootstrap-based simulation to assign over- and under-reporting probabilities to each site in a clinical trial. The method is inspired by the ‘infer’ R package and Allen Downey’s blog article: “There is only one test!”.
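The bootstrap idea described above can be illustrated with a minimal base R sketch. It is not the {simaerep} implementation: the helper `prob_lower_sketch()` and the toy data are hypothetical, and the sketch ignores visit matching. For one site, it repeatedly draws the same number of patients at random (with replacement) from the whole study and counts how often the simulated mean event count is as low as or lower than the observed site mean.

```r
# Hypothetical helper, not part of {simaerep}: estimates how likely it is to
# see a site mean this low when patients are drawn at random from the study.
prob_lower_sketch <- function(site_events, study_events, n_rep = 1000) {
  observed <- mean(site_events)
  simulated <- replicate(
    n_rep,
    mean(sample(study_events, size = length(site_events), replace = TRUE))
  )
  # share of bootstrap means that are as low as or lower than the observed mean
  mean(simulated <= observed)
}

set.seed(1)
# toy data: events per patient for the whole study and for two sites
study_events <- rpois(500, lambda = 5)
site_typical <- rpois(10, lambda = 5)
site_low <- rpois(10, lambda = 2)

p_typical <- prob_lower_sketch(site_typical, study_events)
p_low <- prob_lower_sketch(site_low, study_events)

# a small value means few random draws are as low as the site,
# which is what an under-reporting site would look like
c(typical = p_typical, low = p_low)
```

Swapping the comparison to `simulated >= observed` would give the analogous estimate for over-reporting.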

Key Risk Indicators

Statistical monitoring of clinical trial sites typically employs several key risk indicators, which are operational metrics derived from activities at the clinical trial sites. The ratio of AEs is typically one of them; the ratio of issues occurring when treating and examining patients according to the study protocol might be another.
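As a rough illustration of such a ratio, the base R sketch below derives events per visit for each site from visit-level data with cumulative event counts. Reading the AE ratio as events per visit is an assumption made for this example, and the toy data are made up. The column names follow the demo data further down in this post.

```r
# toy visit-level data; n_event is the cumulative event count per patient
df_toy <- data.frame(
  site_id = c("S1", "S1", "S1", "S1", "S2", "S2", "S2"),
  patient_id = c("P1", "P1", "P2", "P2", "P3", "P3", "P3"),
  visit = c(1, 2, 1, 2, 1, 2, 3),
  n_event = c(1, 3, 0, 2, 0, 0, 1)
)

# the last row per patient holds the patient totals (visits, cumulative events)
df_toy <- df_toy[order(df_toy$patient_id, df_toy$visit), ]
df_pat <- df_toy[!duplicated(df_toy$patient_id, fromLast = TRUE), ]

# aggregate patient totals to site level and derive the events-per-visit ratio
df_site <- aggregate(cbind(visit, n_event) ~ site_id, data = df_pat, FUN = sum)
df_site$event_per_visit <- df_site$n_event / df_site$visit
df_site
```

A site whose ratio sits far below that of comparable sites is a candidate for under-reporting, but the raw ratio alone does not say how unusual the difference is.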

Release v1.0.0

Announcement

We are happy to announce the release of {simaerep} v1.0.0 and {gsm.simaerep} v0.2.0, our open source R packages designed to detect clinical trial sites that are under- or over-reporting patient-level clinical events. We have condensed our experience and user feedback from the past 5 years to design a more professional user experience with default settings that reflect our latest recommendations. We have also adapted the algorithm to support the detection of over-reporting of low-incidence terminal events such as patient discontinuations. {gsm.simaerep} provides important data preprocessing functions and a standardized approach to integrate {simaerep} into an end-to-end analysis and reporting pipeline using the good statistical monitoring {gsm} framework.

Release Highlights

  • New user interface for simaerep() function with defaults reflecting the latest recommendations
  • Better output structure, over- and under-reporting probability combined into one score
  • Support for low-incidence terminal events (e.g., patient discontinuations)

Detailed Release Notes

Demo

{simaerep}

suppressPackageStartupMessages(library(simaerep))
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(knitr))

set.seed(1)

df_visit <- sim_test_data_study(
  n_pat = 1000, # number of patients in study
  n_sites = 100, # number of sites in study
  ratio_out = 0.02, # ratio of sites with outlier
  factor_event_rate = -0.5, # rate of under-reporting
  # non-constant event rates based on gamma distribution
  event_rates = (dgamma(seq(1, 20, 0.5), shape = 5, rate = 2) * 5) + 0.1,
  max_visit = 20,
  max_visit_sd = 10,
  study_id = "A"
)

df_visit %>%
  select(study_id, site_id, patient_id, visit, n_event) %>%
  head(25) %>%
  knitr::kable()
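|study_id |site_id |patient_id | visit| n_event|
|:--------|:-------|:----------|-----:|-------:|
|A        |S0001   |P000001    |     1|       0|
|A        |S0001   |P000001    |     2|       2|
|A        |S0001   |P000001    |     3|       2|
|A        |S0001   |P000001    |     4|       4|
|A        |S0001   |P000001    |     5|       6|
|A        |S0001   |P000001    |     6|       7|
|A        |S0001   |P000001    |     7|       7|
|A        |S0001   |P000001    |     8|       7|
|A        |S0001   |P000001    |     9|       7|
|A        |S0001   |P000001    |    10|       7|
|A        |S0001   |P000001    |    11|       7|
|A        |S0001   |P000001    |    12|       7|
|A        |S0001   |P000001    |    13|       7|
|A        |S0001   |P000002    |     1|       3|
|A        |S0001   |P000002    |     2|       3|
|A        |S0001   |P000002    |     3|       5|
|A        |S0001   |P000002    |     4|       8|
|A        |S0001   |P000002    |     5|       8|
|A        |S0001   |P000002    |     6|       9|
|A        |S0001   |P000002    |     7|       9|
|A        |S0001   |P000002    |     8|       9|
|A        |S0001   |P000002    |     9|       9|
|A        |S0001   |P000002    |    10|       9|
|A        |S0001   |P000002    |    11|       9|
|A        |S0001   |P000002    |    12|       9|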
evrep <- simaerep(df_visit, mult_corr = TRUE)

plot(evrep, study = "A")

{gsm.simaerep}

library(gsm.simaerep)
library(gsm.kri)

dfInput <- Input_CumCount(
  dfSubjects = clindata::rawplus_dm,
  dfNumerator = clindata::rawplus_ae,
  dfDenominator = clindata::rawplus_visdt %>%
    dplyr::mutate(visit_dt = lubridate::ymd(visit_dt)),
  strSubjectCol = "subjid",
  strGroupCol = "siteid",
  strGroupLevel = "Site",
  strNumeratorDateCol = "aest_dt",
  strDenominatorDateCol = "visit_dt"
)

dfAnalyzed <- Analyze_Simaerep(dfInput)

dfFlagged <- Flag_Simaerep(dfAnalyzed, vThreshold = c(-0.99, -0.95, 0.95, 0.99))
#> ℹ Sorted dfFlagged using custom Flag order: 2.Sorted dfFlagged using custom Flag order: -2.Sorted dfFlagged using custom Flag order: 1.Sorted dfFlagged using custom Flag order: -1.Sorted dfFlagged using custom Flag order: 0.

gsm.kri::Visualize_Scatter(
  dfFlagged,
  dfBounds = NULL,
  strGroupLabel = "GroupLevel",
  strUnit = "Visits"
)
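The four values passed as `vThreshold` split the score range into five bands, which line up with the five flag values (-2, -1, 0, 1, 2) named in the message printed by `Flag_Simaerep()`. The base R sketch below shows one way such a mapping could work. It is not the {gsm.simaerep} implementation: the helper `flag_sketch()` is hypothetical, the treatment of scores that fall exactly on a threshold is an assumption, and so is the reading that negative scores indicate under-reporting.

```r
# Hypothetical helper, not part of {gsm.simaerep}: maps a score to a flag
# using four ascending thresholds.
flag_sketch <- function(score, thresholds = c(-0.99, -0.95, 0.95, 0.99)) {
  # findInterval() counts how many thresholds are <= score (0 to 4);
  # subtracting 2 maps the five bands to the flags -2, -1, 0, 1, 2
  findInterval(score, thresholds) - 2L
}

scores <- c(-0.995, -0.97, 0.10, 0.96, 0.999)
flags <- flag_sketch(scores)
data.frame(score = scores, flag = flags)
```

Under these assumptions, only scores beyond the outer thresholds receive the strongest flags, while everything between -0.95 and 0.95 stays unflagged.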

IMPALA

simaerep has been published as a work product of the Inter-Company Quality Analytics (IMPALA) consortium. IMPALA aims to engage with Health Authority inspectors on defining guiding principles for the use of advanced analytics to complement, enhance and accelerate current QA practices. simaerep was initially developed at Roche but is currently being evaluated by other companies across the industry to complement their quality assurance activities (see testimonials).


Resources

Publications

Koneswarakantha, B., Adyanthaya, R., Emerson, J. et al. An Open-Source R Package for Detection of Adverse Events Under-Reporting in Clinical Trials: Implementation and Validation by the IMPALA (Inter coMPany quALity Analytics) Consortium. Ther Innov Regul Sci (2024). https://doi.org/10.1007/s43441-024-00631-8

Koneswarakantha, B., Barmaz, Y., Ménard, T. et al. Follow-up on the Use of Advanced Analytics for Clinical Quality Assurance: Bootstrap Resampling to Enhance Detection of Adverse Event Under-Reporting. Drug Saf (2020). https://doi.org/10.1007/s40264-020-01011-5
