Working with Ordinal Ranks in {marginaleffects}

R-bloggers 2025-05-02

[This article was first published on Stat's What It's All About, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Given an ordinal regression model, it is relatively easy to get class-wise predictions – the conditional predicted probability of each level of the outcome. However, often, one might be interested in summarizing effects not on the class-wise probability scale (nor on the latent scale), but instead on the rank scale – the expected ordinal level, expressed as a single number.

The {emmeans} package has the mode = "mean.class" that does just this.

Let’s see how we can do this in {marginaleffects} by utilizing the powerful hypothesis= argument.

Fit a model

library(tidyverse)library(marginaleffects)bfi <- psych::bfi |>   mutate(    gender = factor(gender),    A1 = ordered(A1)  ) |>   drop_na(A1, age, gender)nrow(bfi)#> [1] 2784fit <- MASS::polr(A1 ~ age + gender,                   data = bfi,                  Hess = TRUE)

Class-Wise Predictions

pr1 <- predictions(fit, variables = "gender")pr1#> #>  Group Estimate Std. Error     z Pr(>|z|)     S   2.5 % 97.5 %#>      1   0.1800    0.01136 15.85   <0.001 185.5 0.15777 0.2023#>      1   0.1886    0.01128 16.71   <0.001 205.9 0.16649 0.2107#>      1   0.1843    0.01132 16.28   <0.001 195.6 0.16210 0.2065#>      1   0.1843    0.01132 16.28   <0.001 195.6 0.16210 0.2065#>      1   0.1843    0.01132 16.28   <0.001 195.6 0.16210 0.2065#> --- 33398 rows omitted. See ?print.marginaleffects --- #>      6   0.0289    0.00335  8.64   <0.001 57.3 0.02234 0.0355#>      6   0.0231    0.00264  8.78   <0.001 59.1 0.01798 0.0283#>      6   0.0219    0.00250  8.76   <0.001 58.8 0.01699 0.0268#>      6   0.0207    0.00238  8.71   <0.001 58.1 0.01604 0.0254#>      6   0.0121    0.00165  7.35   <0.001 42.2 0.00891 0.0154#> Type:  probsnrow(pr1) # original data * 2 (counterfactual gender) * 6 (levels of A1)#> [1] 33408

For each observation in the original data frame we now have 12 predictions: - 2 counterfactual predictions for the two levels of gender, times - 6 (arguably also counterfactual) predictions for each of the possible outcome levels.

Sum-scores (mean rank)

We can average the class ranks by their predicted probability to obtain “sum scores”:

$score_i = \sum_{k=1}^{K}{p_{ik}\times k}$ And we can do this for each row in the counterfactual data frame (marked by the rowidcf variable):

sum_score <- function(x) {  x |>     arrange(rowidcf, group, gender) |>     # for each counterfactual row and level of gender    summarise(      term = "sum score",      estimate = sum(estimate * (1:6)),             .by = c(rowidcf, gender)    )}pr2 <- predictions(fit, variables = "gender",                    hypothesis = sum_score)pr2#> #>  Estimate Std. Error    z Pr(>|z|)   S 2.5 % 97.5 %#>      3.02     0.0583 51.8   <0.001 Inf  2.90   3.13#>      2.51     0.0451 55.6   <0.001 Inf  2.42   2.60#>      2.97     0.0553 53.7   <0.001 Inf  2.86   3.08#>      2.46     0.0415 59.4   <0.001 Inf  2.38   2.55#>      3.00     0.0568 52.8   <0.001 Inf  2.88   3.11#> --- 5558 rows omitted. See ?print.marginaleffects --- #>      2.24     0.0305 73.4   <0.001   Inf  2.18   2.30#>      2.67     0.0473 56.4   <0.001   Inf  2.57   2.76#>      2.20     0.0304 72.2   <0.001   Inf  2.14   2.26#>      2.26     0.0645 35.0   <0.001 890.8  2.13   2.38#>      1.85     0.0457 40.6   <0.001   Inf  1.77   1.94#> Term: sum score#> Type:  probsnrow(pr2) # original data * 2 (counterfactual gender)#> [1] 5568

As expected, we now have 2 predictions for each observation in the original data frame: - 2 counterfactual predictions for the two levels of gender

Note that we could have also computed the mean rank for the two levels of gender, or literally anything else. But here I am trying to mimic the basic behavior of the avg_/comparisons() functions by keeping with the workflow:

Compute observation wise counterfactual predictions
Contrast them for each observation
(Possibly) average them

So let’s get those contrasts!

Contrast the ranks

sum_score.contr <- function(x) {  # same as before, and...  sum_score(x) |>     # compute a single contrast for each counterfactual row    summarise(      term = "gender (2-1)",      estimate = estimate[gender == "2"] - estimate[gender == "1"],             .by = rowidcf    )}

This function will compute a difference between the rank for each gender for each observation.

pr3 <- predictions(fit, variables = "gender",                    hypothesis = sum_score.contr)pr3#> #>  Estimate Std. Error     z Pr(>|z|)    S  2.5 % 97.5 %#>    -0.511     0.0588 -8.68   <0.001 57.8 -0.626 -0.395#>    -0.506     0.0584 -8.67   <0.001 57.7 -0.621 -0.392#>    -0.509     0.0586 -8.68   <0.001 57.8 -0.623 -0.394#>    -0.509     0.0586 -8.68   <0.001 57.8 -0.623 -0.394#>    -0.509     0.0586 -8.68   <0.001 57.8 -0.623 -0.394#> --- 2774 rows omitted. See ?print.marginaleffects --- #>    -0.504     0.0582 -8.67   <0.001 57.6 -0.618 -0.390#>    -0.483     0.0560 -8.62   <0.001 57.1 -0.593 -0.373#>    -0.477     0.0554 -8.61   <0.001 56.9 -0.586 -0.368#>    -0.471     0.0548 -8.59   <0.001 56.7 -0.578 -0.363#>    -0.403     0.0488 -8.26   <0.001 52.6 -0.499 -0.307#> Term: gender (2-1)#> Type:  probsnrow(pr3) # original data#> [1] 2784

Average Contrast

Finally, we can get the average difference:

# average contrast avg_sum_score.contr <- function(x) {  sum_score.contr(x) |>     summarise(      term = "avg. gender (2-1)",      estimate = mean(estimate)    )}predictions(fit, variables = "gender",             hypothesis = avg_sum_score.contr)#> #>  Estimate Std. Error     z Pr(>|z|)    S  2.5 % 97.5 %#>    -0.474     0.0551 -8.61   <0.001 56.9 -0.582 -0.366#> #> Term: avg. gender (2-1)#> Type:  probs

POMP

POMP (percent of maximum possible) are a for of standardized units for Likert-type items (see Solomon Kurz’s excellent blog post for more – better – info).

These are linear transformations of the ranks described above:

$POMP_i = \frac{score_i - \text{min}}{\text{max} - \text{min}}$ We can obtain POMPs for each row in the counterfactual data frame by further processing the sum-scores:

POMP <- function(x) {  sum_score(x) |>     mutate(      estimate = 100 * (estimate - 1) / (6 - 1)    )}predictions(fit, variables = "gender",             hypothesis = POMP)#> #>  Estimate Std. Error    z Pr(>|z|)     S 2.5 % 97.5 %#>      40.4      1.166 34.6   <0.001 870.6  38.1   42.7#>      30.2      0.902 33.5   <0.001 812.8  28.4   31.9#>      39.4      1.107 35.6   <0.001 920.9  37.3   41.6#>      29.3      0.829 35.3   <0.001 905.5  27.7   30.9#>      39.9      1.135 35.1   <0.001 896.3  37.7   42.1#> --- 5558 rows omitted. See ?print.marginaleffects --- #>      24.7      0.609 40.6   <0.001   Inf  23.5   25.9#>      33.3      0.945 35.3   <0.001 902.9  31.5   35.2#>      23.9      0.608 39.3   <0.001   Inf  22.7   25.1#>      25.2      1.289 19.5   <0.001 279.4  22.6   27.7#>      17.1      0.914 18.7   <0.001 256.9  15.3   18.9#> Term: sum score#> Type:  probs

And likewise get contrasts -

POMP.contr <- function(x) {  POMP(x) |>     summarise(      term = "gender (2-1)",      estimate = estimate[gender == "2"] - estimate[gender == "1"],             .by = rowidcf    )}predictions(fit, variables = "gender",             hypothesis = sum_score.contr)#> #>  Estimate Std. Error     z Pr(>|z|)    S  2.5 % 97.5 %#>    -0.511     0.0588 -8.68   <0.001 57.8 -0.626 -0.395#>    -0.506     0.0584 -8.67   <0.001 57.7 -0.621 -0.392#>    -0.509     0.0586 -8.68   <0.001 57.8 -0.623 -0.394#>    -0.509     0.0586 -8.68   <0.001 57.8 -0.623 -0.394#>    -0.509     0.0586 -8.68   <0.001 57.8 -0.623 -0.394#> --- 2774 rows omitted. See ?print.marginaleffects --- #>    -0.504     0.0582 -8.67   <0.001 57.6 -0.618 -0.390#>    -0.483     0.0560 -8.62   <0.001 57.1 -0.593 -0.373#>    -0.477     0.0554 -8.61   <0.001 56.9 -0.586 -0.368#>    -0.471     0.0548 -8.59   <0.001 56.7 -0.578 -0.363#>    -0.403     0.0488 -8.26   <0.001 52.6 -0.499 -0.307#> Term: gender (2-1)#> Type:  probs

And average contrasts -

# average contrast avg_POMP.contr <- function(x) {  POMP.contr(x) |>     summarise(      term = "avg. gender (2-1)",      estimate = mean(estimate)    )}predictions(fit, variables = "gender",             hypothesis = avg_POMP.contr)#> #>  Estimate Std. Error     z Pr(>|z|)    S 2.5 % 97.5 %#>     -9.49        1.1 -8.61   <0.001 56.9 -11.6  -7.33#> #> Term: avg. gender (2-1)#> Type:  probs

A note for Bayesians

Hey, Bayesians, how are you doing?

I have some bad news: {marginaleffects} does not support hypothesis=<function>.

But I also have some good news! You can basically do all of this by getting the posterior draws in an rvar format, and then directly manipulating the posterior(s) - so the examples above should basically work out of the box.

For example:

library(brms)library(posterior)fit_b <- brm(A1 ~ age + gender,              data = bfi,             family = cumulative(),             prior = NULL) # obviously this is a bad priorpr1_b <- predictions(fit_b, variables = "gender")

We just need to adapt the function(s) written above to work properly with rvars. I’ve marked

avg_POMP.contr_b <- function(x) {  x |>     arrange(rowidcf, group, gender) |>     # for each counterfactual row and level of gender    summarise(      term = "sum score",      # estimate = sum(estimate * (1:6)), 1      estimate = rvar_sum(rvar * (1:6)),            .by = c(rowidcf, gender)    ) |>     # POMP    mutate(      estimate = 100 * (estimate - 1) / (6 - 1)    ) |>     # contrasts    summarise(      term = "gender (2-1)",      estimate = estimate[gender == "2"] - estimate[gender == "1"],             .by = rowidcf    ) |>     # average    summarise(      term = "avg. gender (2-1)",      # estimate = mean(estimate)2      estimate = rvar_mean(estimate)    )}get_draws(pr1_b, shape = "rvar") |>  avg_POMP.contr_b()#>                term   estimate#> 1 avg. gender (2-1) -9.4 ± 1.1

1: Use the rvar column and the rvar_sum() function (instead of sum())
2: Use the rvar_mean() function (instead of mean())

To leave a comment for the author, please follow the link and comment on their blog: Stat's What It's All About.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Continue reading: Working with Ordinal Ranks in {marginaleffects}