Design considerations for subgroup analyses in cluster-randomized trials based on aggregated individual-level predictors


In research assessing the effect of an intervention or exposure, a key secondary objective is often to assess whether that effect differs across subgroups of interest; this is commonly referred to as effect modification or heterogeneity of treatment effects (HTE). Observed HTE can have important implications for policy, including intervention strategies (e.g., will some patients benefit more from the intervention than others?) and resource prioritization (e.g., to reduce observed health disparities). Analysis of HTE is well understood in studies where the independent unit is an individual. In contrast, in studies where the independent unit is a cluster (e.g., a hospital or school) and a cluster-level outcome is used in the analysis, it is less well understood how to proceed when the HTE analysis of interest involves an individual-level characteristic (e.g., self-reported race) that must be aggregated at the cluster level. Through simulations, we show that only individual-level models have power to detect HTE by individual-level variables; if outcomes must be defined at the cluster level, then there is often low power to detect HTE by the corresponding aggregated variables. We illustrate the challenges inherent to this type of analysis in a study assessing the effect of an intervention to increase COVID-19 booster vaccination rates at long-term care centers.