Covariate selection and adaptive randomization for sequential experiments

Lead Investigator: Li Yang, Renmin University of China
Title of Proposal Research: Covariate selection and adaptive randomization for sequential experiments
Vivli Data Request: 9771
Funding Source: None
Potential Conflicts of Interest: None

Summary of the Proposed Research:

Background
A randomized controlled trial (RCT) is a study design frequently used in medicine and the social sciences to evaluate the effectiveness of an intervention or treatment. In an RCT, participants are randomly assigned to different groups: the treatment group, which receives the intervention being studied, and the control group, which does not receive the intervention (or receives a placebo or standard treatment). By randomly assigning participants, RCTs aim to minimize the influence of confounding factors and ensure that any observed differences between the groups can be attributed to the intervention.

Covariate balance refers to the similarity of participant characteristics between the treatment and control groups in an RCT. Covariates are variables that may affect the outcome of interest, and balancing them between the groups helps ensure that any observed differences in the outcome are due to the intervention and not due to pre-existing differences in participant characteristics. It is important to achieve covariate balance because imbalances in baseline characteristics can confound the treatment effect estimates[1].
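
As a rough illustration of how covariate balance can be quantified, the sketch below computes a standardized mean difference (SMD) for one covariate under simple randomization. The SMD is one common balance measure and is not necessarily the one used in this research; the function name, the simulated ages, and the sample size are all assumptions made for the example.

```python
import math
import random
from statistics import mean, variance

def standardized_mean_difference(treated, control):
    """Difference in group means divided by the pooled standard deviation."""
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd

# Hypothetical data: ages of 200 participants assigned by simple randomization.
rng = random.Random(0)
ages = [rng.gauss(60, 10) for _ in range(200)]
arms = [rng.choice(["treatment", "control"]) for _ in ages]

treated = [age for age, arm in zip(ages, arms) if arm == "treatment"]
control = [age for age, arm in zip(ages, arms) if arm == "control"]

# An SMD near zero indicates that the two groups are similar on this covariate.
print(f"SMD for age: {standardized_mean_difference(treated, control):.3f}")
```

An SMD of zero means the group means coincide; larger absolute values indicate a larger imbalance on that covariate.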

Covariate adaptive randomization (CAR) is a technique used in RCTs to improve covariate balance across treatment groups. In traditional randomization, participants are assigned to treatment groups purely at random or with a fixed allocation ratio, without regard to their characteristics. This approach cannot guarantee covariate balance. Two surveys of covariate balance in RCTs published in four leading medical journals found that about 4% to 6% of covariates were significantly imbalanced[2,3]. In CAR, the allocation of participants to treatment groups is influenced by certain participant characteristics, or covariates. The goal of CAR is to ensure that the distribution of covariates is balanced between treatment groups, which can help reduce potential confounding and increase the precision of treatment effect estimates. The randomization process takes into account the observed values of relevant covariates for each participant and adjusts the allocation probabilities accordingly.
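
To make the idea of adjusting allocation probabilities concrete, the sketch below shows a minimization-style rule in the spirit of Pocock and Simon, one well-known CAR procedure. It is a minimal illustration and not the method proposed in this request; the function name, the two categorical covariates, and the biased-coin probability of 0.8 are assumptions chosen for the example.

```python
import random

ARMS = ("treatment", "control")

def minimization_assign(new_patient, enrolled, rng, p_favor=0.8):
    """Assign an arm to new_patient, given a list of (covariates, arm) pairs.

    For each candidate arm, compute the total marginal imbalance that would
    result if the new patient joined that arm, then choose the arm with the
    smaller imbalance with probability p_favor (a biased coin).
    """
    imbalance = {}
    for candidate in ARMS:
        total = 0
        for name, level in new_patient.items():
            # Count earlier patients who share this covariate level, per arm.
            counts = {arm: 0 for arm in ARMS}
            for covariates, arm in enrolled:
                if covariates[name] == level:
                    counts[arm] += 1
            counts[candidate] += 1
            total += abs(counts["treatment"] - counts["control"])
        imbalance[candidate] = total

    if imbalance["treatment"] == imbalance["control"]:
        return rng.choice(ARMS)  # no preference: fall back to a fair coin
    favored = min(ARMS, key=imbalance.get)
    other = "control" if favored == "treatment" else "treatment"
    return favored if rng.random() < p_favor else other

# Hypothetical trial: 200 patients arriving one at a time.
rng = random.Random(1)
enrolled = []
for _ in range(200):
    patient = {"sex": rng.choice("MF"), "diabetic": rng.choice(["yes", "no"])}
    enrolled.append((patient, minimization_assign(patient, enrolled, rng)))

# Tabulate treatment and control counts within each level of sex.
for level in "MF":
    n_t = sum(1 for c, a in enrolled if c["sex"] == level and a == "treatment")
    n_c = sum(1 for c, a in enrolled if c["sex"] == level and a == "control")
    print(level, n_t, n_c)
```

Keeping p_favor below 1 preserves randomness in every assignment, while still steering the allocation toward the arm that improves balance.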

Necessity of research
Although many CAR methods have been proposed in the literature[4], a major limitation is that they all assume the important covariates are known and fixed, which is often not the case in reality. In practice, researchers often need to select important prognostic factors from a large number of candidates, such as demographic and disease characteristics, patient history, biomarkers, and genetic information. This is especially true in the era of big data and precision medicine, as a rapidly growing number of variables has become available. The selection of covariates is usually based on the existing medical literature, but evidence from historical studies can be scarce or controversial. Therefore, the problem remains of how to select the important prognostic factors that are predictive of the treatment outcome.

A sequential experiment is a type of study design in which data is collected and analyzed in an ongoing, sequential manner, allowing for adaptive decision-making throughout the study. Sequential randomization refers to a method of assigning participants to different treatment groups in a sequential experiment. In this approach, participants are allocated to treatment groups in an ongoing, adaptive manner based on the data accumulated at each stage of the study. Many contemporary RCTs are conducted globally at multiple sites, with recruitment periods that span several months or even years. In this sequential framework, it is not uncommon for early-enrolled patients to have their outcome data collected and assessed before other subjects enter the trial. This information allows important covariates to be detected and provides an opportunity to improve the balance of these covariates during randomization.

This research proposes a novel covariate adaptive randomization method in sequential clinical trials that identifies and balances important covariates during randomization. As the RCT progresses and data is collected, important covariates are selected by the proposed method, and the randomization process is adjusted to ensure that the treatment groups remain balanced in terms of these covariates.
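
The general idea of selecting covariates from accumulating data and then balancing them can be sketched as follows. This is only a toy illustration under strong assumptions and is not the algorithm proposed in this request, whose details are not given here: selection is done by simple correlation screening at fixed interim points, outcomes are assumed to be observed immediately, covariates are assumed to be standardized, and all names and constants (such as `select_covariates`, `p_favor`, and the interim interval of 50 patients) are hypothetical.

```python
import random

def abs_correlation(xs, ys):
    """Absolute Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return abs(sxy) / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0

def select_covariates(covariates, outcomes, k):
    """Indices of the k covariates most correlated with observed outcomes."""
    p = len(covariates[0])
    scores = [abs_correlation([row[j] for row in covariates], outcomes)
              for j in range(p)]
    return sorted(range(p), key=lambda j: -scores[j])[:k]

def assign(new_cov, covariates, arms, selected, rng, p_favor=0.8):
    """Biased-coin assignment favoring the arm that reduces imbalance
    (absolute difference in covariate totals) on the selected covariates."""
    def imbalance(candidate):
        total = 0.0
        for j in selected:
            diff = sum(row[j] if arm == "treatment" else -row[j]
                       for row, arm in zip(covariates, arms))
            diff += new_cov[j] if candidate == "treatment" else -new_cov[j]
            total += abs(diff)
        return total

    d_t, d_c = imbalance("treatment"), imbalance("control")
    if d_t == d_c:  # also covers the start, when nothing is selected yet
        return rng.choice(["treatment", "control"])
    favored, other = (("treatment", "control") if d_t < d_c
                      else ("control", "treatment"))
    return favored if rng.random() < p_favor else other

# Hypothetical trial: 5 candidate covariates, only the first affects outcome.
rng = random.Random(0)
covariates, arms, outcomes, selected = [], [], [], []
for i in range(300):
    x = [rng.gauss(0, 1) for _ in range(5)]
    if i >= 50 and i % 50 == 0:  # interim look: re-select covariates
        selected = select_covariates(covariates, outcomes, k=1)
    arm = assign(x, covariates, arms, selected, rng)
    covariates.append(x)
    arms.append(arm)
    effect = 1.0 if arm == "treatment" else 0.0
    outcomes.append(2.0 * x[0] + effect + rng.gauss(0, 1))

print("selected covariate indices:", selected)
```

Before the first interim look no covariate is selected, so the rule reduces to a fair coin; after that, assignments are steered toward balance on whichever covariates the screening step currently flags.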

The proposed method provides a tool for future RCTs and is especially valuable for trials with little prior knowledge about which covariates are important to balance. Under the proposed method, statistical inference on the treatment effect is expected to be more efficient and easier to interpret, because the covariates are balanced well enough that a simple comparison of outcomes between groups will be sufficient.

Study design
As this is statistical methodology research, a randomization algorithm has been designed and its performance has been examined through extensive numerical simulation studies. This request is made in order to test the performance of the proposed method on a real dataset from a clinical trial.

Requested Studies:

A Phase III Randomised, Double-blind Trial to Evaluate Efficacy and Safety of Once Daily Empagliflozin 10 mg Compared to Placebo, in Patients With Chronic Heart Failure With Preserved Ejection Fraction (HFpEF)
Data Contributor: Boehringer Ingelheim
Study ID: NCT03057951
Sponsor ID: 1245.110