The integration of clinical trial data, real-world evidence, and novel statistical methods to optimise control arm recruitment in relapsed and refractory multiple myeloma

Lead Investigator: Lewis Carpenter, Arcturis
Title of Proposal Research: The integration of clinical trial data, real-world evidence, and novel statistical methods to optimise control arm recruitment in relapsed and refractory multiple myeloma
Vivli Data Request: 8997
Funding Source: All researchers are paid employees of Arcturis. This project is being carried out as part of Arcturis’ internal development.
Potential Conflicts of Interest: The researchers provide consultancy to many pharmaceutical companies as part of their roles at Arcturis.
All analyses will be pre-planned as part of a study protocol, which will be published in a real-world evidence registry where possible. Any conflicts that arise will be declared in any subsequent publications.

Summary of the Proposed Research:

Multiple myeloma (MM) is a type of blood cancer which affects the way white blood cells make antibodies. It’s the second most common blood cancer, and accounts for 1% of all cancers and about 10% of all blood cancers. Compared to people without MM, patients with MM are only 55% as likely to survive for at least 5 years. There is no known cure; however, new treatments developed in recent years have allowed patients to live longer. Unfortunately, most patients’ disease will eventually stop responding to treatment. These patients will have to move on to other treatments, but the options available to them are still limited.

Randomised controlled trials (RCTs) are research studies used to determine how effective a new treatment is in comparison to a different, more established treatment. In an RCT, patients are randomly assigned (like flipping a coin) to either receive the new treatment or a control treatment, which could be a proven treatment or placebo, i.e. something that looks like medicine but has no medicinal effects. This helps researchers compare the benefits and side-effects between the two. Sometimes it’s not possible to do this kind of study; for example, if a new treatment seems to be highly effective and safe from preliminary studies, it would be unethical to not provide it to those eligible to receive it. In these cases, researchers will instead use existing data from previous studies or electronic health records to create a control group. This approach allows the new treatment to be compared to a control treatment, but the type of people receiving the new treatment and control may be different, so the comparison may not be as accurate as in an RCT. In this study, we will use electronic health records (EHR) to create an External Control Arm (ECA) and compare it to the existing control group in the ICARIA-MM trial. An ECA can either replace or add to an existing control group and is useful for proving how well the new treatment works. However, finding good data to create ECAs can be challenging, especially when it comes to rare or uncommon diseases. A consequence of this is that there is still uncertainty as to how ECAs should be developed when trying to provide a comparison for a trial, and how ECA data should be analysed alongside trial data.

The primary goal of this study will be to determine scenarios in which construction of an ECA can lead to accurate assessment of the relative effectiveness of new treatments when compared to treatments that are currently in use. The aims of this study will be achieved by constructing an ECA dataset using de-identified routine electronic health data for over 6700 MM patients available through the Arcturis Data Platform. This dataset will allow us to choose patients who could have been included in the ICARIA-MM trial and evaluate common trial outcomes such as how long a patient survives after receiving treatment. This mimics an RCT without the additional burden on patients and associated recruitment costs of an actual control group.

We will compare the benefits of the new treatment in the ICARIA-MM study against the ECA. We will show differences in survival between the two groups using Kaplan-Meier curves. To account for any differences between the patients in both groups, we will use statistical techniques like inverse probability weighting that adjust the data based on each individual patient’s characteristics. Additionally, we will use a method called Bayesian borrowing to investigate how we should select patients for the ECA based on the characteristics of the original control group of the trial.

Statistical Analysis Plan:

Naïve and Adjusted ECA Analyses

Covariate balance in the ECA and the individual patient data (IPD) of NCT02990338 will be assessed through the calculation of summary statistics and the visualisation of empirical covariate distributions. Additionally, outcomes in the ECA and both the intervention and control arms of NCT02990338 will be compared descriptively through the visualisation of Kaplan-Meier curves. Where less than 10% of data are missing, the affected records will be excluded; where more than 10% are missing, imputation techniques will be considered. For time-to-event analyses, event and censoring definitions in the ECA will be implemented as described in the study protocol for NCT02990338. Once descriptive analyses have been performed, a qualitative comparison of the naïve outcome distribution for the ECA and NCT02990338 control cohort will be performed. A comparative effectiveness analysis will be performed using the log-rank test between IsaPomDex exposed patients in NCT02990338 and ECA PomDex exposed patients.
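
As an illustration of the descriptive and naïve comparative steps described above, the following is a minimal sketch of a Kaplan-Meier estimator and a two-group log-rank test using only the Python standard library. The function names and the example follow-up data are hypothetical and are not taken from the study; in practice an established survival analysis package would likely be used.

```python
from math import erfc, sqrt


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value, 1 df)."""
    times_a, times_b = list(times_a), list(times_b)
    events_a, events_b = list(events_a), list(events_b)
    event_times = sorted(
        {t for t, e in zip(times_a + times_b, events_a + events_b) if e}
    )
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2.0))


# Hypothetical follow-up times (months) and event flags (1 = event, 0 = censored).
trial_times, trial_events = [2, 5, 7, 9, 12], [1, 0, 1, 1, 0]
eca_times, eca_events = [1, 3, 4, 6, 8], [1, 1, 1, 0, 1]
print(kaplan_meier(trial_times, trial_events))
print(logrank_test(trial_times, trial_events, eca_times, eca_events))
```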

Key covariates that are expected to confound the causal relationship between intervention exposure (i.e., NCT02990338 membership or ECA membership) and progression-free survival or overall survival will be included in a propensity score model. Propensity scores will be estimated for each individual using logistic regression, with IsaPomDex exposure as the binary outcome variable. Estimated propensity scores will then be converted into inverse probability of treatment weights and included in a marginal Cox proportional hazards model with IsaPomDex exposure as the sole covariate. Violation of the proportional hazards assumption will be assessed, and a weighted restricted mean survival approach will be used if clear proportional hazards violations are identified. The estimated treatment effects from these models, and reweighted Kaplan-Meier curves, will then be compared to those obtained from the naïve comparison.
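
The propensity score and weighting steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: a single hypothetical covariate, a logistic regression fitted by Newton-Raphson, and unstabilised weights of 1/p for exposed and 1/(1 - p) for unexposed patients. The plan does not specify the form of the weights, so that choice, along with all names and data, is an assumption. The subsequent weighted Cox model is not shown.

```python
from math import exp


def fit_logistic(x, z, iterations=25):
    """Fit logit P(z = 1 | x) = a + b * x by Newton-Raphson; returns (a, b)."""
    a = b = 0.0
    for _ in range(iterations):
        p = [1.0 / (1.0 + exp(-(a + b * xi))) for xi in x]
        # Gradient of the log-likelihood.
        g0 = sum(zi - pi for zi, pi in zip(z, p))
        g1 = sum((zi - pi) * xi for zi, pi, xi in zip(z, p, x))
        # Entries of the 2x2 information matrix.
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def iptw_weights(x, z):
    """Return (propensity scores, unstabilised inverse probability weights)."""
    a, b = fit_logistic(x, z)
    scores = [1.0 / (1.0 + exp(-(a + b * xi))) for xi in x]
    weights = [
        1.0 / p if zi == 1 else 1.0 / (1.0 - p) for zi, p in zip(z, scores)
    ]
    return scores, weights


def weighted_mean_difference(x, z, weights):
    """Difference in weighted covariate means, exposed minus unexposed."""
    def wmean(group):
        num = sum(w * xi for xi, zi, w in zip(x, z, weights) if zi == group)
        den = sum(w for zi, w in zip(z, weights) if zi == group)
        return num / den
    return wmean(1) - wmean(0)


# Hypothetical covariate values and exposure flags (1 = IsaPomDex, 0 = ECA PomDex).
covariate = [0, 1, 2, 3, 4, 5, 6, 7]
exposure = [0, 0, 1, 0, 1, 0, 1, 1]
scores, weights = iptw_weights(covariate, exposure)
print(weighted_mean_difference(covariate, exposure, [1.0] * 8))  # unweighted
print(weighted_mean_difference(covariate, exposure, weights))    # weighted
```

Comparing the unweighted and weighted differences gives a quick check of whether weighting has improved covariate balance before the outcome model is fitted.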

Sensitivity analyses in which the definition of progression is relaxed in the NCT02990338 IPD and the EHR ECA will also be performed to investigate the effect of deviating from the definition of progression commonly used in clinical trials.

Bayesian Borrowing

All Bayesian borrowing analyses will utilise the commensurate prior approach to borrowing information from the ECA to reduce the number of control cohort patients required from NCT02990338. Ten percent of the NCT02990338 control cohort will be supplemented with cohort members from the ECA, with both progression-free survival and overall survival assessed as outcomes. The outcome regression model will consist of a Bayesian Weibull regression, with covariates included in the linear predictor. The results from this initial borrowing analysis will be compared to results obtained from further borrowing analyses in which individual inclusion/exclusion criteria will be dropped when constructing the ECA cohort. The variance and posterior median of the marginal posterior distribution for the regression coefficient for IsaPomDex exposure will be retained for each of these analyses and compared.
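
To give some intuition for how a commensurate prior controls the amount of borrowing, the sketch below uses a deliberately simplified setting: normal approximations to the trial and ECA control estimates, and a fixed commensurability precision `tau`. This is not the Bayesian Weibull regression described above, where the precision would usually be given its own prior and estimated from the data; the function name and all numeric values are hypothetical.

```python
def commensurate_posterior(b_trial, var_trial, b_ext, var_ext, tau):
    """Posterior mean and variance of the trial control parameter.

    Assumes the trial parameter is centred on the external (ECA) parameter
    with precision tau, and that both estimates are approximately normal.
    """
    # Marginal prior variance: external uncertainty plus between-source drift.
    prior_var = var_ext + 1.0 / tau
    post_precision = 1.0 / var_trial + 1.0 / prior_var
    post_var = 1.0 / post_precision
    post_mean = post_var * (b_trial / var_trial + b_ext / prior_var)
    return post_mean, post_var


# Hypothetical log-hazard estimates for the trial and ECA control cohorts.
for tau in (0.01, 1.0, 100.0):
    mean, var = commensurate_posterior(0.0, 1.0, 1.0, 1.0, tau)
    print(f"tau={tau}: posterior mean={mean:.3f}, variance={var:.3f}")
```

A small `tau` implies little commensurability, so the posterior stays close to the trial-only estimate; a large `tau` moves the result towards full pooling of the two sources and reduces the posterior variance.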

Further analyses will be performed with the ECA cohort constructed using the complete inclusion/exclusion criteria for NCT02990338 and with covariates included in the regression model reflecting covariate imbalances that are identified between the NCT02990338 IsaPomDex cohort and the control cohort constructed from both ECA and NCT02990338 PomDex individuals. The IPD from NCT02990338 will be required to enable suitable assessment of covariate imbalance and subsequent covariate adjustment within the Bayesian regression model. The resulting marginal posterior distribution of treatment effect will be compared to the outcomes reported in NCT02990338.

Requested Studies:

A Phase 3 Randomized, Open-label, Multicenter Study Comparing Isatuximab (SAR650984) in Combination With Pomalidomide and Low-Dose Dexamethasone Versus Pomalidomide and Low-Dose Dexamethasone in Patients With Refractory or Relapsed and Refractory Multiple Myeloma
Data Contributor: Sanofi
Study ID: NCT02990338
Sponsor ID: EFC14335

Summary of Results:

Assessment of the content of the ICARIA-MM biochemistry data identified that many of the biochemistry tests required to identify the presence of progressive disease were not recorded at the start of each cycle of therapy. The quantified concentrations of kappa and lambda serum free light chains (sFLC) are a required aspect of the International Myeloma Working Group (IMWG) progression criteria and are regularly recorded biochemistry test results in real-world datasets. The frequency with which sFLC testing is performed in real clinical practice makes it key for assessing real-world progression, particularly when myeloma isotype information is unavailable. In such cases, the IMWG criteria’s hierarchical structure is often replaced by simultaneous evaluation of all available biochemistry tests for progressive disease.
The ICARIA-MM study protocol specifies that quantified kappa and lambda sFLC were to be measured at each disease assessment, with the ratio of the sFLC also reported. However, a detailed assessment of the content of the ICARIA-MM dataset revealed that while the sFLC ratio was routinely recorded, records of specific kappa or lambda sFLC values were not consistently available. Nevertheless, we acknowledge that quantified sFLC values were collected as part of the ICARIA-MM study, and that serum FLC values are not necessarily required to diagnose IMWG-defined disease progression.
Further assessment of the trial dataset demonstrated that the recording of biochemistry tests required for the assessment of disease progression ceased upon discontinuation of the therapy used in the trial. Where PFS2 progression events were recorded after discontinuation of therapy, these were based only on assessment by the investigating clinician and were not accompanied by the underlying test results that may have been used to identify the progression event. This means that assessment of the presence of a subsequent progression event was severely limited using the data recorded for the trial. For example, if radiographic progression data were excluded when constructing real-world progression, any patient who was known to have experienced radiographic progression in the trial would appear to have no progression event but also no subsequent test results. As a result, large numbers of patients who had progressive disease when IMWG criteria were applied became censored before progression could be observed when real-world progression criteria were employed instead.
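
The censoring effect described above can be illustrated with a small, purely hypothetical sketch. The function, field names, source labels, and patient values are all invented for illustration and do not reflect the structure of the ICARIA-MM dataset; deaths are ignored for simplicity.

```python
def derive_pfs(progression_day, progression_source, last_lab_day, allowed_sources):
    """Return (time, event) for one patient under a given progression definition.

    A progression only counts as an event if its source is permitted by the
    definition in use; otherwise the patient is censored at the last
    laboratory assessment.
    """
    if progression_day is not None and progression_source in allowed_sources:
        return progression_day, 1
    return last_lab_day, 0


# Hypothetical patient: radiographic progression on day 120, last lab test on day 112.
trial_definition = {"biochemical", "radiographic"}
real_world_definition = {"biochemical"}  # assumed: radiographic data excluded

print(derive_pfs(120, "radiographic", 112, trial_definition))
print(derive_pfs(120, "radiographic", 112, real_world_definition))
```

Under the first definition the patient contributes a progression event; under the second the same patient is censored at the last laboratory test, because no further results exist from which a later progression could be detected.
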
The results from our assessment indicated that the availability of progression tests at each disease assessment was lower than expected. Additionally, real-world progression criteria using sFLC could not be fully applied to the trial data due to the lack of consistently available absolute sFLC levels. Consequently, the sparsity of recorded tests combined with the incomplete availability of quantified sFLC levels restricted the range of real-world progression criteria that could be applied. Therefore, a comprehensive investigation of the range of scenarios under which real-world ECA analyses can accurately assess relative efficacy could not be performed. Given the limitations when constructing PFS, it was decided that it would not be viable to proceed to a full ECA study.
While we did not fully complete our statistical analysis plan (SAP), the results highlight the challenges of comparing clinical outcomes between real-world and clinical trial data sources. Applying progression criteria commonly used in real-world datasets to clinical trial data revealed limitations in longitudinal data collection, driven by the trial’s targeted scope of data capture, as well as issues with how certain laboratory test results that were not integral to the outcome of the trial were stored. These findings suggest that the most straightforward approach for analysing PFS across these sources is to use real-world progression criteria for real-world datasets while leaving trial-recorded PFS data unaltered, avoiding attempts to align it with real-world definitions.