Early prediction of survival outcome of large-scale randomized clinical trials using artificial intelligence technology

Lead Investigator: Keiichi Fujiwara, Gynecologic Oncology Trial and Investigation Consortium (GOTIC)
Title of Proposal Research: Early prediction of survival outcome of large-scale randomized clinical trials using artificial intelligence technology.
Vivli Data Request: 7224
Funding Source: None
Potential Conflicts of Interest: None

Summary of the Proposed Research:

In large-scale randomized phase 3 trials, an interim analysis that assesses the survival rate at a certain time during the trial is usually performed to evaluate early efficacy, futility, or both. It uses data from patients who have died and patients who are still alive, known as uncensored and censored data, respectively. The interim analysis does not quantitatively consider the distribution of observed times for either the uncensored or the censored data. Adjustments are needed to counter inflation of statistical error, loss of power, or both, depending on the focus of the interim analysis. In addition, prespecified decision boundaries can lead to inappropriate conclusions in either direction (‘trial success’ or ‘trial failure’). Further, the “go” or “no-go” decision for the ongoing study depends on a single statistic. That single number comes without a confidence interval, which limits its reliability.

The Necessity of the Research;
How many patients/members of the public are potentially affected;

If the data still missing from a large-scale phase 3 trial can be reliably predicted as early as possible, human resources, financial resources, and time will be saved. The feasibility of this method therefore deserves investigation.

How the research will add to medical science or patient care;

If it becomes possible to accurately predict the results of large-scale comparative studies with a smaller sample size, it will be possible to reduce the number of patients exposed to experimental treatments. The approach could also be applied to comparative studies of rare diseases.

How the research will be conducted;
We would like to obtain survival data from five randomized controlled trials in cancer from Vivli and analyze them. We want to see how quickly and how accurately our artificial intelligence (AI)-based method can predict survival in trials with different diseases and different patterns of survival results.
Specifically, for each trial, we will input the patient survival data into the AI program in the order of enrollment and determine the number of cases at which the predicted results become similar to the original survival analysis results.
After that, we will continue to input the data up to the last patient and verify whether the analysis results of the AI program fluctuate in the process.
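One plausible way to picture the sequential input is as a snapshot of the trial on a chosen date: patients are ordered by randomization date, patients not yet enrolled are dropped, and events that occur after that date are treated as censored. The sketch below illustrates only this bookkeeping, not the AI program itself. The field names `randomized` and `event`, the function name `snapshot`, and the example dates are hypothetical.

```python
from datetime import date

def snapshot(patients, cutoff):
    """Return (time_in_days, event_flag) pairs as known on the cutoff date.

    event_flag is 1 for an observed event (uncensored) and 0 for censored.
    Field names are assumptions made for this illustration.
    """
    rows = []
    # Process patients in order of enrollment (randomization date).
    for p in sorted(patients, key=lambda p: p["randomized"]):
        if p["randomized"] > cutoff:
            continue  # not yet enrolled on the cutoff date
        event = p["event"]  # a date, or None if no event was ever recorded
        if event is not None and event <= cutoff:
            rows.append(((event - p["randomized"]).days, 1))
        else:
            # Event unknown on the cutoff date: censor at the cutoff.
            rows.append(((cutoff - p["randomized"]).days, 0))
    return rows

# Made-up example records, not trial data.
patients = [
    {"randomized": date(2020, 1, 10), "event": date(2020, 6, 1)},
    {"randomized": date(2020, 2, 1), "event": None},
    {"randomized": date(2020, 3, 15), "event": date(2021, 1, 20)},
    {"randomized": date(2020, 9, 1), "event": date(2020, 12, 1)},
]
interim = snapshot(patients, date(2020, 7, 1))
print(interim)
```

Moving the cutoff date forward and repeating the call gives the growing series of interim data sets on which fluctuations in the predictions could be checked.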

What design and methods you have chosen and why (in brief);
The method used here is completely original.
We have developed and published a new method for the prediction of clinical trial results using compressive sensing of artificial intelligence (AI).
This new method can generate a list of predicted data that mimics a completed clinical trial from a smaller number of known data.
Unlike contemporary interim analysis algorithms, this method:
1) uses AI that can classify the data distributions of both the known uncensored data and the known censored data,
2) can generate predicted time data by compressive sensing for both uncensored and censored data,
3) does not assume any distribution for the time data,
4) can generate hundreds of P-values calculated by a survival test such as the log-rank test, which presumably leads to a robust conclusion, and
5) if validated, would be able to predict the survival outcome in the form of a P-value distribution profile at any time during the study.
The boundary value of statistical error as a function of information fraction is designed using the Lan-DeMets spending function to approximate an O’Brien-Fleming bound. The power is set to 0.8.
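For orientation, one common form of the Lan-DeMets spending function of the O’Brien-Fleming type gives the cumulative type I error spent at information fraction t as alpha(t) = 2 - 2*Phi(z_(alpha/2) / sqrt(t)). The sketch below evaluates that formula only. It does not compute the stopping boundaries themselves, which need recursive numerical integration. The overall significance level of 0.05 and the four equally spaced looks are assumptions for illustration and are not stated in this proposal.

```python
from statistics import NormalDist

def obf_alpha_spent(t, alpha=0.05):
    """Cumulative alpha spent at information fraction t (0 < t <= 1).

    Lan-DeMets spending function of the O'Brien-Fleming type:
    alpha(t) = 2 - 2 * Phi(z_{alpha/2} / sqrt(t)).
    alpha=0.05 is an assumed overall level.
    """
    if not 0.0 < t <= 1.0:
        raise ValueError("information fraction must be in (0, 1]")
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1.0 - alpha / 2.0)
    return 2.0 - 2.0 * std_normal.cdf(z / t ** 0.5)

# Assumed schedule of four equally spaced looks.
fractions = [0.25, 0.5, 0.75, 1.0]
spent = [obf_alpha_spent(t) for t in fractions]
for t, a in zip(fractions, spent):
    print(f"information fraction {t:.2f}: cumulative alpha spent {a:.5f}")
```

Very little alpha is spent at early looks, and the full level is reached only at t = 1, which is why this function is described as approximating an O’Brien-Fleming bound.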

Statistical Analysis Plan:

Each study is investigated independently. We expect that Kaplan-Meier curves with log-rank P-values are available from the published papers. Deidentified data for every patient will be sent to the investigators: date of randomization, first event (progression), second event (death), and final status, each given as a date or marked as not available. Records with missing values are excluded, and only complete data are used. The predictions of progression-free survival and overall survival for each data set are investigated independently. The appropriate sample sizes for arm A and arm B are already known. For each arm, the censored and uncensored time-to-event data set can be expressed as a function of an arbitrary date between randomization and the final event. Before the final full data set is generated, the final numbers of uncensored and censored cases are predicted from the data available by the determined date, assuming that the censored/uncensored ratio follows a normal distribution; the predictions are reported at the 5th, 25th, 50th, 75th, and 95th percentiles of the raw data set, or as the mean with 75% and 95% confidence intervals of a regression function. Then, the data distribution profile of the data set at the arbitrary date is determined by artificial intelligence for the data in each arm independently. The compressive sensing method using the distribution profile will be applied to supplement the missing data and predict the final data set for each arm. A single log-rank P-value is then obtained for each predicted data set of arms A and B. Applying this to all predicted data sets gives the P-value distribution profile of the log-rank tests. Because the P-value distribution profile is a function of the determined date, the earliest date at which the actual P-value can be predicted can be investigated.
Finally, this method is compared with alpha spending as a function of information fraction, designed using the Lan-DeMets spending function to approximate an O’Brien-Fleming bound.
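The last steps of the plan, one log-rank P-value per predicted data set followed by a summary of all P-values, can be sketched as below. This is a minimal illustration using the standard two-arm log-rank test. It does not implement the AI classification or the compressive sensing step, so the predicted data sets are supplied here as small hand-made stand-ins. The function names, the 0.05 threshold, and the choice of summary (median and share below the threshold) are assumptions.

```python
import math
from statistics import median

def logrank_p(times_a, events_a, times_b, events_b):
    """Two-sided log-rank P-value; event flags are 1 (uncensored) or 0 (censored)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    observed_minus_expected = 0.0
    variance = 0.0
    for tj in sorted({t for t, e, _ in data if e}):  # distinct event times
        n = sum(1 for t, _, _ in data if t >= tj)    # at risk, both arms
        n_a = sum(1 for t, _, g in data if t >= tj and g == 0)  # at risk, arm A
        d = sum(1 for t, e, _ in data if t == tj and e)         # events, both arms
        d_a = sum(1 for t, e, g in data if t == tj and e and g == 0)
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))

def p_value_profile(predicted_sets, threshold=0.05):
    """Summarize log-rank P-values over many predicted data sets."""
    p_values = [logrank_p(*s) for s in predicted_sets]
    share = sum(p < threshold for p in p_values) / len(p_values)
    return {"p_values": p_values, "median": median(p_values), "share_below": share}

# Hand-made stand-ins for predicted data sets (times_a, events_a, times_b, events_b).
separated = ([1, 2, 3, 4, 5], [1] * 5, [6, 7, 8, 9, 10], [1] * 5)
identical = ([1, 2, 3, 4, 5], [1] * 5, [1, 2, 3, 4, 5], [1] * 5)
profile = p_value_profile([separated, identical])
print(profile["median"], profile["share_below"])
```

Repeating the summary for data sets predicted at successive dates would show how the P-value profile changes over time, which is the quantity the plan proposes to compare against the alpha-spending approach.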

Requested Studies:

AURELIA: A Multi-center, Open-label, Randomised, Two-arm Phase III Trial of the Effect on Progression Free Survival of Bevacizumab Plus Chemotherapy Versus Chemotherapy Alone in Patients With Platinum-resistant, Epithelial Ovarian, Fallopian Tube or Primary Peritoneal Cancer
Data Contributor: Roche
Study ID: NCT00976911
Sponsor ID: MO22224

A Phase III, Multicenter, Randomized, Blinded, Placebo-controlled Trial of Carboplatin and Gemcitabine Plus Bevacizumab in Patients With Platinum-sensitive Recurrent Ovary, Primary Peritoneal, or Fallopian Tube Carcinoma
Data Contributor: Roche
Study ID: NCT00434642
Sponsor ID: AVF4095g

A Double-blind, Randomised, Placebo Controlled Phase III Study of Nintedanib Plus Best Supportive Care (BSC) Versus Placebo Plus BSC in Patients With Metastatic Colorectal Cancer Refractory to Standard Therapies.
Data Contributor: Boehringer Ingelheim
Study ID: NCT02149108
Sponsor ID: 1199.52

A Randomized, Double-blind, Multicenter Phase 3 Study of Irinotecan, Folinic Acid, and 5-Fluorouracil (FOLFIRI) Plus Ramucirumab or Placebo in Patients With Metastatic Colorectal Carcinoma Progressive During or Following First-Line Combination Therapy With Bevacizumab, Oxaliplatin, and a Fluoropyrimidine
Data Contributor: Lilly
Study ID: NCT01183780
Sponsor ID: 13856

A Randomized, Open-label Phase III Intergroup Study: Effect of Adding Bevacizumab to Cross Over Fluoropyrimidine Based Chemotherapy (CTx) in Patients With Metastatic Colorectal Cancer and Disease Progression Under First-line Standard CTx/Bevacizumab Combination
Data Contributor: Roche
Study ID: NCT00700102
Sponsor ID: ML18147

Public Disclosures:

Miyagi, Y., Fujiwara, K., Nomura, H., Yamamoto, K., & Coleman, R. L. (2023). Feasibility of New Method for the Prediction of Clinical Trial Results Using Compressive Sensing of Artificial Intelligence. British Journal of Healthcare and Medical Research, 10(1), 237–267. doi: 10.14738/bjhmr.101.14061