Determinants of study design for vaccines against opportunistic bacterial infections: a proof-of-concept for data-driven, individual patient data (IPD) analyses.

Lead Investigator: Igor Stojkov, Paul-Ehrlich-Institut
Title of Proposal Research: Determinants of study design for vaccines against opportunistic bacterial infections: a proof-of-concept for data-driven, individual patient data (IPD) analyses.
Vivli Data Request: 7849
Funding Source: The COMBINE project has received funding from the Innovative Medicines Initiative 2 Joint Undertaking (JU) under grant agreement No 853967. The JU receives support from the European Union’s Horizon 2020 research and innovation programme and EFPIA companies’ in-kind contribution.
Potential Conflicts of Interest: Affiliation with a regulatory agency (Paul-Ehrlich-Institut, PEI). The affiliation will be disclosed in the publication.

Summary of the Proposed Research:
Background
In the era of antimicrobial resistance (AMR), there is a need to improve and accelerate the development of alternative therapies and preventive approaches to limit the spread of this ‘silent epidemic’. Although vaccines have the potential to slow AMR, most of the infections listed by the World Health Organization (WHO) and the Centers for Disease Control and Prevention (CDC) as AMR threats cannot be prevented by a licensed vaccine. Most of these multi-drug resistant pathogens cause opportunistic, hospital-acquired infections (HAI) and are known under the acronyms ESKAPE or ESCAPE (Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter). Two major problems hamper the clinical development of vaccines against these pathogens. First, there are no established correlates of protection (immune markers statistically associated with vaccine-induced protection) or surrogates of protection (correlates on the direct causal pathway between vaccine and protection) that could serve as surrogate endpoints in clinical trials. Second, the low incidence of the disease makes it necessary to include large numbers of participants. The COMBINE project, part of the Innovative Medicines Initiative (IMI) AMR Accelerator, aims to make clinical trials of novel vaccines and antibiotics against AMR pathogens more feasible by critically assessing and improving their design and analysis.

Aim of the analysis

Individual patient data (IPD) (meta-)analyses have grown increasingly popular in various areas of evidence-based medicine. To our knowledge, however, they have not yet been used to investigate clinical trials of vaccines against opportunistic bacterial infections.

In other branches of medicine and biology, data-driven investigations have informed basic research, helped discover previously unknown mechanisms, and identified features of patients likely to develop a disease or respond to treatment, thus driving research in personalized and precision medicine.

We aim to provide a proof-of-concept that a data-driven, merged analysis of individual patient data from multiple clinical trials can help explain past failures in the development of vaccines against opportunistic bacterial infections and inform future product development. The success of this analysis depends on the availability of data from clinical trials run by different sponsors and investigating different products: such data increase variability and minimize systematic effects associated with a particular product or a company's operating procedures.

We plan to analyse further clinical data as part of the same project. These data originate from other clinical trials (Phase II/Phase III) of vaccine candidates against S. aureus, C. difficile, P. aeruginosa and extraintestinal pathogenic E. coli (ExPEC). Currently, we do not plan to submit further data requests on Vivli (either the data cannot be shared or the companies do not have agreements with Vivli), but we are submitting a request for data from one product outside the Vivli platform.

Statistical Analysis Plan:

We plan to use descriptive, supervised (regression models, classification, boosting, random forest) and unsupervised (cluster) statistical techniques to analyse the data. The final analysis plan will be defined on the basis of the shared data and in agreement with the contributing data owners. The data will be analysed using the statistical software R.
Broadly formulated, the planned steps of the analysis are as follows:
1. We will merge and harmonize the data in terms of patient characteristics, efficacy endpoints and immunogenicity endpoints (including, if applicable, repeated measurements over time).
2. We will explore the relationships between the immunogenicity and efficacy/clinical endpoints (symptomatic infection, mortality) in an exploratory, data-driven fashion and with respect to patient characteristics across the studies.
3. We will explore patient characteristics across the studies in the four groups defined by vaccination and symptomatic-infection status: vaccinated/symptomatically infected, vaccinated/not (symptomatically) infected, not vaccinated/symptomatically infected, and not vaccinated/not (symptomatically) infected.
4. We will explore immunogenicity endpoints across the studies in the same four groups.
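
Steps 3 and 4 above can be sketched as a cross-classification of participants followed by a per-group summary. The example below is a minimal illustration only: it is written in Python rather than R (the software named in this plan), the field names `trial`, `vaccinated`, `infected` and `titre` are hypothetical, and the records are invented rather than taken from any requested study. The geometric mean titre (GMT) is used as one possible immunogenicity summary.

```python
from collections import defaultdict
from statistics import geometric_mean

# Invented records for illustration; field names are assumptions, not the
# variable names used in the requested studies.
records = [
    {"trial": "A", "vaccinated": True,  "infected": False, "titre": 800.0},
    {"trial": "A", "vaccinated": True,  "infected": True,  "titre": 200.0},
    {"trial": "A", "vaccinated": False, "infected": True,  "titre": 50.0},
    {"trial": "B", "vaccinated": False, "infected": False, "titre": 100.0},
    {"trial": "B", "vaccinated": True,  "infected": False, "titre": 1600.0},
    {"trial": "B", "vaccinated": True,  "infected": False, "titre": 400.0},
]

def group_label(rec):
    """Assign a participant to one of the four vaccination/infection groups."""
    vac = "vaccinated" if rec["vaccinated"] else "not vaccinated"
    inf = "infected" if rec["infected"] else "not infected"
    return f"{vac}/{inf}"

def summarise_by_group(recs):
    """Return group size and geometric mean titre for each group present."""
    titres = defaultdict(list)
    for rec in recs:
        titres[group_label(rec)].append(rec["titre"])
    return {g: {"n": len(v), "gmt": geometric_mean(v)} for g, v in titres.items()}

summary = summarise_by_group(records)
for group, stats in sorted(summary.items()):
    print(f"{group}: n={stats['n']}, GMT={stats['gmt']:.0f}")
```

The same grouping could be applied within each trial separately before pooling, so that differences between trials are not mistaken for differences between groups.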

Regardless of which data we receive, we will first integrate the data from the multiple trials. This task requires harmonising the definitions of the relevant variables: we will ensure that merged variables can be interpreted unequivocally and flag any underlying systematic differences. We will perform this task with the help of colleagues with clinical and biological expertise. This step is considered preparatory to the statistical analysis.
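
One way to picture the harmonisation step is a per-trial codebook that maps each trial's own variable names and codings onto a shared schema and records anything it cannot map. The sketch below is illustrative only and written in Python rather than R; the trial labels, column names (`AGEYRS`, `SEX`, `gender`, `site`) and codings are hypothetical, and real codebooks would be drawn up together with the clinical and biological experts mentioned above.

```python
# Hypothetical per-trial codebooks mapping source variables onto a shared
# schema. All names and codes are assumptions made for this illustration.
CODEBOOKS = {
    "trial_A": {"rename": {"AGEYRS": "age_years", "SEX": "sex"},
                "sex_codes": {"M": "male", "F": "female"}},
    "trial_B": {"rename": {"age": "age_years", "gender": "sex"},
                "sex_codes": {"1": "male", "2": "female"}},
}

def harmonise(trial, row):
    """Map one source row onto the shared schema, flagging what does not fit."""
    book = CODEBOOKS[trial]
    out = {"trial": trial, "flags": []}
    for source_name, value in row.items():
        target = book["rename"].get(source_name)
        if target is None:
            # Keep a record of variables with no agreed common definition.
            out["flags"].append(f"unmapped variable: {source_name}")
            continue
        out[target] = value
    if "sex" in out:
        raw = str(out["sex"])
        coded = book["sex_codes"].get(raw)
        if coded is None:
            out["flags"].append(f"unknown sex code: {raw}")
        out["sex"] = coded
    return out

merged = [
    harmonise("trial_A", {"AGEYRS": 67, "SEX": "F"}),
    harmonise("trial_B", {"age": 54, "gender": "1", "site": "DE-03"}),
]
for row in merged:
    print(row)
```

Keeping the flags alongside the merged rows makes systematic differences between trials visible at analysis time instead of silently discarding them.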

Secondly, we will analyse the data according to specific subtopics described below (if not otherwise specified, the analyses will be carried out using descriptive statistics and confidence intervals):
1. (Definition of VE) Whenever data about the disease are available (occurrence, time to event, clinical definition), we will calculate and compare different definitions of vaccine efficacy (VE) (Halloran et al., Design and Analysis of Vaccine Studies, 2010; estimands framework described in ICH E9(R1)). We will compare different measurements of VE within one trial and, whenever possible, across different trials/products for the same indication. This analysis will inform the choice of VE definition and analysis for future trials within the given indication.
2. (Candidate correlates of protection) We will explore and model the relationship between immunogenicity endpoints and disease onset/severity using graphical methods (scatterplots, boxplots), knowledge-driven models (i.e. regression with adjustment for prespecified covariates) and data-driven approaches (e.g. boosting, random forest or elastic net using a larger set of covariates potentially linked to the outcome variable for biological or technical reasons). Depending on the number of infections in the trials, methods for sparse data might be used. This analysis will suggest candidate correlates of protection to be investigated and validated in future studies.
3. (Subgroup analyses, knowledge-driven) At trial, product and indication level, we will define subgroups based on baseline demographic and clinical characteristics (including age, sex/gender, country, relevant comorbidities, baseline antibiotics usage) and describe the infection rate/immunogenicity parameters per subgroup. We will assess whether similar populations show similar values of immunogenicity, infections and VE across the trials. If groups are large enough, we will repeat analyses 1 and 2 in the defined subgroups. This analysis will describe whether groups defined on baseline parameters have the potential to show marked differences in the study outcome, and whether population-specific VE definitions and correlates of protection can be proposed.
4. (Subgroup analyses, data-driven) At trial, product and indication level, we will use unsupervised techniques (e.g. clustering) to define populations with similar immunogenicity outcomes. We will describe their demographic and clinical characteristics and compare them to the subgroups studied under point 3. We will compare similarities and differences of the resulting subgroups across trials and products to confirm or refute the robustness of our findings. If groups are large enough, we will repeat analyses 1 and 2 in the defined subgroups. This analysis, compared to the results of point 3, will give further perspectives regarding inclusion and exclusion criteria in future trials.
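
To illustrate what comparing different definitions of VE in point 1 can mean in practice, the sketch below contrasts two common definitions: one based on the attack rate (cumulative incidence) and one based on the incidence rate per unit of person-time. It is a minimal example in Python rather than R. The case counts and person-time are invented and are not results from any trial, and the Wald-type confidence intervals on the log scale are just one simple choice, which breaks down when an arm has zero cases (the sparse-data situation mentioned in point 2).

```python
import math

Z95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval

def ve_from_ratio(ratio, se_log_ratio):
    """Turn a risk or rate ratio and its log-scale SE into VE with a 95% CI."""
    lower = 1.0 - ratio * math.exp(Z95 * se_log_ratio)
    upper = 1.0 - ratio * math.exp(-Z95 * se_log_ratio)
    return {"ve": 1.0 - ratio, "ci95": (lower, upper)}

def ve_attack_rate(cases_vac, n_vac, cases_ctl, n_ctl):
    """VE = 1 - risk ratio, using cumulative incidence in each arm."""
    rr = (cases_vac / n_vac) / (cases_ctl / n_ctl)
    se = math.sqrt(1 / cases_vac - 1 / n_vac + 1 / cases_ctl - 1 / n_ctl)
    return ve_from_ratio(rr, se)

def ve_incidence_rate(cases_vac, pt_vac, cases_ctl, pt_ctl):
    """VE = 1 - rate ratio, using cases per unit of person-time in each arm."""
    irr = (cases_vac / pt_vac) / (cases_ctl / pt_ctl)
    se = math.sqrt(1 / cases_vac + 1 / cases_ctl)
    return ve_from_ratio(irr, se)

# Invented numbers for illustration only; both functions assume at least one
# case in each arm.
by_risk = ve_attack_rate(cases_vac=20, n_vac=1000, cases_ctl=40, n_ctl=1000)
by_rate = ve_incidence_rate(cases_vac=20, pt_vac=950.0, cases_ctl=40, pt_ctl=900.0)

for name, res in (("attack rate", by_risk), ("incidence rate", by_rate)):
    lo, hi = res["ci95"]
    print(f"VE ({name}): {res['ve']:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

The two definitions give different estimates whenever follow-up time differs between arms, which is one reason the choice of definition matters for trial design.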
Other analyses, e.g. of secondary efficacy endpoints, might be performed to clarify and further characterise the findings.
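
As a simple descriptive starting point for the candidate-correlate exploration in point 2, the observed infection risk can be tabulated by band of an immunogenicity measure. The sketch below is written in Python rather than R and uses invented titres and outcomes; the tertile banding and the function name `risk_by_titre_tertile` are assumptions made for illustration, not part of the analysis plan, and the regression and machine-learning models named in point 2 would go well beyond it.

```python
from statistics import quantiles

def risk_by_titre_tertile(titres, infected):
    """Return observed infection risk within low/middle/high titre tertiles."""
    cut_low, cut_high = quantiles(titres, n=3)  # two cut points
    counts = {"low": [0, 0], "middle": [0, 0], "high": [0, 0]}  # [cases, n]
    for titre, case in zip(titres, infected):
        if titre <= cut_low:
            band = "low"
        elif titre <= cut_high:
            band = "middle"
        else:
            band = "high"
        counts[band][0] += int(case)
        counts[band][1] += 1
    # Skip empty bands to avoid dividing by zero.
    return {band: cases / n for band, (cases, n) in counts.items() if n > 0}

# Invented data for illustration only.
titres = [10, 20, 30, 40, 50, 60, 70, 80, 90]
infected = [1, 1, 0, 1, 0, 0, 0, 0, 0]

risks = risk_by_titre_tertile(titres, infected)
for band, risk in risks.items():
    print(f"{band} titre: observed infection risk {risk:.2f}")
```

A decreasing risk across bands would only suggest a candidate correlate; as stated in point 2, any such marker would still need to be investigated and validated in future studies.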

Requested Studies:

Efficacy, Immunogenicity, and Safety Study of Clostridium Difficile Toxoid Vaccine in Subjects at Risk for C. Difficile Infection (Cdiffense™)
Data Contributor: Sanofi
Study ID: NCT01887912
Sponsor ID: H-030-014

A Phase II Randomized, Placebo-Controlled, Double-Blind, Dose Ranging Study of A Clostridium Difficile Toxoid Vaccine (ACAM-CDIFF™) in Subjects With Clostridium Difficile Infection (CDI)
Data Contributor: Sanofi
Study ID: NCT00772343
Sponsor ID: H-030-011

Safety and Immunogenicity of Different Formulations of a Clostridium Difficile Toxoid Vaccine Administered at Three Different Schedules in Adults Aged 40 to 75 Years at Risk of C. Difficile Infection
Data Contributor: Sanofi
Study ID: NCT01230957
Sponsor ID: H-030-012

Public Disclosure:

Stojkov, I., Hofner, B. Exploring the Association Between Study Characteristics and Post-Vaccine Immunogenicity for C. diff: Data-Driven Analyses of Two Vaccine Trials. AMR Conference. 2025.