Data-sharing and re-analysis for main studies assessed by the European Medicines Agency – a cross-sectional study on European Public Assessment Reports

Lead Investigator: Florian Naudet, CHU Rennes
Title of Proposal Research: Data-sharing and re-analysis for main studies assessed by the European Medicines Agency – a cross-sectional study on European Public Assessment Reports
Vivli Data Request: 5853
Funding Source: Project funded by the French National Research Agency; Grant number ANR-17-CE-36-0010-01
Potential Conflicts of Interest: FN received a grant from the Agence Nationale de la Recherche (grant number ANR-17-CE-36-0010-01)

Summary of the Proposed Research:

The influence of main trials (i.e. evidence used for drug marketing approval) as assessed by the European Medicines Agency (EMA) is paramount. These studies have a major impact on drug Marketing Authorizations and can change the practices of European medical practitioners and the care offered to millions of patients in the European Union. Because of the major financial conflicts of interest inherent in the evaluation of pharmaceuticals, stakeholders are typically more confident when the results and conclusions of these studies can be verified. For a long time, however, transparency was lacking and the individual patient data (IPD) and accompanying material (code, protocol, data analysis plan, etc.) needed to reproduce these analyses were unavailable. An empirical analysis suggests that only a small number of re-analyses of randomized controlled trials (RCTs) have been published to date; of these, only a minority were conducted by entirely independent authors. Data-sharing enabling such re-analyses is being increasingly mandated in medicine.

Indeed, the EMA aimed to pioneer transparency in this field when, in November 2010, it decided to share every piece of documentation received, in the wake of the first version of policy 0043. As part of its transparency policy, the EMA publishes European Public Assessment Reports (EPARs) after the European Commission’s decision on the specific medicines. These reports include, amongst other documents, results of main trials. On 2 October 2014 the EMA released its policy 0070 on “publication of clinical data for medicinal products for human use”. The agency describes a two-step approach. From 1 January 2015, clinical reports on medicines submitted for Marketing Authorization have been published. The second step involves the publication of IPD; a date for its implementation still needs to be fixed. However, as a result of Brexit and the relocation of the EMA to the Netherlands, further developments and renovation have been stopped for the moment. Efforts are therefore still needed to reach full transparency in the EMA.

On the other hand, biopharmaceutical companies (i.e. Pharmaceutical Research and Manufacturers of America [PhRMA] and the European Federation of Pharmaceutical Industries and Associations [EFPIA]) endorsed a commitment ‘to enhancing public health through responsible sharing of clinical trial data’ in a manner that is consistent with 3 main principles: safeguarding the privacy of patients, respecting the integrity of national regulatory systems, and maintaining incentives for investment in biomedical research. Despite this commitment from 2013, an audit found that data were available for only 9/61 (15%) clinical trials on medicines sponsored by the pharmaceutical industry and first published between 1 July 2015 and 31 December 2015 in the top 10 journals of general and internal medicine. If such low rates of data-sharing were also to be observed for main trials, it would invalidate any efforts towards reproducibility for these important studies.

However, the environment for data-sharing is changing fast. Data-sharing platforms like Vivli, the YODA Project or Clinical Study Data Request are more and more widely used. By the fall of 2019 these platforms had gathered a large number of trials sponsored by the pharmaceutical industry: the three together reached about 8000 RCTs in November 2019. Despite this available data, re-analyses are still sparse. Among the 88 published outputs we identified resulting from data-sharing on these platforms, only 3 were re-analyses: “Restoring Study 329” by Le Noury et al., which contradicted the initial publication of a trial that was already known to be misreported; a reanalysis of the TORCH trial, suggesting an overestimation of the treatment effect in the original study; and the reanalysis of the “SMART-AF” trial, which came to similar conclusions to the original study.

As part of a global research program on reproducibility in therapeutic research (ReiTheR, funded by the French National Research Agency), we designed the present cross-sectional study to assess inferential reproducibility (i.e. when IPD is available, whether qualitatively similar conclusions can be drawn from a reanalysis of the original trials) for main studies assessed by the EMA.

Up to now, there has been no clear picture of the inferential reproducibility of the main studies in European Public Assessment Reports. These main studies are used to assess the efficacy of new medical products which will influence the lives of millions of people in the European Union. If data availability were to prove low, this should urge the EMA to implement an even stronger data-sharing policy. If re-analyses of available data show low reproducibility, it would argue for independent re-analyses at the time of the approval. On the other hand, if there were to be no issues in terms of reproducibility, it would reinforce the confidence one can have in EMA’s transparency concerning processes and decisions.

In 2019, the EU Clinical Trial Regulation 536/2014 will come into force and will further extend the boundaries of data-sharing and transparency in the EMA as well as within the European Union. The EU Portal and Database is being developed, creating a single entry point for submitting clinical trials in the EU. A further advantage is that the portal is intended to hold not only information about clinical trials included in Marketing Authorization Applications (MAA), but data about every single trial conducted in the European Union, whether or not it is part of an MAA. Our research could help to highlight the value of the future regulation.

To further enhance the impact of this project, the reproducibility of our results will be checked by comparison with another reproducibility project in the ReiTheR project.

Statistical Analysis Plan:

Eligible main trials:
Two reviewers (MS, JG) will manually extract all names of the new medicines, biosimilars and orphan medicines approved by the CHMP and enter the information on an Excel Sheet. Afterwards, a check will be performed to verify that the CHMP opinion was adopted by the European Commission. Next, the reviewers will identify the corresponding eligible EPARs on the EMA website and will extract all main studies reported in these EPARs. In case of disagreement, a third reviewer (CL or FN) will arbitrate.

Sample size calculation:
A random sample of 62 of these main studies will be selected using R (rnorm function). This sample size will ensure a precision of ± 12% to estimate our primary outcome (i.e. percentage of reproducible studies, see below for a definition) in the worst-case scenario for precision estimations (i.e. if the percentage is 50%).
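The precision figure above can be checked with a short calculation. The sketch below is in Python for illustration only (the study itself uses R). It assumes the precision refers to the half-width of a normal-approximation (Wald) 95% confidence interval, which the text does not state explicitly. The `draw_sample` helper is hypothetical: it shows one way a normal random generator such as `rnorm` could be used to pick studies (attach a random draw to each study and keep those with the smallest values); the actual procedure is not described in the text.

```python
import math
import random


def wald_half_width(n, p=0.5, z=1.96):
    """Half-width of the normal-approximation 95% CI for a proportion p
    estimated from n studies (assumed formula, not stated in the plan)."""
    return z * math.sqrt(p * (1 - p) / n)


def draw_sample(study_ids, k, seed):
    """Hypothetical sampling helper: give each study a standard normal
    random key, sort by key, and keep the k studies with the smallest keys."""
    rng = random.Random(seed)
    keyed = [(rng.gauss(0.0, 1.0), study_id) for study_id in study_ids]
    keyed.sort()
    return [study_id for _, study_id in keyed[:k]]


# Worst case for precision is p = 0.5; with n = 62 the half-width is about 0.124.
print(round(wald_half_width(62), 3))
```

With n = 62 and p = 0.5 the half-width is roughly 0.124, consistent with the stated precision of ± 12%. Fixing the seed makes the selection reproducible, which matters if the sample has to be audited later.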

Main study document accessibility:
For all selected main studies, one reviewer (JG) will search for the EudraCT number and/or the Sponsor Protocol Number, and/or any other identification information in each EPAR, and will identify the official Sponsor of the study. If this information is lacking, the same reviewer will start a wildcard search using keywords (disease, drug) from the study in the European Union (EU) Clinical Trial Register. If this is not successful, the reviewer will search the World Health Organization International Clinical Trials Registry Platform (ICTRP) and the International Standard Randomised Controlled Trial Number (ISRCTN) registry administered by BioMed Central. If information on sponsor and study number is still lacking, the reviewer will contact the EMA.

Once the sponsor and the study number are identified, the reviewer will contact the sponsor to collect all of the following main study documents: 1) IPD, 2) data analysis plan, 3) unpublished and/or published study protocols with any date-stamped amendments, 4) all the following dates: date of the last visit of the last patient, date of database lock (if available) and date of study unblinding, and 5) unpublished and/or published (scientific article) study reports.

To this end the reviewer will send a standardised email presenting the research project, with a link to the pre-registered protocol on the Open Science Framework. In order to improve the return rate, up to 4 emails will be sent: the original and 3 reminder emails (with a two-week interval between emails).
If sponsors ask which format we prefer, we will indicate that raw data are welcome in the form of the Study Data Tabulation Model (SDTM), which was created by the Clinical Data Interchange Standards Consortium (CDISC).

In some cases, it will be sufficient to contact the sponsor by email; in other cases the sponsor will ask us to retrieve the data on a web portal, in which case we will have to use that platform.
In parallel the same reviewer will search for these documents on the EMA portal (26) and by inspecting the published reports (if available) identified using OpenTrials. This process is summarised in Figure 1.

Data Extraction:
The identification of main studies and the following trial characteristics will be extracted on an Excel spreadsheet by two independent researchers (JG and FN or CL).
These characteristics include patient characteristics (e.g. percentage of women, mean age of participants, pediatric indication), study design (e.g. endpoint type, description for each primary endpoint) and intervention characteristics (e.g. drug). All are described in the supplementary material.

In case of disagreement, a third independent reviewer (FN or CL) will arbitrate. An exhaustive list of the trial characteristics extracted can be found in the additional file 3. The data extraction sheet will be pilot-tested on 10 studies before being validated.

Concerning the re-analysis, a first reviewer (JG, PhD Student) will collect the information and collate data for the reanalysis. More specifically, the reviewer will prepare a dossier with the following information for each study: 1/ the protocol, 2/ all amendments to the protocol (with their dates), 3/ all the following dates: date of the last visit of the last patient, date of database lock (if available) and date of study unblinding, and 4/ the IPD. In case of information still lacking, the study authors will be contacted.

Strategy for re-analyses:
Should the IPD not be available one year after our initial request, the study will be considered as non-reproducible (primary outcome of our study).

On the basis of the dossier prepared by the first researcher, re-analyses of the primary outcome(s) of each study will be performed by a second researcher (MS, PhD student) who will have no access to study reports, journal publications, statistical analysis plan, or analytical code, in order to ensure that the analysis is as blind as possible to the primary analysis. In addition, this reviewer will be instructed not to try to find these documents or the published report.

For single-blind studies or open-label studies, analyses will be performed according to the first version of the protocol, because outcome switching is possible in these studies. For double-blind studies, all re-analyses will be based on the latest version of the protocol issued before database lock and unblinding. If this information is not available, the date of the last visit of the last patient will be used as a proxy. In case of missing information for these dates, the study authors will be contacted.
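The protocol-version rule above can be written as a small decision function. This is a minimal Python sketch for illustration, not part of the study's tooling; the names `ProtocolVersion` and `select_protocol_version` are hypothetical. It assumes that "before database lock and unblinding" means before the earlier of the two dates that are known, and it returns `None` when no usable date exists, which corresponds to the case where the study authors have to be contacted.

```python
from datetime import date
from typing import NamedTuple, Optional, Sequence


class ProtocolVersion(NamedTuple):
    label: str
    issued: date


def select_protocol_version(
    versions: Sequence[ProtocolVersion],
    double_blind: bool,
    database_lock: Optional[date] = None,
    unblinding: Optional[date] = None,
    last_patient_last_visit: Optional[date] = None,
) -> Optional[ProtocolVersion]:
    """Pick the protocol version on which a re-analysis would be based."""
    ordered = sorted(versions, key=lambda v: v.issued)
    if not ordered:
        return None
    # Single-blind and open-label studies: always the first version,
    # because outcome switching is possible in these designs.
    if not double_blind:
        return ordered[0]
    # Double-blind studies: cutoff is the earliest known date among
    # database lock and unblinding (assumption); otherwise the last
    # visit of the last patient is used as a proxy.
    known = [d for d in (database_lock, unblinding) if d is not None]
    cutoff = min(known) if known else last_patient_last_visit
    if cutoff is None:
        return None  # dates missing: the study authors must be contacted
    eligible = [v for v in ordered if v.issued < cutoff]
    return eligible[-1] if eligible else None
```

For example, with versions issued in 2015, 2016 and 2017 and a database lock in early 2017, a double-blind study would be re-analysed against the 2016 version, while an open-label study would always use the 2015 version.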

Although in therapeutic research statistical analysis is fairly simple, in some cases the re-analyses can involve difficult methodological choices. An independent senior statistician (AR) will be available to discuss any difficult aspect or choice in the analysis plan before the re-analysis, so as to choose the most consensual analyses (e.g. Intention to Treat population for a superiority trial).

Should insufficient information concerning the main analysis be provided in the protocol, the best practices for clinical research will be used, following the guidelines of the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH).

An analysis plan will be developed for each study included and will be recorded on the Open Science Framework.

In the supplementary material a table is provided with details of what will be taken from the ICH guidelines in case of missing information. Re-analyses will entail the following different steps: 1/ identification of the primary outcome (and detection of outcome switching), 2/ definition of the study population, 3/ re-analysis of the primary outcome. Any change identified between the first version of the protocol and the version used for the re-analysis of the primary outcome will be tracked and described.

Data Analysis:
We will perform a descriptive analysis of the characteristics of the extracted main studies included in the EPARs selected. This will include counts, percentages and their associated 95% confidence intervals (CIs).

Effect estimates in the different studies will be expressed as standardised mean differences (SMDs) and their associated 95% confidence intervals. For binary outcomes, odds ratios and their 95% CIs will be calculated and converted into the standardised mean difference.
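The text does not say which formula will be used to convert odds ratios into standardised mean differences. One widely used option is the logit method, in which the log odds ratio is multiplied by √3/π. The Python sketch below illustrates that method under this assumption; the function names are hypothetical and the choice of formula is ours, not the authors'.

```python
import math


def odds_ratio_to_smd(odds_ratio):
    """Convert an odds ratio to a standardised mean difference using the
    logit method: SMD = ln(OR) * sqrt(3) / pi (assumed conversion)."""
    return math.log(odds_ratio) * math.sqrt(3) / math.pi


def odds_ratio_ci_to_smd(or_point, or_low, or_high):
    """Apply the same conversion to a point estimate and its 95% CI limits.
    The transformation is monotonic, so the limits keep their order."""
    return tuple(odds_ratio_to_smd(x) for x in (or_point, or_low, or_high))
```

An odds ratio of 1 maps to an SMD of 0, and odds ratios below 1 map to negative SMDs, so the direction of the effect is preserved by the conversion.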

In order to compare the results of our re-analyses with the original results, the following steps will be implemented: 1/ We will compare statistical significance on the basis of the p-value. If the original analysis and the re-analysis differ in statistical significance, the result will be considered as not reproducible. If they do not differ, 2/ we will qualitatively compare effect sizes and their respective 95% CIs. In case of a difference of 0.10 points or more in point estimates (expressed as standardised mean differences), the difference will be discussed with a clinician in order to assess its clinical significance.
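The two-step comparison can be sketched as a small function. This is an illustrative Python sketch only: the significance threshold of 0.05 is an assumption (each trial may define its own), the 0.10 tolerance is read as "a difference of at least 0.10 in SMD", and the returned labels are hypothetical names for the three possible outcomes.

```python
def compare_with_original(original_p, reanalysis_p,
                          original_smd, reanalysis_smd,
                          alpha=0.05, smd_tolerance=0.10):
    """Classify a re-analysis against the original result."""
    # Step 1: do both analyses fall on the same side of the
    # significance threshold (alpha is an assumed default)?
    if (original_p < alpha) != (reanalysis_p < alpha):
        return "not reproducible"
    # Step 2: same significance; flag sizeable shifts in the point
    # estimate so that a clinician can judge their clinical relevance.
    if abs(original_smd - reanalysis_smd) >= smd_tolerance:
        return "discuss with clinician"
    return "consistent"
```

The second step in the text is qualitative and also considers the 95% CIs, so a function like this could only pre-screen results; it does not replace the clinician's judgement.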

All analyses will be performed using the open source statistical software R (R Development Core Team). The code will be made public on the Open Science Framework, as well as a file summarizing the process to retrieve all data-sets.

Requested Studies:

A Single-Arm, Open-Label Study to Evaluate the Efficacy and Safety of ABT-493/ABT-530 in Adults With Chronic Hepatitis C Virus Genotype 4, 5, or 6 Infection (ENDURANCE-4)
Data Contributor: AbbVie
Study ID: NCT02636595
Sponsor ID: M13-583

Public Disclosure:

Siebert, M., Gaba, J., Renault, A. et al. Data-sharing and re-analysis for main studies assessed by the European Medicines Agency—a cross-sectional study on European Public Assessment Reports. BMC Med 20, 177 (2022). doi:10.1186/s12916-022-02377-2