Predicting Individuals’ Probability of Response to Smoking Cessation Pharmacotherapy

Lead Investigator: Rachel Tomko, Medical University of South Carolina
Title of Proposal Research: Predicting Individuals’ Probability of Response to Smoking Cessation Pharmacotherapy
Vivli Data Request: 4915
Funding Source: None.
Potential Conflicts of Interest: Dr. Tomko is funded through National Institutes of Health grants (as listed in her attached biosketch). She has no other potential conflicts of interest.

Summary of the Proposed Research:

Approximately 16% of adults in the United States are current tobacco smokers, and the health effects are extremely costly. Though several Food and Drug Administration (FDA)-approved smoking cessation pharmacotherapies exist [e.g., varenicline, bupropion, nicotine replacement therapy (NRT)], utilization rates remain low and a substantial portion of smokers do not respond to existing treatments. Fewer than half of smokers who make a quit attempt use an evidence-based medication. The most effective single medication to date, varenicline, results in abstinence at treatment completion for approximately 35-50% of smokers, but only 14% of smokers who make a quit attempt use a prescription medication like varenicline.

The gold standard method for evaluating the efficacy of smoking cessation medications, the randomized controlled trial (RCT), provides information about the effect of a medication at the population level. However, traditional RCTs cannot provide an individual smoker with an estimate of their probable treatment response. Our goal is to develop and test an algorithm, based on demographic and clinical data assessed prior to treatment, to estimate individual smokers’ likely response to FDA-approved pharmacotherapies for smoking cessation, including varenicline, bupropion, and NRT. This will aid in treatment recommendations for smokers and potentially increase medication-aided smoking cessation attempts.

Statistical Analysis Plan:

We propose to use the data collected in the EAGLES trial to build a statistical model that estimates individual-specific probabilities of treatment response for each medication. We are requesting access to baseline and demographic information, treatment assignment, adverse events, and participant response for all participants in the EAGLES trial. The aim of this study is to predict the probability of patient response to different smoking cessation treatments so that physicians can present a personalized recommendation for smoking cessation pharmacotherapy. We define the recommended treatment option as the treatment for which the model predicts the highest probability of success among the available treatments.
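The recommendation rule defined above amounts to taking the treatment with the largest predicted probability. A minimal Python sketch of that rule follows. The function name, the treatment labels, and the probabilities are hypothetical placeholders for illustration only; they are not estimates from the EAGLES data, and the proposal itself names R rather than Python.

```python
def recommend_treatment(predicted_success):
    """Return the treatment with the highest predicted probability of success.

    predicted_success maps a treatment label to a model-predicted probability.
    Ties are resolved in favor of the first treatment listed (an assumption;
    the proposal does not say how ties would be handled).
    """
    if not predicted_success:
        raise ValueError("at least one treatment is required")
    return max(predicted_success, key=predicted_success.get)


# Placeholder probabilities for one hypothetical participant.
example = {"treatment_A": 0.25, "treatment_B": 0.5, "treatment_C": 0.125}
print(recommend_treatment(example))
```

Returning only the label keeps the rule simple; in practice the full set of predicted probabilities would presumably be shown alongside the recommendation.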

Model Development: Prior to model development, we will impute any missing values for predictor variables using the rfImpute function in the randomForest package. This approach initializes all missing values to the median of each variable and then iterates between building a random forest model on the completed data and updating all missing observations using a weighted average of the non-missing observations, with weights taken from the proximity matrix estimated by the random forest model fit at each iteration.
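The two ingredients of this imputation scheme, median initialization and the proximity-weighted update, can be illustrated with a short Python sketch. This is not the rfImpute implementation: it handles a single continuous variable, it takes the proximity matrix as given instead of fitting a random forest, and the function names and the fallback for zero total weight are assumptions made for the example.

```python
from statistics import median


def median_initialize(column):
    """Fill missing entries (None) with the median of the observed values."""
    observed = [v for v in column if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in column]


def proximity_update(column, proximity):
    """Replace each missing entry with a proximity-weighted average.

    column holds the original values, with None marking missing entries.
    proximity[i][j] is the proximity between observations i and j, which in
    the real procedure would come from the random forest fit at the current
    iteration. Only originally observed values contribute to the average.
    """
    observed = [j for j, v in enumerate(column) if v is not None]
    updated = list(column)
    for i, value in enumerate(column):
        if value is not None:
            continue
        weights = [proximity[i][j] for j in observed]
        total = sum(weights)
        if total > 0:
            updated[i] = sum(w * column[j] for w, j in zip(weights, observed)) / total
        else:
            # Assumed fallback: no proximity to any observed case, use the median.
            updated[i] = median(column[j] for j in observed)
    return updated


values = [1.0, None, 3.0]
toy_proximity = [[1.0, 0.75, 0.25],
                 [0.75, 1.0, 0.25],
                 [0.25, 0.25, 1.0]]
print(median_initialize(values))
print(proximity_update(values, toy_proximity))
```

In the full procedure these two steps would alternate with refitting the forest, so the proximity matrix changes from one iteration to the next.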

The overarching aim is to develop a multivariable prediction model of treatment response from baseline participant characteristics, the probability of side effects for each treatment, and treatment type, using a 2-step modeling approach. In the first step, we will build a model to estimate the probability of an adverse event for each treatment based on patient baseline characteristics. In the second step, we will develop a model to predict the probability of successful patient response to each treatment using baseline characteristics, treatment received, and the probability of an adverse event for each active treatment estimated from the model in Step 1. Given that no single statistical modeling approach consistently provides the best prediction performance across varied data sets, we will develop and evaluate prediction models using multiple statistical and machine learning approaches. Multivariable classification models considered in this study will include logistic regression (for comparison), Classification and Regression Trees (CART), random forest (RF), support vector machines with linear, polynomial, and Gaussian kernels (SVML, SVMP, and SVMR, respectively), naive Bayes (NB), and artificial neural networks (ANN). All the proposed machine learning approaches require tuning prior to model fitting. Tuning parameters for the different models will be selected by cross-validation with the train function in the ‘caret’ package in R. Both variable selection and model performance will be assessed using 10-fold cross-validation (CV). For all models, variable selection will be conducted using recursive backward selection to identify the set of variables yielding the best prediction performance, as measured by the 10-fold CV area under the receiver operating characteristic curve (AUC).
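The way the two steps feed into each other can be sketched in Python as follows. This is only a structural illustration: ae_model and response_model are hypothetical callables standing in for whichever fitted models are eventually chosen, and the treatment labels and constants in the usage example are placeholders with no connection to trial results.

```python
def predict_success_by_treatment(baseline, treatments, active_treatments,
                                 ae_model, response_model):
    """Chain the two modeling steps for one participant.

    ae_model(baseline, treatment) -> probability of an adverse event (Step 1).
    response_model(baseline, treatment, ae_probs) -> probability of successful
    response (Step 2), where ae_probs holds the Step 1 estimates for every
    active treatment.
    """
    # Step 1: adverse event probability for each active treatment.
    ae_probs = {t: ae_model(baseline, t) for t in active_treatments}
    # Step 2: predicted success for each candidate treatment, with the
    # Step 1 estimates passed in as additional predictors.
    return {t: response_model(baseline, t, ae_probs) for t in treatments}


# Stand-in models with arbitrary constants, for illustration only.
def toy_ae_model(baseline, treatment):
    return 0.25


def toy_response_model(baseline, treatment, ae_probs):
    return 0.5 - 0.5 * ae_probs.get(treatment, 0.0)


participant = {"age": 40}  # placeholder baseline record
print(predict_success_by_treatment(participant,
                                   ["placebo", "drug_x"], ["drug_x"],
                                   toy_ae_model, toy_response_model))
```

Passing the models in as callables keeps the sketch independent of any particular model family listed above.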
For the multivariable models in Steps 1 and 2, we will consider (1) models with all demographic and baseline patient characteristics and (2) models with a reduced set of variables selected using a variable selection approach.
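To make the evaluation criterion and the reduced-variable search concrete, the following Python sketch computes a k-fold cross-validated AUC and runs a simple greedy backward elimination. It rests on several assumptions: the AUC is computed once on pooled out-of-fold predictions, whereas resampling tools such as caret typically average per-fold values; the elimination tries dropping each variable in turn, which is a simplified stand-in for recursive backward selection and not the caret procedure; and all function names, including the fit and predict callables, are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def kfold_indices(n, k=10, seed=0):
    """Shuffle 0..n-1 and deal the indices into k folds of near-equal size."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]


def cross_validated_auc(rows, labels, fit, predict, k=10, seed=0):
    """Fit on k-1 folds, score the held-out fold, then pool the scores into one AUC."""
    out_of_fold = [None] * len(rows)
    for fold in kfold_indices(len(rows), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(rows)) if i not in held_out]
        model = fit([rows[i] for i in train], [labels[i] for i in train])
        for i in fold:
            out_of_fold[i] = predict(model, rows[i])
    return auc(labels, out_of_fold)


def backward_select(features, score):
    """Greedy backward elimination over a list of feature names.

    score(subset) returns a performance value such as a cross-validated AUC.
    At each step the feature whose removal gives the highest score is dropped
    (ties broken by feature name), and the best subset seen is returned.
    """
    current = list(features)
    best_subset, best_score = list(current), score(current)
    while len(current) > 1:
        trial_score, dropped = max(
            (score([f for f in current if f != d]), d) for d in current
        )
        current.remove(dropped)
        if trial_score > best_score:
            best_subset, best_score = list(current), trial_score
    return best_subset, best_score


print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

In a fuller version, score would wrap cross_validated_auc for a model refit on each candidate subset, ideally with the selection itself nested inside the resampling so that the reported AUC is not optimistically biased.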

Requested Studies:

Study Evaluating The Safety And Efficacy Of Varenicline and Bupropion For Smoking Cessation In Subjects With And Without A History Of Psychiatric Disorders (EAGLES)
Sponsor: Pfizer
Study ID: NCT01456936

Public Disclosures:

Wolf, B.J., Gray, K.M., Dahne, J.R., Hashemi, D. and Tomko, R.L., 2024. Can We Predict Who Will Experience Adverse Events While Using Smoking Cessation Pharmacotherapy? A Secondary Analysis of the EAGLES Clinical Trial. Nicotine and Tobacco Research, p.ntae290. doi: 10.1093/ntr/ntae290