
Moving toward precision PTSD treatment: predicting veterans' intensive PTSD treatment response using continuously updating machine learning models

Published online by Cambridge University Press:  19 October 2022

Dale L. Smith*
Affiliation:
Department of Psychiatry and Behavioral Sciences, Rush University Medical Center, 325 S. Paulina St., Suite 200, Chicago, IL 60612, USA Behavioral Sciences, Olivet Nazarene University, 1 University Ave., Bourbonnais, Illinois 60914, USA
Philip Held
Affiliation:
Department of Psychiatry and Behavioral Sciences, Rush University Medical Center, 325 S. Paulina St., Suite 200, Chicago, IL 60612, USA
*
Author for correspondence: Dale L. Smith, E-mail: Dale_Smith@rush.edu

Abstract

Background

Considerable heterogeneity exists in treatment response to first-line posttraumatic stress disorder (PTSD) treatments, such as Cognitive Processing Therapy (CPT). Relatively little is known about the timing of when during a course of care the treatment response becomes apparent. Novel machine learning methods, especially continuously updating prediction models, have the potential to address these gaps in our understanding of response and optimize PTSD treatment.

Methods

Using data from a 3-week (n = 362) CPT-based intensive PTSD treatment program (ITP), we explored three methods for generating continuously updating prediction models to predict endpoint PTSD severity. These included Mixed Effects Bayesian Additive Regression Trees (MixedBART), Mixed Effects Random Forest (MERF) machine learning models, and Linear Mixed Effects models (LMM). Models used baseline and self-reported PTSD symptom severity data collected every other day during treatment. We then validated our findings by examining model performances in a separate, equally established, 2-week CPT-based ITP (n = 108).

Results

Results across approaches were very similar and indicated modest prediction accuracy at baseline (R2 ~ 0.18), with increasing accuracy of predictions of final PTSD severity across program timepoints (e.g. mid-program R2 ~ 0.62). Similar findings were obtained when the models were applied to the 2-week ITP. Neither the MERF nor the MixedBART machine learning approach outperformed LMM prediction, though benefits of each may differ based on the application.

Conclusions

Utilizing continuously updating models in PTSD treatments may be beneficial for clinicians in determining whether an individual is responding, and when this determination can be made.

Type
Original Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (http://creativecommons.org/licenses/by-nc-nd/4.0), which permits non-commercial re-use, distribution, and reproduction in any medium, provided that no alterations are made and the original article is properly cited. The written permission of Cambridge University Press must be obtained prior to any commercial use and/or adaptation of the article.
Copyright
Copyright © The Author(s), 2022. Published by Cambridge University Press

Introduction

Mounting evidence supports the efficacy of Cognitive Processing Therapy (CPT; Resick, Monson, & Chard, Reference Resick, Monson and Chard2017a), which is considered a first line intervention for treating posttraumatic stress disorder (PTSD; APA, 2017; ISTSS, 2017; VA/DoD, 2017). Support comes from randomized controlled trials (Monson et al., Reference Monson, Schnurr, Resick, Friedman, Young-Xu and Stevens2006; Resick, Nishith, Weaver, Astin, & Feuer, Reference Resick, Nishith, Weaver, Astin and Feuer2002; Resick et al., Reference Resick, Uhlmansiek, Clum, Galovski, Scher and Young-Xu2008, Reference Resick, Wachen, Mintz, Young-McCaughan, Roache, Borah and Peterson2015, Reference Resick, Wachen, Dondanville, Pruiksma, Yarvis, Peterson and Young-McCaughan2017b) as well as clinical research (Asmundson et al., Reference Asmundson, Thorisdottir, Roden-Foreman, Baird, Witcraft, Stein and Powers2019; Held, Smith, Pridgen, Coleman, & Klassen, Reference Held, Smith, Pridgen, Coleman and Klassen2022c; Lloyd et al., Reference Lloyd, Couineau, Hawkins, Kartal, Nixon, Perry and Forbes2015). CPT has been successfully delivered in different formats such as the traditional 12 sessions delivered on a weekly basis (Monson et al., Reference Monson, Schnurr, Resick, Friedman, Young-Xu and Stevens2006; Resick et al., Reference Resick, Nishith, Weaver, Astin and Feuer2002, Reference Resick, Uhlmansiek, Clum, Galovski, Scher and Young-Xu2008, Reference Resick, Wachen, Mintz, Young-McCaughan, Roache, Borah and Peterson2015, Reference Resick, Wachen, Dondanville, Pruiksma, Yarvis, Peterson and Young-McCaughan2017b) and massed/intensive treatments which deliver a full course of treatment in as little as one to three weeks (Galovski et al., Reference Galovski, Werner, Weaver, Morris, Dondanville, Nanney and Iverson2021; Held et al., Reference Held, Kovacevic, Petrey, Meade, Pridgen, Montes and Karnik2022a, Reference Held, Smith, Pridgen, Coleman and Klassen2022c). 
Effect sizes for PTSD severity reduction in CPT are generally large and meaningful when delivered weekly or in massed format (e.g. d > 1.0; Asmundson et al. Reference Asmundson, Thorisdottir, Roden-Foreman, Baird, Witcraft, Stein and Powers2019; Held, Bagley, Klassen, & Pollack, Reference Held, Bagley, Klassen and Pollack2019) and have been demonstrated to persist for up to ten years following treatment completion (Held et al., Reference Held, Zalta, Smith, Bagley, Steigerwald, Boley and Pollack2020b; Resick, Williams, Suvak, Monson, & Gradus, Reference Resick, Williams, Suvak, Monson and Gradus2012). However, not all participants benefit to the same extent (Dewar, Paradis, & Fortin, Reference Dewar, Paradis and Fortin2020). Recent research on massed CPT delivered as part of an intensive PTSD treatment program (ITP) identified four separate PTSD response trajectories (Held et al., Reference Held, Smith, Bagley, Kovacevic, Steigerwald, Van Horn and Karnik2021). In line with other research examining response trajectories in weekly CPT (Galovski et al., Reference Galovski, Harik, Blain, Farmer, Turner and Houle2016; Schumm, Walter, & Chard, Reference Schumm, Walter and Chard2013), approximately 15% reached treatment goals within a small number of sessions and 14% did not respond to treatment in any meaningful way (Held et al., Reference Held, Smith, Bagley, Kovacevic, Steigerwald, Van Horn and Karnik2021). Given this variability in treatment response across treatment programs for psychiatric conditions, the development of prediction models for determining who is, or is likely to be, benefitting from treatment is paramount.

The emerging emphasis on machine learning in developing prediction models in psychological medicine, as well as the increase in the types and amount of data collected in the field, has led to increased use of these methods for various applications, including tracking treatment response (Shatte et al., Reference Shatte, Hutchinson and Teague2019). Such approaches often differ from traditional statistical approaches in their emphasis on prediction accuracy rather than probabilistic emphasis on specific predictors and aspects of their relationships with outcomes (e.g. slopes or odds ratios). Machine learning models are able to accommodate a larger number of variables as predictors than generally found in traditional statistical approaches. Although some baseline predictors, such as baseline PTSD severity or negative posttraumatic cognitions, have been shown to be useful in predicting such non-responders, the amount of variability in post-treatment PTSD and depression severity that can be accounted for solely via baseline assessment is usually limited (Held et al., Reference Held, Smith, Bagley, Kovacevic, Steigerwald, Van Horn and Karnik2021, Reference Held, Schubert, Pridgen, Kovacevic, Montes, Christ and Smith2022b; Hilbert et al., Reference Hilbert, Kunas, Lueken, Kathmann, Fydrich and Fehm2020; Nixon et al., Reference Nixon, King, Smith, Gradus, Resick and Galovski2021).

Primarily focusing on baseline predictors may be important for initial determination of the appropriateness of a treatment program for an individual (Held et al., Reference Held, Smith, Bagley, Kovacevic, Steigerwald, Van Horn and Karnik2021; Hilbert et al., Reference Hilbert, Kunas, Lueken, Kathmann, Fydrich and Fehm2020; Nixon et al., Reference Nixon, King, Smith, Gradus, Resick and Galovski2021), however such models also involve considerable uncertainty given the dynamic nature of treatment response over time. The recent emphasis on implementation of precision medicine approaches (Aafjes-van Doorn, Kamsteeg, Bate, & Aafjes, Reference Aafjes-van Doorn, Kamsteeg, Bate and Aafjes2021; Chekroud et al., Reference Chekroud, Bondar, Delgadillo, Doherty, Wasil, Fokkema and Choi2021; Delgadillo, Reference Delgadillo2021; Hilbert et al., Reference Hilbert, Kunas, Lueken, Kathmann, Fydrich and Fehm2020) necessitates identification of participants who may or may not be responding to treatment as early as possible. Recently developed machine learning approaches that account for the longitudinal structure of repeated assessments hold promise for improved accuracy in predicting participants' treatment response by continuously updating models with newly acquired information about a patient's treatment response (e.g. repeatedly measured symptom severity scores). The ability to assess individual progress during treatment and update predictions of patient's response is likely a necessary precursor to treatment adjustments in any precision medicine approach.

Although others have attempted clinical prediction models of PTSD outcomes during the course of treatment (Held et al., Reference Held, Schubert, Pridgen, Kovacevic, Montes, Christ and Smith2022b; Nixon et al., Reference Nixon, King, Smith, Gradus, Resick and Galovski2021), these studies have not utilized approaches designed to accommodate the correlated structure inherent to longitudinal data, in which observations are nested within individuals, or have predicted variants of categorized non-response rather than overall PTSD severity. Given the lack of generally agreed-upon standards for what may constitute non-response to PTSD treatment (Varker et al., Reference Varker, Kartal, Watson, Freijah, O'Donnell, Forbes and Hinton2020), and in the interest of modeling the full spectrum of variability in treatment response, predicting continuous PTSD severity may be a preferred solution.

The current study aimed to examine the ability of machine learning and statistical prediction models to utilize both baseline data and updated PTSD symptom severity information throughout the program to generate increasingly accurate and informative predictions of post-treatment PTSD severity for participants in a 3-week CPT-based ITP. Three approaches were used to generate these updating predictions: Mixed Effects Random Forest (MERF; Hajjem, Bellavance, & Larocque, Reference Hajjem, Bellavance and Larocque2011, Reference Hajjem, Bellavance and Larocque2014) and Mixed Effects Bayesian Additive Regression Trees (MixedBART; Spanbauer & Sparapani, Reference Spanbauer and Sparapani2021), which both appropriately model random effects, and gold-standard statistical linear mixed-effects longitudinal models (LMMs). As shown previously (Held et al., Reference Held, Schubert, Pridgen, Kovacevic, Montes, Christ and Smith2022b), we expected that models would provide acceptable performance with baseline predictors, but that accuracy would improve throughout the program with the incorporation of updated PTSD severity information as treatment progressed and change trajectories became more apparent. Testing continuously improving models could provide foundational information for implementing a precision medicine-based approach in PTSD treatment. We were generally agnostic regarding the ability of machine learning to outperform mixed-effects regression predictions, given prior research demonstrating that machine learning approaches may not necessarily outperform standard statistical approaches in making clinical predictions (Cho et al., Reference Cho, Austin, Ross, Abdel-Qadir, Chicco, Tomlinson and Lee2021; Christodoulou et al., Reference Christodoulou, Ma, Collins, Steyerberg, Verbakel and Van Calster2019; Li et al., Reference Li, Zhou, Dong, Fu, Li, Luan and Peng2021).

Methods

Participants

Data utilized in this study were from 361 veterans with PTSD who completed a 3-week CPT-based ITP at Rush University Medical Center's Road Home Program: Center for Veterans and Their Families. Participants were included if they had complete data.Footnote 1 On average, veterans in the sample were 41.46 years old (s.d. = 9.43). The majority identified as male (63.71%) and White (67.87%). Additional sample characteristics can be found in Table 1.

Table 1. Demographic characteristics

a χ2 or t test comparisons indicated that significant differences exist between the two programs in sex, race, service era, MST status, and PCL-5 at baseline (ps < 0.05).

b PCL-5 = PTSD Checklist for DSM-5.

Program description

During the 3-week ITP, veterans received 14 individual CPT sessions, 13 group CPT sessions, 13 group mindfulness sessions, and 12 group yoga sessions in addition to psychoeducation classes on various topics, such as sleep hygiene. A more detailed description of the ITP and its outcomes can be found elsewhere (Held et al., Reference Held, Klassen, Boley, Wiltsey Stirman, Smith, Brennan and Zalta2020a; Zalta et al., Reference Zalta, Held, Smith, Klassen, Lofgreen, Normand and Karnik2018). Veterans were eligible for the ITP if they met the diagnostic criteria for PTSD, which was verified using the Clinician-Administered PTSD Scale for DSM-5 (CAPS-5; Blevins, Weathers, Davis, Witte, & Domino, Reference Blevins, Weathers, Davis, Witte and Domino2015; Bovin et al., Reference Bovin, Marx, Weathers, Gallagher, Rodriguez, Schnurr and Keane2016; Weathers et al., Reference Weathers, Litz, Keane, Palmieri, Marx and Schnurr2013). Exclusionary criteria were unstable housing, inability to independently complete activities of daily living, a suicide attempt in the previous 30 days, untreated psychosis or mania, or severe alcohol or drug use that would require continuous medical observation. The study procedures were approved by the Institutional Review Board at Rush University Medical Center with a waiver of consent, as all assessments were collected as a part of routine care.

Measures

Veterans were asked to provide demographic information and complete several self-report measures before and during the ITP. A complete list of all features that were used in the different analytic models as well as when they were assessed in ITP can be found in Table 2.

Table 2. List of features used in machine learning models

Clinician-administered PTSD scale for DSM-5 (CAPS-5)

The CAPS-5 is a structured diagnostic PTSD assessment based on the DSM-5 criteria, administered at baseline (Weathers et al., Reference Weathers, Bovin, Lee, Sloan, Schnurr, Kaloupek and Marx2018). It assesses the severity of PTSD symptoms across the four different clusters from 0 (absent) to 4 (extreme): intrusions, avoidance, alterations in cognition and mood, and hyperarousal. PTSD symptom severity was based on the past month. Cronbach's alpha within the current sample was 0.780.

PTSD checklist for DSM-5 (PCL-5)

The PCL-5 is a self-report measure that assesses PTSD severity (Weathers et al., Reference Weathers, Litz, Keane, Palmieri, Marx and Schnurr2013). Individuals were asked to rate how much they were bothered by each of the 20 PTSD symptoms from 0 (not at all) to 4 (extremely). PTSD symptom severity was rated based on the past month during the intake and the past week at every other timepoint after that. In the 3-week program, the PCL-5 was assessed at baseline and on days 2, 3, 5, 6, 8, 10, 11, 13, and post-treatment. A total score of 33 is considered the threshold for 'probable PTSD.' Cronbach's alphas ranged from 0.897 to 0.962 across study timepoints.
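The scoring described above (sum of 20 items rated 0–4, with 33 as the 'probable PTSD' threshold) can be sketched in a few lines. This is an illustrative sketch, not the authors' code; the function names and the validation behavior are assumptions.

```python
# Illustrative PCL-5 scoring sketch (hypothetical helper names);
# the 20-item range and the cutoff of 33 come from the text above.

PROBABLE_PTSD_CUTOFF = 33  # total-score threshold for 'probable PTSD'

def pcl5_total(item_scores):
    """Sum 20 item ratings, each on a 0 (not at all) to 4 (extremely) scale."""
    if len(item_scores) != 20:
        raise ValueError("PCL-5 has 20 items")
    if any(not 0 <= s <= 4 for s in item_scores):
        raise ValueError("each item must be rated 0-4")
    return sum(item_scores)

def probable_ptsd(item_scores):
    """True when the total score meets or exceeds the cutoff."""
    return pcl5_total(item_scores) >= PROBABLE_PTSD_CUTOFF

ratings = [2] * 10 + [1] * 10
print(pcl5_total(ratings), probable_ptsd(ratings))  # → 30 False
```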

Patient health questionnaire (PHQ-9)

The PHQ-9 is a 9-item self-report measure of depressive symptoms (Kroenke, Spitzer, & Williams, Reference Kroenke, Spitzer and Williams2001). Individuals were asked to rate how much they were bothered by their depression symptoms from 0 (not at all) to 3 (nearly every day). For the present study, depression symptoms were assessed for the past two weeks at baseline. Cronbach's alpha within the current sample was 0.810.

Posttrauma cognition inventory (PTCI)

The PTCI is a 33-item self-report measure of negative posttrauma cognitions, administered at baseline (Foa, Ehlers, Clark, Tolin, & Orsillo, Reference Foa, Ehlers, Clark, Tolin and Orsillo1999). Individuals were asked to rate how much they agreed or disagreed with a range of beliefs from 1 (totally disagree) to 7 (totally agree). Cronbach's alpha among study participants was 0.951.

Alcohol use disorder identification test – consumption (AUDIT-C)

The AUDIT-C is a 3-item self-report measure of alcohol consumption (Bush, Kivlahan, McDonell, Fihn, & Bradley, Reference Bush, Kivlahan, McDonell, Fihn and Bradley1998). Individuals were asked to rate how often they drank, how many drinks they had when they were drinking, and how often they had six or more drinks on one occasion. The AUDIT-C assessed alcohol consumption over the past year and was administered at baseline. Cronbach's alpha in this study was 0.866.

Neurobehavioral symptom inventory – 10-item validity scale (VAL-10)

The VAL-10 is a 10-item self-report scale made up of items from the Neurobehavioral Symptom Inventory, assessed at baseline (Vanderploeg et al., Reference Vanderploeg, Cooper, Belanger, Donnell, Kennedy, Hopewell and Scott2014). The items were selected to identify individuals who may be over-reporting neurobehavioral symptoms. Cronbach's alpha among study participants was 0.907.

Analytic strategy

We employed three mixed-effects-based prediction models designed to accommodate the longitudinal structure inherent to assessment of symptom severity during and at the end of the treatment program.Footnote 2 The first, Mixed Effects Bayesian Additive Regression Trees (MixedBART), is a recently developed non-parametric Bayesian approach that accommodates random effects within machine learning. This approach utilizes an ensemble of decision trees to predict response. Priors, which represent existing beliefs regarding quantities or distributions in Bayesian analysis, are placed on program parameters, including variable selection probabilities. MixedBART and BART default parameters regarding priors and number of trees, without extensive cross-validation, are generally adequate and outperform other machine learning and statistical methods under many conditions. Based on insight from previous work (Held et al., Reference Held, Schubert, Pridgen, Kovacevic, Montes, Christ and Smith2022b), we used Dirichlet, rather than uniform, priors for variable selection probabilities. This allows models to adapt to the existence of more useful predictors in the dataset, thus accommodating the expectation that clinical features and updated PTSD severity values are likely to be more useful in prediction than demographic features (Held et al., Reference Held, Smith, Bagley, Kovacevic, Steigerwald, Van Horn and Karnik2021, Reference Held, Schubert, Pridgen, Kovacevic, Montes, Christ and Smith2022b). As a Bayesian analytic method, MixedBART approaches inference by sampling from the posterior distribution generated computationally utilizing existing data and relevant priors. We used 10 000 posterior draws with 5000 burn-in draws, which was a conservative approach compared to other applications of MixedBART (Spanbauer & Sparapani, Reference Spanbauer and Sparapani2021), but aligns with common practices and recommendations in Bayesian analysis (e.g. Raftery & Lewis, Reference Raftery and Lewis1991) and resulted in good overall model convergence. Based on prior recommendations for BART approaches, we employed 200 trees (Chipman, George, & McCulloch, Reference Chipman, George and McCulloch2010), though we explored reduced numbers of trees to assess the importance of individual features, given the tendency for BART models to incorporate more irrelevant features when the number of trees is large. However, due to overall consistency across models with differing numbers of trees, we report results of the primary models utilizing 200 trees here.Footnote 3

The second approach utilized mixed-effects random forest (MERF; Hajjem et al., Reference Hajjem, Bellavance and Larocque2011, Reference Hajjem, Bellavance and Larocque2014). This tree-based random forest approach accommodates random effects for longitudinal or otherwise clustered data utilizing the expectation-maximization (EM) algorithm, a maximum likelihood estimation method that alternates between estimating latent variables and optimizing the model until convergence is reached. Five-fold cross-validation was applied on the training set. We also progressively increased the numbers of trees and iterations in training set model development, though the utility of such increases appeared to plateau beyond approximately 150 iterations and 200–300 trees.
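The EM-style alternation behind MERF-like models can be illustrated with a deliberately minimal toy: repeatedly (1) fit a fixed-effects learner after removing the current random-effect estimates, then (2) re-estimate a per-cluster random intercept from the learner's residuals. This is a pedagogical sketch only; the "learner" here is a trivial global-mean predictor standing in for the random forest, and all names are hypothetical.

```python
# Toy sketch of MERF-style EM alternation (not the merf library):
# alternate between fitting the fixed part on random-effect-adjusted
# outcomes and re-estimating per-cluster random intercepts.
from collections import defaultdict

def fit_merf_like(X, clusters, y, n_iter=20):
    b = defaultdict(float)  # per-cluster random intercepts, start at 0
    fixed_pred = 0.0        # trivial fixed-effects "model": a global mean
    for _ in range(n_iter):
        # Step 1: remove current random effects, refit the fixed part.
        adjusted = [yi - b[c] for yi, c in zip(y, clusters)]
        fixed_pred = sum(adjusted) / len(adjusted)
        # Step 2: re-estimate each cluster's intercept from residuals.
        residuals = defaultdict(list)
        for yi, c in zip(y, clusters):
            residuals[c].append(yi - fixed_pred)
        b = defaultdict(float,
                        {c: sum(r) / len(r) for c, r in residuals.items()})
    return fixed_pred, dict(b)

y = [10, 12, 20, 22]
clusters = ["a", "a", "b", "b"]
mu, b = fit_merf_like(None, clusters, y)
print(round(mu, 2), {c: round(v, 2) for c, v in b.items()})  # → 16 {'a': -5.0, 'b': 5.0}
```

In a real MERF, Step 1 would fit a random forest on the feature matrix `X`, and the variance components would be estimated alongside the intercepts; the alternation structure is the same.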

Finally, linear mixed effects regression models (LMMs) were also explored to compare machine learning model accuracy to that of a traditional statistical model using the same data. This is an accepted approach to modeling longitudinal data due to its accommodation of random effects and missing data, and its less restrictive assumptions (Hedeker & Gibbons, Reference Hedeker and Gibbons2006). For this analysis, we examined both models with all predictors and models utilizing only the top five predictors as identified by the MixedBART and MERF machine learning programs, which both identified the same five predictors. Since the use of the top five predictors resulted in models that were as accurate as those including more, or all, covariates at every timepoint we examined, we present only these results of the LMM approach. The same strategy of creating and testing a model on the training and test sets, respectively, was utilized in order to remain comparable to the machine learning models. Cross-validation was not used in linear mixed model analyses to best approximate typical applied statistical use of this longitudinal approach.

We randomly split the data from the 3-week ITP approximately 60:40 into training (n = 232) and test (n = 130) datasets. This random split was implemented at the participant-level due to the nesting of timepoint measurements within individuals. Training and test sets did not differ on any demographic or clinical variable (ps > 0.10). The training set was then used to train machine learning and LMM models with all baseline demographic and clinical data (see Table 2) as well as lagged PCL-5 scores predicting post-treatment PCL-5. Following training, we examined prediction accuracy on the test set at baseline as well as at each assessment timepoint (see online Supplementary Table S4 for accuracy using training data). Thus, when examining accuracy on the test set at baseline only baseline predictors were used to predict post-treatment PTSD severity. On program days 3, 5, 6, 8, 10, 11, and 13 all baseline features as well as PCL-5 scores for all days up to, and including, that day's PCL-5 measurement were used. Only PTSD severity score was continuously updated throughout the program. Accuracy of predictions was assessed via R2 and RMSE. Each analytic approach models change longitudinally, though our primary emphasis here is on prediction of post-treatment PTSD severity measurement. For MixedBART these values were obtained via the mean of each participant's predicted values against actual post-treatment PCL-5 scores.
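The participant-level split and the accuracy metrics described above can be sketched as follows. This is an illustrative sketch, not the study's code: the seed, the helper names, and the exact rounding of the 60:40 split are assumptions (so the resulting counts differ slightly from the study's n = 232 / n = 130).

```python
# Hypothetical sketch: split at the participant level (so all of a
# person's timepoints land in one partition), plus R^2 and RMSE helpers.
import math
import random

def split_by_participant(participant_ids, train_frac=0.6, seed=0):
    """Randomly assign whole participants to training or test sets."""
    ids = sorted(set(participant_ids))
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_train = round(train_frac * len(ids))
    return set(ids[:n_train]), set(ids[n_train:])

def rmse(actual, predicted):
    """Root mean squared error of predictions."""
    return math.sqrt(sum((a - p) ** 2
                         for a, p in zip(actual, predicted)) / len(actual))

def r_squared(actual, predicted):
    """Proportion of outcome variability accounted for by predictions."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

train_ids, test_ids = split_by_participant(range(362))
print(len(train_ids), len(test_ids))  # → 217 145
```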

Due to the importance of external validation of prediction models, we examined the predictive accuracy of these three models in a sample of 108 participants who had completed a separate, equally established, 2-week CPT-based ITP with similar programming combining individual CPT with adjunctive services, which has previously been demonstrated to be non-inferior to the 3-week program (Held et al., Reference Held, Smith, Pridgen, Coleman and Klassen2022c). Due to the differences in timeline between the two ITPs, assessment timepoints were mapped onto the existing timepoints based on the proportion of the program that had been completed at each measurement timepoint. The three longitudinal prediction models that were generated with 3-week training data were then used to predict post-treatment PTSD severity in the 2-week ITP using the same updating-prediction model approach. In the 2-week ITP we focused on baseline and mid-program (beginning of week 2) predictions of post-treatment PTSD severity. MixedBART and LMM analyses were conducted using the MxBART and lme4 packages in R version 4.1.1, and MERF analyses were conducted using the merf package in Python version 3.6. Figures were created using R.
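The proportional mapping of 2-week assessment days onto 3-week timepoints can be sketched as below. This is a hedged illustration, not the authors' procedure: the day lists (baseline coded as day 0, post-treatment as the final day) and the nearest-proportion rule are assumptions.

```python
# Illustrative sketch: map an assessment day in one program onto the
# closest assessment day (by proportion of program completed) in another.

def map_timepoint(day, program_length, target_days, target_length):
    """Return the target-program day whose completion proportion is
    closest to day / program_length."""
    proportion = day / program_length
    return min(target_days, key=lambda d: abs(d / target_length - proportion))

# Assumed 3-week schedule: baseline (day 0), the PCL-5 days listed in
# Measures, and post-treatment coded as day 15.
three_week_days = [0, 2, 3, 5, 6, 8, 10, 11, 13, 15]

# Mid-point of an assumed 10-day (2-week) program maps near mid-program
# of the 3-week schedule:
print(map_timepoint(5, 10, three_week_days, 15))  # → 8
```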

Results

Veterans in the 3-week ITP improved in PTSD severity by an average of 21.57 points (s.d. = 18.80). Approximately 70% (n = 263) improved by at least 10 points, with 51% (n = 185) finishing treatment below the PCL-5 cutoff of 33. As illustrated in Fig. 1, this constituted meaningful overall change across program timepoints, though considerable variability existed in the amount of change, particularly as treatment progressed. This increase in variability across time is generally expected and illustrates the effect of participants' differential improvement during treatment. The demographic and clinical variables in the models other than PCL-5 accounted for approximately 6% of the variability in treatment response throughout the program beyond what PTSD severity accounted for, indicating that both initial accuracy and improvements in predictions were largely driven by PTSD severity and updated PTSD severity measurements.

Fig. 1. Distributions of PTSD severity across all participants over time.

Note. Violin plot illustrates distribution at each timepoint, with internal box plots representing median and interquartile range.

Both machine learning approaches identified PCL-5, time, baseline PTCI, baseline PHQ-9, and CAPS-5 Intrusions as the most important or utilized features in predicting PTSD severity. Thus, these were used in subsequent LMMs for comparison (see online Supplementary Table S5 for comparison of LMM with all features and only these features). The three analytic approaches to predicting post-treatment PTSD severity closely aligned with regards to accuracy. Baseline predictions of final PCL-5 score on the test sample yielded an overall R2 of 0.18 across all three models (see Table 3). As expected, as updated PTSD severity scores became available during treatment, the accuracy of final timepoint predictions increased substantially (see Fig. 2). At the start of the second week of treatment (Day 6), all models were able to account for roughly half of the variability in post-treatment PTSD severity. This could potentially represent a milestone at which current treatment progress could be reliably determined in the 3-week ITP. By mid-program (Day 8), R2 exceeded 0.60 for all analytic methods.

Fig. 2. Test set PTSD severity and predictive accuracy over time.

Note. Error bars represent 95% confidence intervals.

Table 3. Longitudinal updating models comparison

a LMMs including more predictors were examined but did not outperform the five-predictor model.

b Baseline model contained all baseline data, including intake PCL-5 score.

Results of external validation with the 2-week ITP suggest model predictions were similar in accuracy to those in the 3-week ITP despite the models not being trained on these data (see Table 4). Baseline predictions generally accounted for about 20% of the variability in post-treatment PTSD severity. Including PCL-5 data up to mid-program made it possible to account for over half of the variability in final PTSD severity by that point. This supports the generalizability of model predictions to similar, but external, clinical data.

Table 4. External validation results

a Baseline model contained all baseline data, including intake PCL-5 score, mid-program predictions included baseline data plus PCL-5 scores to mid-program.

Discussion

Our results support the utility of updating prediction models of PTSD severity as a potential clinical tool for assessing PTSD treatment progress and for helping identify timepoints at which to alter a participant's treatment approach. Before the 3-week ITP's midpoint, each model was able to account for a large proportion of the variability in post-treatment PTSD severity. This remained true even in an external 2-week ITP sample. These models can provide valuable clinical information that supports a precision-medicine approach to PTSD treatment, as the majority of those identified as likely non-responders with some certainty at mid-program were found to be non-responders at the end of treatment (see online Supplementary Fig. S1). Thus, by deploying such relatively low-cost models in clinical practice, a clinician could obtain acceptable near real-time estimates of their patient's likely endpoint PTSD severity. As such, continuously updating prediction models may be helpful in PTSD treatment in general and may be particularly useful for intensive treatments given the rapid nature of this treatment approach and the limited time clinicians have to evaluate data before needing to make treatment decisions.

As illustrated, and as is commonly seen in treatment, improvement was far from uniform, with the amount of variability in reported PTSD severity increasing across time. Though generally expected in longitudinal studies, this highlights the need for increased attention to individual change, and the utility of assessing such change during treatment. Indeed, change in PTSD severity during the program was clearly the most effective predictor of PTSD severity at endpoint. Other clinical and demographic predictors accounted for approximately 6% of the variability in endpoint PTSD severity, with baseline PTSD severity accounting for both the remaining 14% at baseline and the improvements in these predictions as additional severity measurements became available. Thus, the best predictor of heterogeneity in total treatment response is clearly the amount of improvement an individual makes during the program. This highlights the importance of models that can effectively accommodate such change, along with the additional assumptions inherent to longitudinal modeling, rather than basing treatment decisions entirely on baseline predictors or on a pre-determined amount of change to be reached by mid-treatment without accounting for change trajectories.

Results obtained here do not support the superiority of any specific analytic method utilized, though all models performed at least as well as machine learning models that ignore the longitudinal structure of these data, without the potential bias that can arise when ignoring the lack of independence of observations over time (see online Supplementary Table S2). Linear mixed effects regression models were capable of predicting PTSD outcome severity with the same degree of accuracy as machine learning models. This result joins a wealth of evidence that traditional statistical approaches can perform similarly to machine learning models (Cho et al., Reference Cho, Austin, Ross, Abdel-Qadir, Chicco, Tomlinson and Lee2021; Christodoulou et al., Reference Christodoulou, Ma, Collins, Steyerberg, Verbakel and Van Calster2019; Li et al., Reference Li, Zhou, Dong, Fu, Li, Luan and Peng2021), though, to our knowledge, this study represents the first such application in a continuously updating prediction model for psychiatric treatment response.

Despite similarities in prediction accuracy, each longitudinal approach offers unique benefits. LMMs provide easily interpretable slope coefficients and metrics regarding the significance of individual predictors. Assumptions, as well as aspects of the longitudinal structure such as covariance structure or autocorrelation, are easily assessed with this approach, and missing data are easily accommodated. Conversely, both machine learning approaches may more readily accommodate additional predictors in applications involving high-dimensional datasets or multiple correlated predictors. An additional well-known benefit of Bayesian approaches is the ability to quantify and visualize uncertainty in estimates. Although the mean predicted value for each participant from the posterior is reported in model output and was used above to obtain model accuracy metrics, credible intervals can also be easily obtained to assess the degree of uncertainty in predictions. However, we found that MixedBART yielded overly optimistic estimates of variability around prediction means, so we caution against blindly using program-generated credible intervals (see footnote 4).
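One way to check whether model-generated credible intervals are well calibrated, as discussed above, is to compute their empirical coverage on held-out outcomes. The following numpy sketch uses simulated stand-ins (not MixedBART output); the posterior draws are deliberately under-dispersed relative to the true outcome spread, mimicking the overly narrow intervals described in the text.

```python
import numpy as np

rng = np.random.default_rng(1)
# Simulated posterior predictive draws (rows) for 200 held-out patients (cols);
# the draws' spread (SD 5) understates the true outcome spread (SD 6).
draws = rng.normal(30, 5, size=(4000, 200))
actual = rng.normal(30, 6, size=200)

lo, hi = np.percentile(draws, [3, 97], axis=0)  # 94% credible intervals
coverage = np.mean((actual >= lo) & (actual <= hi))
print(coverage)  # falls short of the nominal 0.94 when draws are too narrow
```

Coverage substantially below the nominal level on a test set is the signal that intervals should be widened or recalibrated before being reported to clinicians.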

A number of limitations need to be acknowledged. The use of self-report assessments may have increased variability in reporting, and the fact that PTSD severity was the only continuously updated variable may be viewed as a limitation, though our prior work suggested that updating other variables did not meaningfully improve predictions once lagged PCL-5 scores were included. Additionally, because PTSD severity measurements over time explained most of the variability in treatment response, the potential roles of other contributing factors may have been obscured. However, this also highlights the importance of utilizing such updated severity information. Sample size considerations for some demographic variables, such as race, reduced power for intensive examination of demographic moderators of treatment response, though our prior work has indicated that such demographic variables generally did not impact treatment response in either the 3- or 2-week ITP (Held et al., 2022c). Only ITP completers were examined, although completer bias is unlikely: completion rates were quite high (>90%), and completers and non-completers did not differ on any baseline demographic or clinical variables in either sample, except for a difference in race in the 3-week sample. Although the use of a 2-week ITP validation sample is a strength of the current analysis, it may resemble the original sample in ways that a truly external sample would not. For example, although the treatment schedule differed between the 3- and 2-week ITPs, with the latter drastically reducing group treatment and adjunctive service components, both centered around CPT (Held et al., 2022c). Nevertheless, demographic comparisons between the two programs indicated significant differences in sex, MST status, race, service era, and baseline PTSD severity (see Table 1).
Finally, although many of the exclusion criteria resemble those used in other PTSD treatment studies, some were specific to ITPs (e.g. stable housing, ability to travel) and may limit the generalizability of the findings presented here.

Conclusion

Considerable additional research is warranted to better understand specific individual factors that could interact with the chosen treatment approach, so that treatment can be further individualized. However, our demonstration of continuously updating prediction models, whether based on machine learning or on standard longitudinal statistical approaches, to assess progress and predict PTSD treatment outcomes shows promise for precision medicine in the field of PTSD. Such models can provide clinicians with information about which patients may progress through treatment as expected and which may benefit from treatment alterations based on their predicted response. Using the models presented here, such decisions can be made relatively reliably by mid-treatment in the two ITPs we examined. Future research should examine the feasibility of integrating these models into clinical care, systematically test whether treatment modifications for individuals predicted to have less favorable responses can improve their outcomes, and determine whether these findings generalize to more traditional weekly treatment and/or other evidence-based PTSD treatments.

Supplementary material

The supplementary material for this article can be found at https://doi.org/10.1017/S0033291722002689.

Financial support

This work was supported by the Road Home Program at Rush's partnership with Wounded Warrior Project®.

Conflict of interest

Philip Held receives grant support from Wounded Warrior Project® and RTI International. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health, Wounded Warrior Project®, or any other funding agency.

All other authors declare that they have no competing interests.

Ethical standards

The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.

Footnotes


1 Some participants (i.e., 27% of the original sample) were excluded due to missing covariate data. Data were missing for a number of reasons, including participants missing assessment sessions due to illness. However, as this is a sample of program completers, covariate and outcome missingness was not associated with any relevant predictors or with PTSD severity and can likely be considered missing at random (MAR). Because of our interest in approximating clinical applications, which may require complete cases in machine learning approaches, only complete-case results using listwise deletion are reported here. As a sensitivity analysis, we examined the robustness of the presented results against results using multiple imputation (MICE) of baseline covariates. See online Supplementary Table S1 for the results of this sensitivity analysis.

2 Traditional machine learning models using updated PCL-5 scores at each timepoint, with the same training and test sets predicting PCL-5 at final measurement, were also explored for comparison. These included Random Forest with 5-fold cross-validation on the training set. We provide a sample of these results in online Supplementary Table S2. Performance was similar, though R² values were generally slightly lower, in these models ignoring the longitudinal structure of the data.
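The comparison model described in this footnote can be sketched as follows. The features and outcome are simulated placeholders rather than the study's actual covariates, and the sketch assumes scikit-learn is available.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
# Stand-in feature matrix: column 0 plays the role of the latest PCL-5 score,
# the rest play the role of baseline covariates.
X = rng.normal(size=(300, 10))
y = 40 + 6 * X[:, 0] + rng.normal(0, 3, 300)  # simulated endpoint severity

# Cross-sectional Random Forest with 5-fold cross-validation: each observation
# is treated as independent, ignoring the longitudinal structure.
rf = RandomForestRegressor(n_estimators=200, random_state=0)
r2_scores = cross_val_score(rf, X, y, cv=5, scoring="r2")
print(r2_scores.mean())
```

Because the repeated measurements within a patient are treated as independent here, such a model can look competitive on accuracy metrics while still risking the bias noted in the main text.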

3 Variations on priors for random effects error estimates were also explored as recommended, including degrees of freedom for the inverse chi-squared distribution between three and ten, as well as priors on the probabilistic structure of the regression trees selected via cross-validation, including allowing variable selection probabilities to be equal across predictors. Test-set performance metrics were largely unaffected, except that models assuming equal variable selection probabilities were consistently poorer predictors than those using Dirichlet priors (see online Supplementary Table S3).

4 Although default settings were used for the performance metrics, calibration resulted in more accurate test-set predictions. The estimates provided with regard to credible intervals are those obtained following tuning of the priors for variability estimates and calibration via linear transformation of the predicted values. Regardless of calibration status, we found that credible intervals were often too narrow: between 74 and 86% of the MixedBART-generated 94% credible intervals contained the actual final PCL-5 scores.
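The linear recalibration mentioned in this footnote amounts to regressing observed outcomes on raw predictions in a calibration split and then applying that linear mapping to new predictions. A sketch with simulated, systematically miscalibrated predictions follows; all values are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
pred_cal = rng.normal(32, 8, 150)                     # raw model predictions
obs_cal = 5 + 0.8 * pred_cal + rng.normal(0, 4, 150)  # observed outcomes

# Fit observed = a + b * predicted on the calibration split ...
b, a = np.polyfit(pred_cal, obs_cal, 1)

# ... then linearly transform new test-set predictions
pred_test = rng.normal(32, 8, 50)
pred_test_cal = a + b * pred_test
print(b, a)
```

Note that this transformation shifts and rescales the point predictions only; as the footnote indicates, interval width must be assessed separately, since recalibrated means can coexist with intervals that remain too narrow.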

References

Aafjes-van Doorn, K., Kamsteeg, C., Bate, J., & Aafjes, M. (2021). A scoping review of machine learning in psychotherapy research. Psychotherapy Research, 31(1), 92–116. https://doi.org/10.1080/10503307.2020.1808729
APA. (2017). Clinical practice guideline for the treatment of posttraumatic stress disorder (PTSD) in adults. American Psychological Association. https://www.apa.org/ptsd-guideline
Asmundson, G. J. G., Thorisdottir, A. S., Roden-Foreman, J. W., Baird, S. O., Witcraft, S. M., Stein, A. T., … Powers, M. B. (2019). A meta-analytic review of cognitive processing therapy for adults with posttraumatic stress disorder. Cognitive Behaviour Therapy, 48(1), 1–14. https://doi.org/10.1080/16506073.2018.1522371
Blevins, C. A., Weathers, F. W., Davis, M. T., Witte, T. K., & Domino, J. L. (2015). The posttraumatic stress disorder checklist for DSM-5 (PCL-5): Development and initial psychometric evaluation. Journal of Traumatic Stress, 28(6), 489–498. https://doi.org/10.1002/jts.22059
Bovin, M. J., Marx, B. P., Weathers, F. W., Gallagher, M. W., Rodriguez, P., Schnurr, P. P., & Keane, T. M. (2016). Psychometric properties of the PTSD checklist for diagnostic and statistical manual of mental disorders–fifth edition (PCL-5) in veterans. Psychological Assessment, 28(11), 1379–1391. https://doi.org/10.1037/pas0000254
Bush, K., Kivlahan, D. R., McDonell, M. B., Fihn, S. D., & Bradley, K. A. (1998). The AUDIT alcohol consumption questions (AUDIT-C): An effective brief screening test for problem drinking. Archives of Internal Medicine, 158(16), 1789–1795. https://doi.org/10.1001/archinte.158.16.1789
Chekroud, A. M., Bondar, J., Delgadillo, J., Doherty, G., Wasil, A., Fokkema, M., … Choi, K. (2021). The promise of machine learning in predicting treatment outcomes in psychiatry. World Psychiatry, 20(2), 154–170. https://doi.org/10.1002/wps.20882
Chipman, H. A., George, E. I., & McCulloch, R. E. (2010). BART: Bayesian additive regression trees. The Annals of Applied Statistics, 4(1), 266–298. https://doi.org/10.1214/09-AOAS285
Cho, S. M., Austin, P. C., Ross, H. J., Abdel-Qadir, H., Chicco, D., Tomlinson, G., … Lee, D. S. (2021). Machine learning compared with conventional statistical models for predicting myocardial infarction readmission and mortality: A systematic review. Canadian Journal of Cardiology, 37(8), 1207–1214. https://doi.org/10.1016/j.cjca.2021.02.020
Christodoulou, E., Ma, J., Collins, G. S., Steyerberg, E. W., Verbakel, J. Y., & Van Calster, B. (2019). A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. Journal of Clinical Epidemiology, 110, 12–22. https://doi.org/10.1016/j.jclinepi.2019.02.004
Cicerone, K. D., & Kalmar, K. (1995). Persistent postconcussion syndrome: The structure of subjective complaints after mild traumatic brain injury. Journal of Head Trauma Rehabilitation, 10(3), 1–17. https://doi.org/10.1097/00001199-199510030-00002
Delgadillo, J. (2021). Machine learning: A primer for psychotherapy researchers. Psychotherapy Research, 31(1), 1–4. https://doi.org/10.1080/10503307.2020.1859638
Dewar, M., Paradis, A., & Fortin, C. A. (2020). Identifying trajectories and predictors of response to psychotherapy for post-traumatic stress disorder in adults: A systematic review of literature. Canadian Journal of Psychiatry. Revue Canadienne de Psychiatrie, 65(2), 71–86. https://doi.org/10.1177/0706743719875602
Foa, E. B., Ehlers, A., Clark, D. M., Tolin, D. F., & Orsillo, S. M. (1999). The posttraumatic cognitions inventory (PTCI): Development and validation. Psychological Assessment, 11(3), 303–314. https://doi.org/10.1037/1040-3590.11.3.303
Galovski, T. E., Harik, J. M., Blain, L. M., Farmer, C., Turner, D., & Houle, T. (2016). Identifying patterns and predictors of PTSD and depressive symptom change during cognitive processing therapy. Cognitive Therapy and Research, 40(5), 617–626. https://doi.org/10.1007/s10608-016-9770-4
Galovski, T. E., Werner, K. B., Weaver, T. L., Morris, K. L., Dondanville, K. A., Nanney, J., … Iverson, K. M. (2021). Massed cognitive processing therapy for posttraumatic stress disorder in women survivors of intimate partner violence. Psychological Trauma: Theory, Research, Practice and Policy, 769–779. https://doi.org/10.1037/tra0001100
Hajjem, A., Bellavance, F., & Larocque, D. (2011). Mixed effects regression trees for clustered data. Statistics & Probability Letters, 81(4), 451–459. https://doi.org/10.1016/j.spl.2010.12.003
Hajjem, A., Bellavance, F., & Larocque, D. (2014). Mixed-effects random forest for clustered data. Journal of Statistical Computation and Simulation, 84(6), 1313–1328. https://doi.org/10.1080/00949655.2012.741599
Hedeker, D. R., & Gibbons, R. D. (2006). Longitudinal data analysis. Hoboken, NJ: Wiley-Interscience.
Held, P., Bagley, J. M., Klassen, B. J., & Pollack, M. H. (2019). Intensively delivered cognitive-behavioral therapies: An overview of a promising treatment delivery format for PTSD and other mental health disorders. Psychiatric Annals, 49(8), 339–342. https://doi.org/10.3928/00485713-20190711-01
Held, P., Klassen, B. J., Boley, R. A., Wiltsey Stirman, S., Smith, D. L., Brennan, M. B., … Zalta, A. K. (2020a). Feasibility of a 3-week intensive treatment program for service members and veterans with PTSD. Psychological Trauma: Theory, Research, Practice and Policy, 12(4), 422–430. https://doi.org/10.1037/tra0000485
Held, P., Kovacevic, M., Petrey, K., Meade, E. A., Pridgen, S., Montes, M., … Karnik, N. S. (2022a). Treating posttraumatic stress disorder at home in a single week using 1-week virtual massed cognitive processing therapy. Journal of Traumatic Stress, 35, 1215–1225. https://doi.org/10.1002/jts.22831
Held, P., Schubert, R. A., Pridgen, S., Kovacevic, M., Montes, M., Christ, N. M., … Smith, D. L. (2022b). Who will respond to intensive PTSD treatment? A machine learning approach to predicting response prior to starting treatment. Journal of Psychiatric Research, 151, 78–85. https://doi.org/10.1016/j.jpsychires.2022.03.066
Held, P., Smith, D. L., Bagley, J. M., Kovacevic, M., Steigerwald, V. L., Van Horn, R., & Karnik, N. S. (2021). Treatment response trajectories in a three-week CPT-based intensive treatment for veterans with PTSD. Journal of Psychiatric Research, 141, 226–232. https://doi.org/10.1016/j.jpsychires.2021.07.004
Held, P., Smith, D. L., Pridgen, S., Coleman, J. A., & Klassen, B. J. (2022c). More is not always better: 2 weeks of intensive cognitive processing therapy-based treatment are noninferior to 3 weeks. Psychological Trauma: Theory, Research, Practice, and Policy. https://doi.org/10.1037/tra0001257
Held, P., Zalta, A. K., Smith, D. L., Bagley, J. M., Steigerwald, V. L., Boley, R. A., … Pollack, M. H. (2020b). Maintenance of treatment gains up to 12-months following a three-week cognitive processing therapy-based intensive PTSD treatment programme for veterans. European Journal of Psychotraumatology, 11(1), 1789324. https://doi.org/10.1080/20008198.2020.1789324
Hilbert, K., Kunas, S. L., Lueken, U., Kathmann, N., Fydrich, T., & Fehm, L. (2020). Predicting cognitive behavioral therapy outcome in the outpatient sector based on clinical routine data: A machine learning approach. Behaviour Research and Therapy, 124, 103530. https://doi.org/10.1016/j.brat.2019.103530
ISTSS. (2017). Posttraumatic stress disorder prevention and treatment guidelines: Methodology and recommendations. International Society of Traumatic Stress Studies. https://istss.org/clinical-resources/treating-trauma/new-istss-prevention-and-treatmentguidelines
Kroenke, K., Spitzer, R. L., & Williams, J. B. W. (2001). The PHQ-9: Validity of a brief depression severity measure. Journal of General Internal Medicine, 16(9), 606–613. https://doi.org/10.1046/j.1525-1497.2001.016009606.x
Li, J., Zhou, Z., Dong, J., Fu, Y., Li, Y., Luan, Z., & Peng, X. (2021). Predicting breast cancer 5-year survival using machine learning: A systematic review. PLoS ONE, 16(4), e0250370. https://doi.org/10.1371/journal.pone.0250370
Lloyd, D., Couineau, A.-L., Hawkins, K., Kartal, D., Nixon, R. D. V., Perry, D., & Forbes, D. (2015). Preliminary outcomes of implementing cognitive processing therapy for posttraumatic stress disorder across a national veterans' treatment service. The Journal of Clinical Psychiatry, 76(11), e1405–e1409. https://doi.org/10.4088/JCP.14m09139
Monson, C. M., Schnurr, P. P., Resick, P. A., Friedman, M. J., Young-Xu, Y., & Stevens, S. P. (2006). Cognitive processing therapy for veterans with military-related posttraumatic stress disorder. Journal of Consulting and Clinical Psychology, 74(5), 898–907. https://doi.org/10.1037/0022-006X.74.5.898
Nixon, R. D. V., King, M. W., Smith, B. N., Gradus, J. L., Resick, P. A., & Galovski, T. E. (2021). Predicting response to cognitive processing therapy for PTSD: A machine-learning approach. Behaviour Research and Therapy, 144, 103920. https://doi.org/10.1016/j.brat.2021.103920
Raftery, A. E., & Lewis, S. (1991). How many iterations in the Gibbs sampler? Fort Belvoir, VA: Defense Technical Information Center. https://doi.org/10.21236/ADA640705
Resick, P. A., Monson, C. M., & Chard, K. M. (2017a). Cognitive processing therapy for PTSD: A comprehensive manual. New York: Guilford Press.
Resick, P. A., Nishith, P., Weaver, T. L., Astin, M. C., & Feuer, C. A. (2002). A comparison of cognitive-processing therapy with prolonged exposure and a waiting condition for the treatment of chronic posttraumatic stress disorder in female rape victims. Journal of Consulting and Clinical Psychology, 70(4), 867–879.
Resick, P. A., Uhlmansiek, M. O., Clum, G. A., Galovski, T. E., Scher, C. D., & Young-Xu, Y. (2008). A randomized clinical trial to dismantle components of cognitive processing therapy for posttraumatic stress disorder in female victims of interpersonal violence. Journal of Consulting and Clinical Psychology, 76(2), 243–258. https://doi.org/10.1037/0022-006X.76.2.243
Resick, P. A., Wachen, J. S., Dondanville, K. A., Pruiksma, K. E., Yarvis, J. S., Peterson, A. L., … Young-McCaughan, S. (2017b). Effect of group vs individual cognitive processing therapy in active-duty military seeking treatment for posttraumatic stress disorder: A randomized clinical trial. JAMA Psychiatry, 74(1), 28. https://doi.org/10.1001/jamapsychiatry.2016.2729
Resick, P. A., Wachen, J. S., Mintz, J., Young-McCaughan, S., Roache, J. D., Borah, A. M., … Peterson, A. L. (2015). A randomized clinical trial of group cognitive processing therapy compared with group present-centered therapy for PTSD among active duty military personnel. Journal of Consulting and Clinical Psychology, 83(6), 1058–1068. https://doi.org/10.1037/ccp0000016
Resick, P. A., Williams, L. F., Suvak, M. K., Monson, C. M., & Gradus, J. L. (2012). Long-term outcomes of cognitive-behavioral treatments for posttraumatic stress disorder among female rape survivors. Journal of Consulting and Clinical Psychology, 80(2), 201–210. https://doi.org/10.1037/a0026602
Schumm, J. A., Walter, K. H., & Chard, K. M. (2013). Latent class differences explain variability in PTSD symptom changes during cognitive processing therapy for veterans. Psychological Trauma: Theory, Research, Practice, and Policy, 5(6), 536–544. https://doi.org/10.1037/a0030359
Shatte, A. B., Hutchinson, D. M., & Teague, S. J. (2019). Machine learning in mental health: A scoping review of methods and applications. Psychological Medicine, 49(9), 1426–1448.
Spanbauer, C., & Sparapani, R. (2021). Nonparametric machine learning for precision medicine with longitudinal clinical trials and Bayesian additive regression trees with mixed models. Statistics in Medicine, 40(11), 2665–2691. https://doi.org/10.1002/sim.8924
VA/DoD. (2017). VA/DOD clinical practice guideline for the management of posttraumatic stress disorder and acute stress disorder. Department of Veterans Affairs, Department of Defense. https://www.healthquality.va.gov/guidelines/MH/ptsd/VADoDPTSDCPGFinal.pdf
Vanderploeg, R. D., Cooper, D. B., Belanger, H. G., Donnell, A. J., Kennedy, J. E., Hopewell, C. A., & Scott, S. G. (2014). Screening for postdeployment conditions: Development and cross-validation of an embedded validity scale in the neurobehavioral symptom inventory. Journal of Head Trauma Rehabilitation, 29(1), 1–10. https://doi.org/10.1097/HTR.0b013e318281966e
Varker, T., Kartal, D., Watson, L., Freijah, I., O'Donnell, M., Forbes, D., … Hinton, M. (2020). Defining response and nonresponse to posttraumatic stress disorder treatments: A systematic review. Clinical Psychology: Science and Practice, 27(4), e12355. https://doi.org/10.1037/h0101781
Weathers, F. W., Bovin, M. J., Lee, D. J., Sloan, D. M., Schnurr, P. P., Kaloupek, D. G., … Marx, B. P. (2018). The clinician-administered PTSD scale for DSM-5 (CAPS-5): Development and initial psychometric evaluation in military veterans. Psychological Assessment, 30(3), 383–395. https://doi.org/10.1037/pas0000486
Weathers, F. W., Litz, B. T., Keane, T. M., Palmieri, P. A., Marx, B. P., & Schnurr, P. P. (2013). The PTSD Checklist for DSM-5 (PCL-5). Scale available from the National Center for PTSD at www.ptsd.va.gov
Zalta, A. K., Held, P., Smith, D. L., Klassen, B. J., Lofgreen, A. M., Normand, P. S., … Karnik, N. S. (2018). Evaluating patterns and predictors of symptom change during a three-week intensive outpatient treatment for veterans with PTSD. BMC Psychiatry, 18(1), 242. https://doi.org/10.1186/s12888-018-1816-6
Table 1. Demographic characteristics

Table 2. List of features used in machine learning models

Fig. 1. Distributions of PTSD severity across all participants over time. Note: Violin plots illustrate the distribution at each timepoint, with internal box plots representing the median and interquartile range.

Fig. 2. Test set PTSD severity and predictive accuracy over time. Note: Error bars represent 95% confidence intervals.

Table 3. Longitudinal updating models comparison

Table 4. External validation results

Supplementary material: Smith and Held supplementary material (Tables S1–S5 and Figure S1), file, 24.5 KB.