318 Discovering Subgroups with Supervised Machine Learning Models for Heterogeneity of Treatment Effect Analysis

Published online by Cambridge University Press: 03 April 2024

Edward Xu (DePaul University)
Joseph Vanghelof (Rush University Medical Center)
Daniela Raicu (DePaul University)
Jacob Furst (DePaul University)
Raj Shah (Rush University Medical Center)
Roselyne Tchoua (DePaul University)

Abstract

OBJECTIVES/GOALS: The goal of the study is to provide insight into the use of machine learning methods to predict heterogeneity of treatment effect (HTE) in participants of randomized clinical trials.

METHODS/STUDY POPULATION: Using data from 2,441 participants enrolled in the United States in the ASPirin in Reducing Events in the Elderly (ASPREE) randomized controlled trial of daily low-dose aspirin vs placebo, we developed multivariable risk prediction models for the composite outcome of dementia, disability, or death. We used two machine learning techniques, decision trees and random forests, to develop novel non-parametric outcome classifiers and generate risk-based subgroups. The comparator method was an extant semi-parametric proportional hazards predictive risk model. We then assessed HTE by examining the 5-year absolute risk reduction (ARR) of aspirin vs placebo in each risk subgroup.

RESULTS/ANTICIPATED RESULTS: In the random forest classifier, the ARR at 5 years in the highest risk quintile was 13.7% (95% CI 3.1% to 24.4%). For the semi-parametric proportional hazards model, the ARR in the highest risk quintile was 15.1% (95% CI 4.0% to 26.3%). These results were comparable and provide evidence of the viability of internally developed, parsimonious, non-parametric machine learning models for HTE analysis. The decision tree model showed greater uncertainty (5-year ARR = 17.0%, 95% CI -5.4% to 39.4% in the highest risk subgroup).

DISCUSSION/SIGNIFICANCE: None of the models detected significant HTE on the relative scale; there was substantial HTE on the absolute scale in three of the models. Treatment benefit on the absolute scale may be regarded as bearing greater clinical importance and may be present even in the absence of benefit on the relative scale.
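
As an illustration of the subgroup step described in the methods (stratify participants by predicted risk, then compare arms within each stratum), the following is a minimal Python sketch. It is not the authors' implementation. It assumes simple event proportions and a Wald interval for the risk difference, whereas a 5-year ARR in a trial with censoring would more likely be estimated from survival curves. The function names (`arr_with_ci`, `assign_quintiles`, `arr_by_quintile`) and the rank-based quintile rule are hypothetical choices made for this example.

```python
import math
from collections import defaultdict


def arr_with_ci(events_control, n_control, events_treated, n_treated, z=1.96):
    """Absolute risk reduction (control risk minus treated risk).

    Assumption: crude proportions and a Wald interval, ignoring censoring.
    Raises ZeroDivisionError if either arm is empty.
    """
    risk_c = events_control / n_control
    risk_t = events_treated / n_treated
    arr = risk_c - risk_t
    se = math.sqrt(
        risk_c * (1 - risk_c) / n_control + risk_t * (1 - risk_t) / n_treated
    )
    return arr, arr - z * se, arr + z * se


def assign_quintiles(predicted_risks):
    """Rank-based quintile labels: 0 = lowest predicted risk, 4 = highest."""
    n = len(predicted_risks)
    order = sorted(range(n), key=lambda i: predicted_risks[i])
    labels = [0] * n
    for rank, idx in enumerate(order):
        labels[idx] = rank * 5 // n
    return labels


def arr_by_quintile(predicted_risks, treated, event):
    """Return {quintile: (arr, ci_low, ci_high)} from per-participant lists.

    predicted_risks: model output per participant (any risk model)
    treated:         True for the treatment arm, False for the comparison arm
    event:           True if the outcome occurred
    """
    labels = assign_quintiles(predicted_risks)
    # Per quintile: [events_control, n_control, events_treated, n_treated]
    counts = defaultdict(lambda: [0, 0, 0, 0])
    for q, t, e in zip(labels, treated, event):
        offset = 2 if t else 0
        counts[q][offset] += int(e)
        counts[q][offset + 1] += 1
    return {q: arr_with_ci(*counts[q]) for q in sorted(counts)}
```

For example, `arr_with_ci(30, 100, 20, 100)` gives an ARR of 0.10 with an interval of roughly -0.02 to 0.22, which shows how a sizeable point estimate in a small subgroup can still carry an interval that crosses zero.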

Type
Informatics and Data Science
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is unaltered and is properly cited. The written permission of Cambridge University Press must be obtained for commercial re-use or in order to create a derivative work.
Copyright
© The Author(s), 2024. The Association for Clinical and Translational Science