318 Discovering Subgroups with Supervised Machine Learning Models for Heterogeneity of Treatment Effect Analysis

Edward Xu; Joseph Vanghelof; Daniela Raicu; Jacob Furst; Raj Shah; Roselyne Tchoua

doi:10.1017/cts.2024.288


Part of: JCTS_2024_ABSTRACT_COLLECTION

Published online by Cambridge University Press: 03 April 2024


Edward Xu: Affiliation:
DePaul University
Joseph Vanghelof: Affiliation:
Rush University Medical Center
Daniela Raicu: Affiliation:
DePaul University
Jacob Furst: Affiliation:
DePaul University
Raj Shah: Affiliation:
Rush University Medical Center
Roselyne Tchoua: Affiliation:
DePaul University


Abstract


OBJECTIVES/GOALS: This study aims to provide insight into the use of machine learning methods to predict heterogeneity of treatment effect (HTE) among participants in randomized clinical trials.

METHODS/STUDY POPULATION: Using data from 2,441 participants enrolled in the United States in the ASPirin in Reducing Events in the Elderly (ASPREE) randomized controlled trial of daily low-dose aspirin vs placebo, we developed multivariable risk prediction models for the composite outcome of dementia, disability, or death. We used two machine learning techniques, decision trees and random forests, to develop novel non-parametric outcome classifiers and generate risk-based subgroups. The comparator method was an extant semi-parametric proportional hazards predictive risk model. We then assessed HTE by examining the 5-year absolute risk reduction (ARR) of aspirin vs placebo in each risk subgroup.

RESULTS/ANTICIPATED RESULTS: For the random forest classifier, the ARR at 5 years in the highest risk quintile was 13.7% (95% CI 3.1% to 24.4%). For the semi-parametric proportional hazards model, the ARR in the highest risk quintile was 15.1% (95% CI 4.0% to 26.3%). These estimates were comparable and provide evidence of the viability of internally developed parsimonious non-parametric machine learning models for HTE analysis. The decision tree model (5-year ARR 17.0%, 95% CI -5.4% to 39.4% in the highest risk subgroup) exhibited more uncertainty, with a confidence interval that spans zero.

DISCUSSION/SIGNIFICANCE: None of the models detected significant HTE on the relative scale; there was substantial HTE on the absolute scale in three of the models. Treatment benefit on the absolute scale may be regarded as bearing greater clinical importance and may be present even in the absence of benefit on the relative scale.
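The risk-subgroup step described in the methods (rank participants by predicted risk, split them into quintiles, then compare aspirin with placebo inside each quintile) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`risk_quintiles`, `arr_with_ci`, `arr_by_quintile`) are hypothetical, the predicted risks are assumed to come from an already fitted model, and the ARR is computed from simple event proportions with a Wald confidence interval. The abstract does not state how its 5-year ARR and intervals were estimated; an analysis of trial data would more likely use time-to-event estimates that account for censoring.

```python
import math
from statistics import NormalDist


def risk_quintiles(risks):
    """Assign each predicted risk a quintile from 1 (lowest) to 5 (highest) by rank."""
    order = sorted(range(len(risks)), key=lambda i: risks[i])
    quintile = [0] * len(risks)
    for rank, idx in enumerate(order):
        quintile[idx] = rank * 5 // len(risks) + 1
    return quintile


def arr_with_ci(events_control, n_control, events_treated, n_treated, level=0.95):
    """Absolute risk reduction (control risk minus treated risk) with a Wald CI.

    Assumption: binary outcomes with complete follow-up, so censoring is ignored.
    """
    p_c = events_control / n_control
    p_t = events_treated / n_treated
    arr = p_c - p_t
    se = math.sqrt(p_c * (1 - p_c) / n_control + p_t * (1 - p_t) / n_treated)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return arr, arr - z * se, arr + z * se


def arr_by_quintile(risks, treated, events):
    """Return {quintile: (arr, lower, upper)} for each quintile with both arms present."""
    quintile = risk_quintiles(risks)
    results = {}
    for q in range(1, 6):
        rows = [i for i in range(len(risks)) if quintile[i] == q]
        ctrl = [i for i in rows if not treated[i]]
        trt = [i for i in rows if treated[i]]
        if not ctrl or not trt:
            continue  # ARR is undefined without both arms in the subgroup
        results[q] = arr_with_ci(
            sum(events[i] for i in ctrl), len(ctrl),
            sum(events[i] for i in trt), len(trt),
        )
    return results
```

Calling `arr_by_quintile(risks, treated, events)` with three equal-length lists (predicted risk, a treatment flag, and a 0/1 outcome per participant) returns one ARR estimate and interval per quintile. An interval that includes zero, as reported for the decision tree subgroup, means the data are compatible with no absolute benefit in that subgroup.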

Type: Informatics and Data Science
Information: Journal of Clinical and Translational Science, Volume 8, Issue s1, April 2024, pp. 97-98

DOI: https://doi.org/10.1017/cts.2024.288
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is unaltered and is properly cited. The written permission of Cambridge University Press must be obtained for commercial re-use or in order to create a derivative work.
