PP04 Assessing The Utility Of Natural Language Processing In Generating A Granular Estimated Indication For A Horizon Scanning Database
Published online by Cambridge University Press: 07 January 2025
Abstract
Detailed, precise information on a pharmaceutical’s projected therapeutic use is required for horizon scanning. Inferring an estimated indication from trial protocols is a key skill of horizon scanners. The International Horizon Scanning Initiative (IHSI) database utilizes semi-automated data collection. This pilot aimed to verify that the extraction of relevant word sets to generate an estimated indication could be semi-automated.
Ten drugs approved in Europe in 2021 were selected as the pilot test set. The test set included drugs approved for the treatment of rare diseases (n=4), haemato-oncology (n=3), and non-oncology conditions (n=3). Eight of the drugs were approved based on phase III trials. For each drug, the assessment comprised a review of the pivotal trial that supported product registration. We compared the granular tags generated by a human curator with those generated by a natural language processing (NLP) algorithm for three key aspects of the drugs’ estimated indication: stage of disease, patient-specific subgroup, and place in treatment.
In 50 percent of cases, the NLP accurately tagged a word or word set related to stage of disease, patient-specific subgroup, or place in treatment, which was also tagged by human curators. In 50 percent of cases, the NLP did not identify words or word sets tagged by human curators. Where relevant, the NLP successfully tagged the same word sets relating to stage of disease for all drugs in the test set. The same word sets relating to patient-specific subgroup were successfully tagged for three drugs in the set. NLP successfully tagged word sets relating to place in treatment for two drugs.
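The comparison described above can be illustrated with a minimal sketch of how agreement between curator tags and NLP tags might be scored per category. This is not the authors' implementation: the function names (`normalize`, `tag_recall`, `overall_recall`), the category keys, and the example tags are all hypothetical, and exact matching after simple normalization is an assumed matching rule.

```python
from typing import Dict, Set

# Hypothetical category keys for the three aspects named in the abstract.
CATEGORIES = ("stage_of_disease", "patient_subgroup", "place_in_treatment")


def normalize(tag: str) -> str:
    """Lowercase a tag and collapse whitespace so trivial variants match."""
    return " ".join(tag.lower().split())


def tag_recall(human: Dict[str, Set[str]], nlp: Dict[str, Set[str]]) -> Dict[str, float]:
    """Per category, the fraction of curator tags that the NLP also produced.

    Categories with no curator tags are skipped, mirroring the abstract's
    "where relevant" qualifier.
    """
    result: Dict[str, float] = {}
    for category in CATEGORIES:
        human_set = {normalize(t) for t in human.get(category, set())}
        nlp_set = {normalize(t) for t in nlp.get(category, set())}
        if not human_set:
            continue
        result[category] = len(human_set & nlp_set) / len(human_set)
    return result


def overall_recall(human: Dict[str, Set[str]], nlp: Dict[str, Set[str]]) -> float:
    """Fraction of all curator tags, across categories, matched by the NLP."""
    matched = 0
    total = 0
    for category in CATEGORIES:
        human_set = {normalize(t) for t in human.get(category, set())}
        nlp_set = {normalize(t) for t in nlp.get(category, set())}
        matched += len(human_set & nlp_set)
        total += len(human_set)
    return matched / total if total else 0.0


# Invented example tags for a single hypothetical drug (not data from the pilot).
human_tags = {
    "stage_of_disease": {"Relapsed or refractory"},
    "patient_subgroup": {"adults", "BRAF V600E mutation"},
    "place_in_treatment": {"second line"},
}
nlp_tags = {
    "stage_of_disease": {"relapsed or  refractory"},
    "patient_subgroup": {"adults"},
    "place_in_treatment": set(),
}

print(tag_recall(human_tags, nlp_tags))
print(overall_recall(human_tags, nlp_tags))
```

Scoring against the curator's tags treats the human curator as the reference standard, so the resulting figure corresponds to the sensitivity of the algorithm rather than its precision; tags produced only by the NLP are ignored in this sketch.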
The NLP algorithm can extract relevant word sets, which can be used to generate an estimated indication in an automated or semi-automated process. However, the pilot highlighted that further testing is required to improve the sensitivity of the algorithm. Further piloting exploring both unsupervised and supervised modeling approaches (named entity recognition and deep neural networks, respectively) is planned.
- Type
- Poster Presentations
- Information
- International Journal of Technology Assessment in Health Care, Volume 40, Special Issue S1: Abstracts from the HTAi 2024 Meeting in Seville, Spain, December 2024, pp. S54-S55
- Creative Commons
- This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
- Copyright
- © The Author(s), 2024. Published by Cambridge University Press