Introduction
There is a need for comprehensive data resources on population health and disease in low- and middle-income countries, where a large proportion of the global burden of morbidity and mortality is located (Reference Wang, Naghavi, Allen, Barber, Bhutta and Carter1, Reference Vos, Allen, Arora, Barber, Bhutta and Brown2). Biomarker data form an essential component of such endeavours, allowing objective assessment of a wide range of disease-related indices, facilitating validation of self-reported information, and allowing for greater statistical power of analyses. Integration of biomarker data with information on health and lifestyle provides a powerful tool to enhance the scientific value of health research.
Large-scale surveys in low- and middle-income populations, such as the Demographic and Health Surveys, have previously included biomarker modules (3). However, these have often been restricted to a narrow range of measures from limited samples, with variable capacity for long-term storage and later analysis (3). Importantly, they are unable to follow up individuals over time. Health and demographic surveillance system (HDSS) sites offer a valuable opportunity for efficient, large-scale collection and analysis of biomarker data. They provide pre-existing infrastructure to facilitate biological sample collection, and the potential to link biomarker data longitudinally to historical and future measures. This linkage allows for a detailed view of disease development across the life course (Reference Power, Kuh and Morton4).
We undertook a biomarker feasibility study embedded within the South East Asia Community Observatory (SEACO) HDSS, which covers approximately 45 000 individuals in Segamat, Malaysia (Reference Partap, Young, Allotey, Soyiri, Jahan and Komahan5). The SEACO HDSS conducts annual enumeration of individuals, and has also undertaken a population-wide health survey in its catchment area, collecting questionnaire data and biophysical measurements (Reference Partap, Young, Allotey, Soyiri, Jahan and Komahan5). Through this study, we explored the feasibility of building upon the previous survey work conducted by SEACO to include biological sample collection. This feasibility study aimed to recruit approximately 200 individuals aged seven years and above to assess the preparedness of individuals and families to participate, and to establish the procedures for the collection, analysis and storage of biological samples within a predominantly rural community setting. Here, we outline the developments in the procedures and examine the outcomes of this study to determine the potential to create a large-scale biodata resource within the full HDSS population.
Methods
A detailed profile of the SEACO HDSS, including the HDSS development, structure, and data collections, is presented in a recent publication (Reference Partap, Young, Allotey, Soyiri, Jahan and Komahan5).
Sampling
Adult (aged 18 years and over) and child (aged 7–17 years) participants for this study were recruited from the SEACO HDSS (Reference Partap, Young, Allotey, Soyiri, Jahan and Komahan5). Stratified random sampling was performed at the household level using data from the most recent enumeration (completed in 2016), aiming to achieve comparable proportions of individuals of Malay, Indian, Chinese and Orang Asli (indigenous) ethnicity. Sampling therefore covered all enumerated households within the SEACO catchment area (approximately 1250 km2). SEACO has established strong community links through its community engagement strategy (Reference Allotey, Reidpath, Devarajan, Rajagobal, Yasin and Arunachalam6), and additional community awareness activities were undertaken to sensitise potential participants prior to this study.
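The household-level stratified random sampling described above can be illustrated with a minimal Python sketch. The source does not report stratum sizes or sampling fractions, so the equal allocation per ethnicity stratum, the function name `stratified_household_sample`, the household identifiers and the sampling frame below are all assumptions made for illustration only.

```python
import random
from collections import defaultdict


def stratified_household_sample(households, per_stratum, seed=0):
    """Draw up to `per_stratum` households at random from each ethnicity stratum.

    `households` is an iterable of (household_id, ethnicity) pairs.
    Equal allocation across strata is an assumption of this sketch.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    strata = defaultdict(list)
    for household_id, ethnicity in households:
        strata[ethnicity].append(household_id)

    sample = {}
    for ethnicity in sorted(strata):
        ids = sorted(strata[ethnicity])
        k = min(per_stratum, len(ids))  # small strata are taken in full
        sample[ethnicity] = rng.sample(ids, k)
    return sample


# Hypothetical sampling frame: the counts are invented, not taken from the HDSS.
ethnicities = ["Malay"] * 50 + ["Chinese"] * 30 + ["Indian"] * 15 + ["Orang Asli"] * 5
frame = [(f"HH{i:04d}", eth) for i, eth in enumerate(ethnicities)]

sample = stratified_household_sample(frame, per_stratum=10)
for ethnicity, ids in sample.items():
    print(ethnicity, len(ids))
```

Drawing the same number of households from each stratum oversamples the smaller groups, which is one simple way to aim for comparable proportions across ethnicities; other allocation schemes would serve the same purpose.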
Data and sample collection
Community-based data and sample collection was undertaken by two field teams between November 2016 and February 2017. Data were recorded on electronic tablets. Informed consent (adults) or informed assent with parental or guardian consent (children) was first obtained; individuals could only participate if they consented to providing all data and samples (Supplementary Methods). Following informed consent, along with questionnaire and biophysical data, capillary blood (via finger prick, for point-of-care glycated haemoglobin [HbA1c] measurement), and venous blood (four tubes from a single blood draw: up to 24 ml from adults, 12 ml from children; for serum, plasma and whole blood samples) were collected from participants. Hair and urine samples were also collected from adult participants. Following data and sample collection, participants were given their body mass index (BMI), blood pressure and point-of-care HbA1c results, and were referred to local clinics if these were above pre-determined cut-offs. One session of data and sample collection took approximately 40–50 minutes for adult participants and 30 minutes for children (see Supplementary Methods for further details on sample collection purposes and procedures).
Measures and statistical analysis
Study measures to evaluate scale-up
Literature on suitable measures or assessment frameworks to determine feasibility for population-based observational studies is scarce (Reference Knipe, Jayasumana, Siribaddana, Priyadarshana, Pearson and Gunnell7–Reference Gupta, Ranjit, Shrestha, Wong, Robinson and Shrestha10). We therefore identified and examined a range of study-related measures to gain a comprehensive picture of the potential for scale-up. This included indicators of efficiency, response from potential participants, feedback from participants, and completeness and quality of collected data and samples.
First, we summarised study operational data to assess operational efficiency and response to the study. This assessment included information on the number of days of data and sample collection; the number and demographic characteristics of households and individuals approached; proportions consenting, declining or absent; reasons for refusal among those declining participation; and post-study feedback among participating individuals. Study pace was calculated as the average number of participants recruited per working day. Differences in demographic characteristics between consenting and non-consenting individuals were assessed using Pearson's chi squared tests, or Fisher's exact tests where cell counts were less than five.
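The study pace calculation and the choice between the two tests can be sketched as follows. The study's analyses were performed in Stata; this Python version is an illustration only, restricted to 2 × 2 tables. The text does not say whether the "less than five" rule refers to observed or expected counts, so the sketch assumes observed counts. The function names and the example tables are hypothetical; only the figures of 203 participants and 48 working days come from the Results.

```python
import math


def study_pace(participants, working_days):
    """Average number of participants recruited per working day."""
    return participants / working_days


def pearson_chi2_2x2(table):
    """Pearson chi-squared statistic and P value (1 df, no continuity correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = rows[i] * cols[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))


def fisher_exact_2x2(table):
    """Two-sided Fisher's exact P value: sum of all tables no more likely than observed."""
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(x):  # hypergeometric probability of x in the top-left cell
        return math.comb(r1, x) * math.comb(r2, c1 - x) / math.comb(n, c1)

    p_observed = prob(a)
    support = range(max(0, c1 - r2), min(r1, c1) + 1)
    return sum(prob(x) for x in support if prob(x) <= p_observed * (1 + 1e-9))


def compare_groups(table):
    """Use Fisher's exact test if any observed cell is below five (assumed rule)."""
    if any(cell < 5 for row in table for cell in row):
        return "fisher", fisher_exact_2x2(table)
    return "chi2", pearson_chi2_2x2(table)[1]


print(round(study_pace(203, 48), 1))     # figures reported in the Results
print(compare_groups([[3, 1], [1, 3]]))  # hypothetical table with small cells
print(compare_groups([[10, 20], [20, 10]]))  # hypothetical table with larger cells
```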
Following this, we examined measures relating to quality and completeness of data and samples. We were particularly interested in measures relating to blood sample collection, availability of blood test data and availability of blood sample aliquots, as indicators of the success of sample collection, analysis and storage. We extracted relevant information from three datasets generated at the end of the study: (i) data recorded on the electronic questionnaire form, (ii) blood test results, and (iii) records of receipt, processing and aliquoting of biological samples at the central research laboratory. All three datasets were cleaned, merged and checked for consistency. The completeness of questionnaire data for each participant was assessed by examining a set of all questions and measurements collected from all participants. The number of participants with any questionnaire data, blood test data, collected samples and samples for storage (plasma, serum, whole blood and remnant cell aliquots, urine aliquots and hair samples) was examined, and differences by sex, ethnicity and obesity status were assessed. The number of participants with complete data and samples was similarly examined.
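As an illustration of the merging and completeness checks described above, the following Python sketch links three per-participant datasets by a shared identifier and counts participants with any versus complete data. The identifiers, field names and records are hypothetical; the actual variables and cleaning rules used in the study are not given in the text.

```python
def merge_by_participant(questionnaire, blood_tests, lab_records):
    """Link the three datasets on participant ID, keeping every ID seen in any of them."""
    all_ids = set(questionnaire) | set(blood_tests) | set(lab_records)
    return {
        pid: {
            "questionnaire": questionnaire.get(pid, {}),
            "blood_tests": blood_tests.get(pid, {}),
            "lab_records": lab_records.get(pid, {}),
        }
        for pid in sorted(all_ids)
    }


def unmatched_ids(questionnaire, other):
    """Consistency check: IDs in another dataset with no questionnaire record."""
    return sorted(set(other) - set(questionnaire))


def availability(merged, source, expected_fields):
    """Count participants with any, and with all, of the expected fields present."""
    any_n = complete_n = 0
    for record in merged.values():
        present = sum(1 for f in expected_fields if record[source].get(f) is not None)
        any_n += present > 0
        complete_n += present == len(expected_fields)
    return {"any": any_n, "complete": complete_n, "total": len(merged)}


# Hypothetical records for three participants.
questionnaire = {"P1": {"age": 34}, "P2": {"age": 9}, "P3": {"age": 61}}
blood_tests = {
    "P1": {"hba1c": 5.4, "total_cholesterol": 4.9},
    "P2": {"hba1c": 5.1, "total_cholesterol": None},
}
lab_records = {"P1": {"serum_aliquots": 2}, "P2": {"serum_aliquots": 1}}

merged = merge_by_participant(questionnaire, blood_tests, lab_records)
print(unmatched_ids(questionnaire, blood_tests))
print(availability(merged, "blood_tests", ["hba1c", "total_cholesterol"]))
```

Keeping every identifier in the merge, rather than only those present in all three datasets, makes participants without blood test data or stored aliquots visible in the counts instead of silently dropping them.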
Sociodemographic, lifestyle and risk factor data
Finally, sociodemographic characteristics of study participants and crude prevalence of key lifestyle, biophysical and blood-based risk factors in the population were examined; differences by sex were assessed using Pearson's chi squared or Fisher's exact tests (see Supplementary Methods for list of variables and corresponding definitions).
All data management and analyses were performed using Stata 14 (Statacorp, Texas).
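Although the analyses were run in Stata, the crude prevalence calculation can be illustrated with a short Python sketch in which participants with a missing measurement are excluded from the denominator. The cut-off value and the measurements below are placeholders; the study's own risk factor definitions are given in its Supplementary Methods and are not reproduced here.

```python
def crude_prevalence(values, cutoff):
    """Percentage of non-missing values at or above `cutoff`, with the N used.

    Missing measurements (None) are dropped from the denominator.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        return None, 0
    cases = sum(1 for v in observed if v >= cutoff)
    return 100 * cases / len(observed), len(observed)


# Hypothetical HbA1c values (%) with one missing measurement.
hba1c = [5.0, 7.1, None, 6.6, 5.9]
ASSUMED_CUTOFF = 6.5  # placeholder threshold, not the study's definition

prevalence, n = crude_prevalence(hba1c, ASSUMED_CUTOFF)
print(f"{prevalence:.1f}% of N = {n}")
```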
Ethical approvals
Ethical approval for the study was obtained from the Monash University Human Research Ethics Committee (CF16/471–2016000227), and approval for the receipt and analysis of linked anonymised data at the University of Cambridge was obtained from the University Human Biology Research Ethics Committee (HBREC.2017.04) (Supplementary Methods).
Results
Study measures to evaluate scale-up
Measures of study recruitment and response
Overall, 203 participants (161 adults, 42 children) were recruited into the biomarker feasibility study, close to half (49.5%) of those responding to an invitation to participate (Figure 1, Table 1). A notable proportion of houses was empty upon approach, either because the household had moved away (n = 107; 11.7%) or because household members were not at home (n = 383; 42.0%) (Figure 1). Recruitment and data and sample collection occurred over 48 working days, with an overall study pace of 4.2 participants per day (Supplementary Figures S1–S2). Among households providing reasons for refusal to participate (63.5%), the most common were fear of needles (24.3%) and lack of interest (16.2%) (Supplementary Table S1).
¹ Individuals aged seven years or above and covered in the most recent HDSS enumeration, who had not moved away or passed away since the most recent enumeration.
² Eligible individuals who were available to respond to an invitation to participate in the study at the time of visit.
Differences in distributions across categories of response were compared using Pearson's chi squared or Fisher's exact (cell counts < 5) tests.
A greater proportion of women (56%) versus men, individuals aged 50–59 years (70.1%) or 60 years and above (64.7%) versus younger individuals, and those of Orang Asli ethnicity (64.9% among adults, 70.5% among children) versus those of any other ethnicity were available during recruitment (Table 1). Of those available and subsequently invited, women (68.5%, P < 0.001) were more likely to consent to participate compared with men, whereas children (30.0%) and young adults (48.2%), and those of Malay ethnicity (adults: 41.3%, P < 0.001, children: 19.0%, P = 0.129) were less likely to consent, compared with older individuals or those of any other ethnicity (Table 1).
Of 170 (83.7%) participants providing post-study feedback, over 95% agreed with comments relating to a favourable experience, including comfort during questionnaire administration (99.4%), interest in the study results (100.0%), and willingness to encourage others to participate in the study (99.4%) (Supplementary Table S2).
Completeness and quality of data and samples
We then examined the availability of data and samples collected from participants. All participants had some available questionnaire information, with most having three or fewer missing variables (Table 2, Supplementary Tables S3-S4). At least one biological sample (capillary blood, venous blood, hair or urine) was collected at the anticipated quantity from all individuals (Table 2, Supplementary Table S5). Over 90% of participants had some blood test data, whilst approximately 70–80% had complete data (Table 2), with no systematic differences in data and sample availability by ethnicity (Supplementary Figures S3–S4).
EDTA: ethylene diamine tetra-acetic acid.
¹ At least one of: plain serum or EDTA (plasma) or EDTA (whole blood 1) or EDTA (whole blood 2).
² All of: plain serum and EDTA (plasma) and EDTA (whole blood 1) and EDTA (whole blood 2).
Given the potential to obtain detailed biomarker information from blood, the availability and quality of blood samples was of particular interest in this study. A capillary (finger-prick) blood sample was successfully collected from all participants, with successful point-of-care HbA1c measurement in almost all (99.0%) participants (Table 2). At least one venous blood sample of any volume was collected from over 90% of both adult and child participants, with 82.6% of adults and 95.2% of children having all four blood samples collected at any volume (Table 3; Supplementary Tables S6-S7). Notably, obese adults were less likely to have blood samples successfully collected (at least one blood sample at any volume: 100% among non-obese adults versus 79.5% among obese adults, P = 0.002) (Supplementary Table S8). Almost all collected blood samples were judged by the research laboratory to be of acceptable quality for processing, analysis and storage (Table 3; Supplementary Tables S6-S7). At least one storage aliquot was available from all collected and accepted blood samples among children, and over 96.2% of samples among adults (Supplementary Tables S9-S10).
Sociodemographic, lifestyle and risk factor data
In addition to a notable prevalence of lifestyle and biophysical risk factors, we found a high burden of blood-based cardiometabolic risk factors in this population. Close to one quarter of adults (23.8%) had elevated HbA1c, while 8.2% had elevated total cholesterol, 15.0% had low HDL cholesterol, and 38.1% had elevated triglycerides (Table 4). Risk factor prevalence was similarly high among children: 19.5% had elevated total cholesterol, 14.6% had low HDL cholesterol and 36.6% had elevated triglycerides (Table 4).
HbA1c: glycated haemoglobin. HDL: high-density lipoprotein.
Classification of all risk factors is described in the Supplementary Methods.
Differences in distributions between men and women or boys and girls were assessed using Pearson's chi squared or Fisher's exact (cell counts < 5) test.
N was reduced due to missing observations for the following measures: (1) Low fruit and vegetable consumption among girls (N = 18); (2) Overweight, obesity, central obesity and elevated waist to hip ratio and elevated HbA1c among women (N = 101); (3) Elevated HbA1c in girls (N = 18); (4) All cholesterol and triglyceride measures among girls (N = 18), men (N = 58) and women (N = 89).
¹ Measures for hypertension and elevated cholesterol prevalence included individuals who reported being told they had elevated blood pressure or cholesterol.
² HbA1c as measured at the point of care.
Discussion
Detailed, objective measures provided by biomarker information are fundamental to comprehensive data resources on population health and disease. In this study, we show the feasibility of biomarker collection within the context of the SEACO HDSS. Approximately half of invited individuals consented to participate in biological sample collection, with favourable participant feedback. Biological samples were collected from all participants. Outcome measures indicated that there was scope to increase study pace, and a need to improve blood sample collection from obese participants, both attainable through appropriate modifications to study design and training. A high prevalence of blood-based cardiometabolic risk factors was observed among both adult and child participants. These results indicate that creation of a large-scale biodata resource is both achievable and valuable in this population, with potential relevance to similar HDSS sites.
We demonstrate here that capitalising on existing HDSS frameworks to undertake biomarker collection is an efficient way to encourage community participation, and to enhance their value as data resources. We undertook biological sample collection by building upon the strong existing infrastructure, data, human and material resources, local knowledge and community and administrative links established by the SEACO HDSS (Reference Partap, Young, Allotey, Soyiri, Jahan and Komahan5). The proportion of consenting versus invited participants observed in this study is comparable to or greater than other large-scale biobank or biomarker collection studies based in high-income countries (Reference Fry, Littlejohns, Sudlow, Doherty, Adamska and Sprosen11, Reference Day, Oakes, Luben, Khaw, Bingham and Welch12). Participants were willing to provide both capillary and venous blood samples, with successful capillary blood collection for all participating individuals. Blood test data and storage aliquots were available for the majority of participants, indicating the successful establishment of procedures from sample collection to analysis and long-term storage. Data and sample collection took under an hour, and participants providing feedback responded favourably to the study. The community engagement strategy previously established by SEACO provided a mechanism through which individuals could raise and address concerns they had with participation in this study (Reference Allotey, Reidpath, Devarajan, Rajagobal, Yasin and Arunachalam6). Importantly, we have the capacity to link information obtained in this study with measures from both previous and future HDSS data collections, including later clinical outcomes, which will facilitate the creation of richer datasets that may be explored in future analyses.
Compared with the growing focus on feasibility studies for randomised clinical trials (Reference Tickle-Degnen13–Reference Lancaster, Campbell, Eldridge, Farrin, Marchant and Muller24), literature on operational outcomes of observational feasibility studies remains scarce, and restricted to a limited number of measures, such as the overall proportion of invited individuals ultimately participating (Reference Knipe, Jayasumana, Siribaddana, Priyadarshana, Pearson and Gunnell7–Reference Gupta, Ranjit, Shrestha, Wong, Robinson and Shrestha10). Few studies have directly assessed measures of sample collection feasibility, with none identified here that specifically examined blood sample collection (Reference Knipe, Jayasumana, Siribaddana, Priyadarshana, Pearson and Gunnell7, Reference Sy25). Here, we identified useful indicators relating to various aspects of study operation including sample collection, using these in the context of our study to obtain a clearer understanding of the feasibility of scale-up. Systematic assessment of such measures may be useful to researchers planning similar data and sample collections in other low- and middle-income populations.
While most outcomes assessed here indicated successful establishment of study operations, we identified two areas requiring improvement, which may be successfully addressed through simple modifications to study design and training. The first was the slow study pace relative to the number of field teams and the time taken per session of data and sample collection. This survey design-related issue was likely a result of the notable proportion of houses empty upon approach, due to outmigration or unavailability of household members at the time of recruitment. This, along with the predominantly rural setting and large sampling area, increased the travel time between houses with consenting individuals. More suitable methods of recruitment to improve study efficiency could include approaching sampled households in a separate recruitment drive to establish availability and willingness to participate, and to arrange convenient time windows for data and sample collection. The second was lower blood sample collection success among obese participants, an issue specific to biomarker collection which may be resolved by further directed training of study phlebotomists.
The proportion of participating individuals in this study, along with differential response to participation across demographic subgroups, may suggest implications for generalisability. Although the demographic profile of this study may not be fully representative of the wider population, analyses arising from this study have the capacity to produce internally valid results regarding aetiological relationships, with wider relevance to other populations (Reference Fry, Littlejohns, Sudlow, Doherty, Adamska and Sprosen11). Nonetheless, our observations indicate an opportunity to further improve recruitment strategies overall and across specific subgroups, in future data and sample collections.
The high burden of cardiometabolic risk factors observed in the current study population is consistent with previous findings from the SEACO HDSS (Reference Pell, Allotey, Evans, Hardon, Imelda and Soyiri26, Reference Partap, Young, Allotey, Sandhu and Reidpath27). Similar trends have been reported in other middle-income countries including those from Asia, and are thought to be a result of epidemiologic transitions occurring in these populations (Reference Omran28–Reference Pipatvanichgul, Hanchaiphiboolkul, Puthkhao, Tantirittisak and Towanabut31). These observations reinforce the need for large-scale biomarker data from such populations to comprehensively assess disease risk and associated influences across the life course. We demonstrate here that existing HDSS resources can be successfully augmented to achieve this purpose.
We present a study undertaken within a specific context, with basic infrastructure and resources already in place through the SEACO HDSS and augmented by collaborating institutions. Given our context and particular interests, we made specific choices regarding study design, including biological samples of interest, consent structure, the collection of non-fasting blood samples, and test result feedback and onward referral of participants. Researchers planning biomarker collections in other settings must consider their specific contexts and aims to inform decisions relating to suitable study design. Importantly, the measures presented here may be applicable and useful to understanding the feasibility of such biomarker collections regardless of exact study methodology.
To conclude, we show that biological sample collections to create biodata resources using existing HDSS frameworks are feasible. Using this approach, we identify a potentially high burden of cardiometabolic risk factors that requires further evaluation in this population. Building upon existing HDSS resources in this way would greatly enhance their scientific value, and contribute towards addressing the need for comprehensive biomarker data from low- and middle-income populations.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/gheg.2018.13.
Financial support
SEACO is funded by the office of the Vice Provost Research, Monash University Australia; the office of the Deputy Dean Research, Faculty of Medicine, Nursing and Health Sciences, Monash University Australia; the Monash University Malaysia Campus and the Jeffrey Cheah School of Medicine and Health Sciences. SEACO is an associate member of the INDEPTH network.
This work was supported by the Wellcome Trust (grant number 098051). MS is supported by the National Institute for Health Research Cambridge Biomedical Research Centre (UK).
Conflict of interest
None.
Ethical standards
The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.