The Health of the Nation Outcome Scales (HoNOS; Wing et al, 1998) were developed to measure health outcomes in response to the then UK government's target to “improve significantly the health and social functioning of mentally ill people” (Department of Health, 1992). HoNOS was developed to become a standardised assessment tool to be used routinely by all mental health practitioners (Wing et al, 1996). Indeed, HoNOS has recently been formally adopted for use with major Care Programme Approach reviews, and one-third of English trusts surveyed between October 1997 and May 1998 were using HoNOS routinely in one or more service settings (Wing et al, 2000).
HoNOS and psychological therapy services
Field trials in the development of HoNOS focused on adults with severe mental illness attending in-patient and community psychiatry services. The developers of HoNOS state that it is acceptable, clinically useful, reliable, sensitive to change and useful for administration and planning within such settings (Wing et al, 1998). Recent independent studies, however, have questioned its reliability (Orrell et al, 1999), sub-scale structure (Trauer, 1999), sensitivity to change (Trauer et al, 1999) and appropriateness for routine clinical use in busy psychiatry services (Bebbington et al, 1999; Sharma et al, 1999). Versions of HoNOS have been developed for the specialities of child psychiatry (HoNOS for Children and Adolescents: Gowers et al, 1999) and psychiatry services for the elderly (HoNOS 65+: Burns et al, 1999), but there is currently no specific version of HoNOS for use in psychotherapy and psychological treatment services.
The above would suggest the need for more evidence on the performance of HoNOS within psychotherapy and psychological treatment settings before it is espoused as the standard outcome measure for routine use by all mental health practitioners. Three specific questions arise. First, does HoNOS provide an adequate baseline assessment of psychotherapy patients' problems? Second, is HoNOS sensitive to changes following treatment? And third, is HoNOS appropriate for informing and guiding service development? These questions were addressed via practitioners belonging to a practice research network (PRN) under the auspices of the Society for Psychotherapy Research (SPR).
METHOD
Setting
The SPR PRN is a multi-site initiative involving a group of psychotherapy practitioners and researchers in the north of England. Network practitioners collaborate to collect, share and utilise standardised clinical and service effectiveness information as part of routine practice (Audin et al, 2001). SPR PRN practitioners incorporated HoNOS into their evaluation instrumentation to test its feasibility for routine use in out-patient and community psychotherapy and psychological treatment services.
The SPR PRN consists of eight National Health Service (NHS) secondary and tertiary out-patient and community psychotherapy services based in Manchester, Bradford, Leeds, Wakefield, York, Liverpool, Preston and Sheffield. The SPR PRN is coordinated by the Psychological Therapies Research Centre at the University of Leeds, which is responsible for analysis and reporting of network data.
The SPR PRN uses a broad definition of psychotherapy as recommended by the NHS Executive's review of NHS psychotherapy services in England (Parry & Richardson, 1996). Psychotherapy covers tertiary specialist services, those linked to community mental health teams and those linked directly to primary care. The term also covers a wide range of modalities used by constituent practitioners, including psychodynamic-interpersonal, cognitive-behavioural, systemic and integrative.
Data sample
The mean age of the patient sample (n=1688) was 35.7 years (s.d.=0.6 years) and 60% were female. Almost a quarter (24%) of patients were unemployed and 28% were living alone at the time of the assessment. Almost one-third (31%) of patients assessed had previously received psychological treatment within a secondary-level community service. Over half the patient sample (53%) were referred by a general practitioner and over a quarter (26%) by a psychiatrist. The average waiting time from referral to assessment was 12 weeks.
Health of the Nation Outcome Scales
The 12 HoNOS items are each scored 0-4, yielding a total score in the range 0-48 (Wing et al, 1996). Item 8 (see Appendix) of HoNOS requires that practitioners identify one other mental and behavioural problem from a list of 10: phobic, anxiety, obsessive-compulsive, stress, dissociative, somatoform, eating, sleep, sexual, and other. This list appeared particularly relevant to the psychotherapy patient population but SPR PRN practitioners felt limited by the necessity to identify only one problem. We therefore amended HoNOS to allow practitioners to rate the main other mental and behavioural problem but, in addition, also to identify up to four supplementary problems from the list.
Representatives from each service attended a HoNOS training session provided by staff from the Research Unit of the Royal College of Psychiatrists. The representatives then shared their training information with their colleagues in each service. All participating therapists were also provided with a HoNOS glossary containing rating instructions. Practitioners completed HoNOS at two time points: at first assessment and at discharge. HoNOS was completed at first assessment to assess its suitability as a profiling tool, and at discharge to test its ability to provide outcome information.
HoNOS data sets
Within the eight participating NHS out-patient and community psychotherapy services, 65 psychotherapists provided HoNOS assessment data and 55 therapists provided discharge data. The same therapist rated HoNOS at assessment and discharge for 239 (66%) cases. Assessment data were recorded for 1688 patients, with HoNOS being fully complete for 76% of these cases. A total of 1166 patients were accepted for therapy, and discharge forms were completed for 362 (31%) of these patients. Fully complete pre- and post-HoNOS data were recorded for 208 patients (18%). A further 85 patients with discharge data only were not included in analysis.
Additional clinical and service information
HoNOS was incorporated into an existing evaluation protocol, designed by the network practitioners to gather clinical and service data. At assessment, additional information regarding waiting times from referral to assessment, patient demographics, previous and current therapy, reason for referral (qualitative data) and assessment outcome was also recorded. At discharge, additional data concerning waiting times for therapy, number of sessions attended, therapy type, modality and frequency, therapy ending, benefits of therapy (qualitative data) and follow-up arrangements were collected. These data are not reported in detail in this paper.
RESULTS
Assessment
Practitioners completed HoNOS after an average of one assessment. Figure 1 shows the distribution of ratings for each HoNOS item at assessment using boxplots (Norusis, 1992). (For reference, HoNOS item descriptors are given in the Appendix.)
The data shown from HoNOS are not ideal for this form of display, as there are no intermediate values between the integers. Ratings for items 3, 4, 5, 6 and 11 are extremely limited, with all ratings other than 0 identified as outliers. Ratings for items 1, 2, 10 and 12 are limited, having a median rating of 0 and a highest rating of 2, excluding outliers, and could be used only for very limited purposes. Items 7, 8 and 9, however, show reasonable distributions, with the middle 50% of ratings ranging from 1 to 3, and having lowest and highest values of 0 and 4 respectively. Because of missing data, n for each item presented in Figure 1 varies. All analyses were repeated only with patients having a fully complete HoNOS (n=1279), but no substantial differences were found by restricting the sample in this way.
Table 1 shows the mean total ratings for the four HoNOS sub-scales, A, B, C and D, and shows a clear rank order with Symptomatic Problems being the highest (owing to the high ratings of items 7 and 8), followed by Social Problems, then Behavioural Problems and finally Impairment. The mean total HoNOS rating was 8.93 (s.d.=5.21).
HoNOS sub-scale | Mean total | s.d. | Min. | Max. | n |
---|---|---|---|---|---|
A Behavioural problems | 1.57 | 1.92 | 0.0 | 9.0 | 1644 |
B Impairment | 0.70 | 1.20 | 0.0 | 7.0 | 1637 |
C Symptomatic problems | 3.84 | 2.06 | 0.0 | 11.0 | 1656 |
D Social problems | 2.94 | 2.30 | 0.0 | 13.0 | 1633 |
Table 2 shows frequencies for the Other Mental and Behavioural Problems identified in item 8. The frequencies illustrate that allowing raters to identify only one main Other Mental and Behavioural Problem restricts choice and leads to misleading profile data. When raters are permitted to identify additional problems to reflect the co-presentation of symptoms, the number of patients identified as displaying those problems rises considerably. For example, the number of patients identified as having sleep problems rises from 2.7% when only a single problem can be identified, to 16.6% when raters are permitted to identify co-presenting problems.
Other mental and behavioural problem | Main problem, n | Main problem, % | Main problem + additional problems, n | Main problem + additional problems, % |
---|---|---|---|---|
Phobic | 111 | 6.6 | 218 | 12.9 |
Anxiety | 552 | 32.7 | 773 | 45.8 |
Obsessive-compulsive | 76 | 4.5 | 120 | 7.1 |
Stress | 144 | 8.5 | 368 | 21.8 |
Dissociative | 35 | 2.1 | 150 | 8.9 |
Somatoform | 45 | 2.7 | 169 | 10.0 |
Eating | 78 | 4.6 | 261 | 15.5 |
Sleep | 46 | 2.7 | 280 | 16.6 |
Sexual | 58 | 3.4 | 215 | 12.7 |
Other | 90 | 5.3 | 208 | 12.3 |
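The percentages in Table 2 can be recomputed from the counts as a quick check. The sketch below is illustrative only: it assumes the denominator is the full assessment sample (n=1688), which is not stated explicitly alongside the table but does reproduce the published figures, and the names `ASSESSMENT_N`, `counts` and `percent` are ours.

```python
# Counts transcribed from Table 2: (main problem only, main + additional problems).
# ASSESSMENT_N is an assumption: the full assessment sample reported in the Method.
ASSESSMENT_N = 1688

counts = {
    "Phobic": (111, 218),
    "Anxiety": (552, 773),
    "Obsessive-compulsive": (76, 120),
    "Stress": (144, 368),
    "Dissociative": (35, 150),
    "Somatoform": (45, 169),
    "Eating": (78, 261),
    "Sleep": (46, 280),
    "Sexual": (58, 215),
    "Other": (90, 208),
}


def percent(n, total=ASSESSMENT_N):
    """Return n as a percentage of total, rounded to one decimal place."""
    return round(100 * n / total, 1)


# Print each problem's prevalence under the two recording rules.
for problem, (main_only, with_additional) in counts.items():
    print(f"{problem:22s} {percent(main_only):5.1f}% -> {percent(with_additional):5.1f}%")
```

For sleep problems, for example, this gives 2.7% under the single-problem rule and 16.6% when co-presenting problems are allowed.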
Discharge
Of the 1166 patients accepted for therapy, discharge forms were completed for 362 (31%). In order to establish whether the discharge subsample was significantly different from the full assessment sample, t-tests between the assessment means of both samples were conducted. The assessment mean total score for the discharge subsample was significantly higher than that for the full assessment sample (t=2.28, d.f.=361, P=0.023). Item 2 (t=-2.33, d.f.=346, P=0.021), item 7 (t=-4.21, d.f.=358, P<0.0005), item 8 (t=-2.86, d.f.=327, P=0.004) and item 12 (t=3.27, d.f.=334, P=0.001) showed differences at assessment between the full assessment and discharge populations. This means that the discharge subsample is not necessarily representative of the whole sample, as it excludes patients still in treatment, those seen for assessment only and those whose therapists did not complete discharge forms.
Figure 2 shows mean HoNOS item scores at assessment and at discharge for patients having both assessment and discharge ratings. The n for each item varies owing to missing data. Again, all analyses were repeated with cases having complete HoNOS data (all 12 items), and no substantial differences were found by being selective in this way. A reduction in mean scores can be seen for all items, with the greatest reduction seen in items 7, 8 and 9. To reinforce Figure 2, mean assessment and discharge scores are displayed in Table 3 along with change scores, confidence intervals, outcome effect sizes and t-test results. Items 4, 5 and 11 show the least change. Outcome (i.e. pre-post) effect sizes are a useful means of interpreting the extent of change between assessment and discharge scores, and are calculated by dividing the change score by the standard deviation of the assessment score. Change on the full HoNOS is statistically significant (t=13.18, d.f.=308, P<0.0005) with a moderate pre-post effect size of 0.69 standard deviation units. Only items 4, 5 and 11 showed change scores that were not significant at the P<0.05 level.
Item | n | Assessment mean (s.d.) | Discharge mean (s.d.) | Change score | 95% CI of change | Paired-samples t-test | Outcome effect size |
---|---|---|---|---|---|---|---|
1 | 291 | 0.73 (0.92) | 0.40 (0.71) | 0.33 | 0.219 to 0.427 | t=6.14, d.f.=290, P<0.0005 | 0.36 |
2 | 293 | 0.73 (1.09) | 0.30 (0.70) | 0.43 | 0.320 to 0.547 | t=7.52, d.f.=292, P<0.0005 | 0.39 |
3 | 286 | 0.40 (0.86) | 0.30 (0.74) | 0.10 | 0.017 to 0.179 | t=2.37, d.f.=285, P=0.019 | 0.12 |
4 | 292 | 0.26 (0.66) | 0.20 (0.61) | 0.07 | -0.016 to 0.147 | t=1.57, d.f.=291, P=0.117 | 0.09 |
5 | 289 | 0.53 (1.0) | 0.46 (1.0) | 0.07 | -0.024 to 0.169 | t=1.48, d.f.=288, P=0.139 | 0.07 |
6 | 295 | 0.12 (0.52) | 0.06 (0.35) | 0.06 | 0.011 to 0.111 | t=2.43, d.f.=294, P=0.016 | 0.12 |
7 | 301 | 2.22 (1.02) | 1.13 (1.06) | 1.09 | 0.953 to 1.23 | t=15.71, d.f.=300, P<0.0005 | 1.01 |
8 | 254 | 2.34 (1.26) | 1.32 (1.16) | 1.02 | 0.858 to 1.173 | t=12.69, d.f.=253, P<0.0005 | 0.81 |
9 | 291 | 2.00 (1.22) | 1.48 (1.16) | 0.52 | 0.389 to 0.656 | t=7.70, d.f.=290, P<0.0005 | 0.43 |
10 | 286 | 0.56 (0.97) | 0.40 (0.78) | 0.16 | 0.052 to 0.277 | t=2.97, d.f.=285, P<0.004 | 0.16 |
11 | 282 | 0.22 (0.52) | 0.18 (0.43) | 0.04 | -0.011 to 0.096 | t=1.55, d.f.=281, P=0.122 | 0.08 |
12 | 283 | 0.40 (0.82) | 0.20 (0.53) | 0.20 | 0.109 to 0.286 | t=4.41, d.f.=282, P<0.0005 | 0.24 |
Total | 309 | 9.98 (5.54) | 6.15 (5.23) | 3.83 | 3.26 to 4.40 | t=13.18, d.f.=308, P<0.0005 | 0.69 |
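The outcome effect size defined above (change score divided by the standard deviation of the assessment score) can be illustrated with the rounded means and standard deviations from Table 3. This is a minimal sketch: the function name `outcome_effect_size` is ours, and because it works from rounded published values it may differ slightly from effect sizes computed on the raw data.

```python
def outcome_effect_size(assessment_mean, discharge_mean, assessment_sd):
    """Pre-post effect size: change score divided by the assessment s.d."""
    change_score = assessment_mean - discharge_mean
    return change_score / assessment_sd


# Total HoNOS score, using the rounded values reported in Table 3.
total_es = outcome_effect_size(9.98, 6.15, 5.54)
# Item 8 (Other Mental and Behavioural Problems).
item8_es = outcome_effect_size(2.34, 1.32, 1.26)

print(round(total_es, 2))  # 0.69
print(round(item8_es, 2))  # 0.81
```

A positive value indicates a reduction in rated problems between assessment and discharge, since lower HoNOS ratings indicate fewer or less severe problems.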
Assessment and discharge total scores for each patient can also be summarised in terms of reliable and clinically significant change (Evans et al, 1998). Reliable change (i.e. change that is not due to chance or measurement error) was calculated as any change greater than 1.96 times the standard error of difference. The threshold for clinical change was calculated as the mean total assessment score plus the mean total discharge score, halved. More sophisticated methods of determining clinical cut-off points which take into account clinical and non-clinical norms are available (Evans et al, 1998), but lack of HoNOS normative data in non-clinical and well populations prevented use of such methods.
Table 4 summarises the reliable and clinically significant change results. For over half the sample (52%), HoNOS ratings did not change to either a statistically reliable or a clinically significant extent. Almost a quarter (24%) of patients' scores reduced to an extent that indicated both reliable and clinically significant improvement, and only 2% showed both reliable and clinically significant deterioration.
Clinical change | Reliable deterioration | No reliable change | Reliable improvement | Total |
---|---|---|---|---|
Clinical deterioration | 7 (2%) | 5 (2%) | 0 (0%) | 12 (4%) |
No clinical change | 5 (2%) | 161 (52%) | 35 (11%) | 201 (65%) |
Clinical improvement | 0 (0%) | 23 (7%) | 73 (24%) | 96 (31%) |
Total | 12 (4%) | 189 (61%) | 108 (35%) | 309 (100%) |
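The two criteria behind Table 4 can be sketched in code. The sketch rests on several assumptions that are not taken from this study: the standard error of difference is computed with the commonly used formula s.d. × √2 × √(1 − reliability), the reliability value of 0.8 is purely hypothetical (no reliability coefficient is reported here), and clinical change is taken to mean crossing the cut-off between assessment and discharge. The function names are ours.

```python
import math


def clinical_cutoff(assessment_mean, discharge_mean):
    """Cut-off as described in the text: the two total means, halved."""
    return (assessment_mean + discharge_mean) / 2


def reliable_change_threshold(assessment_sd, reliability):
    """1.96 times the standard error of difference.

    Assumes SE_diff = s.d. * sqrt(2) * sqrt(1 - reliability); the
    reliability coefficient must be supplied by the user.
    """
    se_difference = assessment_sd * math.sqrt(2) * math.sqrt(1 - reliability)
    return 1.96 * se_difference


def classify(pre, post, cutoff, rc_threshold):
    """Return (reliable change, clinical change) labels for one patient."""
    change = pre - post  # positive = improvement (lower HoNOS is better)
    if change > rc_threshold:
        reliable = "reliable improvement"
    elif change < -rc_threshold:
        reliable = "reliable deterioration"
    else:
        reliable = "no reliable change"

    # Assumption: clinical change means crossing the cut-off.
    if pre >= cutoff and post < cutoff:
        clinical = "clinical improvement"
    elif pre < cutoff and post >= cutoff:
        clinical = "clinical deterioration"
    else:
        clinical = "no clinical change"
    return reliable, clinical


cutoff = clinical_cutoff(9.98, 6.15)  # about 8.07, from the Table 3 totals
threshold = reliable_change_threshold(5.54, reliability=0.8)  # hypothetical reliability

print(classify(14, 5, cutoff, threshold))
print(classify(9, 8, cutoff, threshold))
```

Under these assumptions a patient moving from 14 to 5 meets both criteria, whereas a patient moving from 9 to 8 crosses the cut-off without changing reliably, which corresponds to the 'no reliable change, clinical improvement' cell of Table 4.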
DISCUSSION
Does HoNOS provide an adequate assessment of psychotherapy patients' problems?
In response to this first question, our results suggest that the answer is at best equivocal and at worst negative. Analysis of HoNOS assessment ratings suggests that this instrument does not provide adequate coverage of the range of problems presented by psychotherapy patients. Extremely low assessment ratings for all items except items 7, 8 and 9 indicate that the majority of HoNOS items are irrelevant and inappropriate for an out-patient psychotherapy population. This reflects the overriding importance of HoNOS being useful in the enduring, severe mental illness category. These patients are likely to have complex care plans, reflecting difficulties in social and occupational functioning, although the severity of disorder itself may be comparable to that of out-patients treated in psychotherapy services.
Conversely, important problem areas such as self-esteem and post-traumatic stress disorder, which are highly relevant to psychotherapy patients, are not included. Items 7, 8 and 9 (Problems with Depressed Mood, Other Mental and Behavioural Problems, and Problems with Relationships) do provide a reasonable level of profiling at assessment, with the Other Mental and Behavioural Problems in item 8 being particularly pertinent to psychotherapy patients. However, item 8 results show that there is a clear need for psychotherapists to be able to record co-presenting problems, rather than only the ‘main’ problem, as in the current HoNOS guidelines. When practitioners are able to identify more than one problem from the list of 10, item 8 becomes much more useful for assessment profiling. Anxiety was the most prevalent problem in item 8, being identified as an issue for almost half the sample (45.8%). It could be argued that anxiety should be a separate item which practitioners could rate more fully on the 0-4 scale. The inadequacy of item 8 to record sufficient detail regarding Other Mental and Behavioural Problems has also been reported in a recent validation study on HoNOS with adult psychiatric patients (McClelland et al, 2000).
The high numbers of missing data further limit the ability of HoNOS to provide assessment profile information. Missing data indicate that the information required for the rating was either not known, not applicable, or that the practitioner was unsure how to rate because of difficulty in understanding the guidelines. The HoNOS guidelines may be a contributing factor to the low ratings, as the wording leans towards definitions drawn from patients who are likely to have complex care plans. The definition of problems relating to occupation and other activities in item 10, for example, is worded in such a way that few patients treated in out-patient settings might be expected to meet the criteria for having even mild problems in this area.
Is HoNOS sensitive to changes following treatment?
In response to the second question, our results suggest that HoNOS is severely limited in its ability to detect and record clinically meaningful change. Its insensitivity to change is evident for the majority of items, with the exception of items 7, 8 and 9. The minimal change for most items is due to low assessment ratings, leaving little scope for a further reduction in rating. Items 7, 8 and 9 do appear to be useful for identifying change, as shown by high effect sizes and t-values in Table 3. The value of the total score for showing change is uncertain. There is sufficient variability for the measure to be used, but cut-off scores between clinical and non-clinical reference populations are not available.
In terms of placing the present sample in comparison with other samples reported in the literature, the mean shown for the present sample can be compared to mean totals from other studies with psychiatric populations. A mean of 9.98 was reported in the original reference sample (Wing et al, 1998). Information is also available by diagnostic group, ranging from 8.44 for alcohol/drug-related disorders to 12.81 for borderline personality disorder in the Victorian field trial (Trauer et al, 1999). In the Camberwell evaluation of severe mental illness the mean total was 12.0 (Slade et al, 1999). However, in out-patient community samples, much lower means of 8.5 (Orrell et al, 1999) and 8.13 (McClelland et al, 2000) have been found. Hence, the mean value found in the psychotherapy and psychological treatment settings sampled in the present study was directly comparable to samples of psychiatric patients treated in the community but lower than the mean for samples with severe mental illness.
Is HoNOS appropriate for informing and guiding service development?
Regarding the third question, our findings suggest that ‘the jury is still out’. The use of HoNOS by the practitioners in the SPR PRN identified several limitations in its suitability for out-patient psychotherapy and psychological treatment services, particularly regarding its profiling ability and sensitivity to change. Consequently, it is limited in its capacity to inform and guide service delivery, for example, in prioritising referrals or aiding discharge. Ultimately, the suitability of HoNOS in its current form for informing psychotherapy service development is questionable.
Limitations of the current study
The current study was limited in that, while all practitioners received basic training in HoNOS completion, the level of interrater reliability was not formally established. The HoNOS field trials used mainly nurses and psychiatrists to test interrater reliability, with clinical psychologists accounting for 1.5% of those practitioners included. Psychotherapists were not included in the trials (Wing et al, 1999). Establishing interrater reliability would have provided useful information both on the reliability of the psychotherapists' ratings for this study, and as a sample of the profession more widely.
However, obtaining interrater reliability data is not particularly compatible with using HoNOS in everyday clinical practice: its utility will have to be accepted, or not, on the basis of how it is most likely to be used in routine practice. Second, the use of a second standardised measure alongside HoNOS would have provided useful information on the concurrent validity of HoNOS in psychotherapy populations, as well as providing comparison profile and change data. Third, data attrition meant that the discharge data sample was restricted in size to less than a third (31%) of patients accepted for therapy. Reasons for the low return of discharge forms include therapist forgetfulness, practicalities of completing the forms in busy practice settings, patients not attending or dropping out of therapy, long waiting times and long therapy contracts. Finally, time taken to complete HoNOS was not formally recorded in the current study but would have provided additional information on the appropriateness of HoNOS for routine use in psychotherapy settings.
Implications for the development of HoNOS
If HoNOS is to be taken on by psychotherapy services, further tests of its suitability need to be conducted. The limited range of field trials to date, focusing mainly on patients with severe mental illness, is recognised by the developers (Wing et al, 1998). The SPR PRN would welcome such trials, but would also strongly encourage the development of a specific psychotherapy version of HoNOS, or an alternative measure which taps into a wider range of psychological difficulties to allow accurate profiling and outcomes measurement for psychotherapy patients. As an interim measure we would recommend recording multiple responses under item 8 (Other Mental and Behavioural Problems), as recording only single items markedly reduces the apparent frequency of common presenting problems. We conclude that it is possible to record such supplementary data without altering the main instrument.
APPENDIX
A Behavioural problems
1 Overactive, aggressive, disruptive or agitated behaviour
2 Non-accidental self-injury
3 Problem-drinking or drug-taking
B Impairment
4 Cognitive problems
5 Physical illness or disability problems
C Symptomatic problems
6 Problems associated with hallucinations and delusions
7 Problems with depressed mood
8 Other mental and behavioural problems
D Social problems
9 Problems with relationships
10 Problems with activities of daily living
11 Problems with living conditions
12 Problems with occupation and activities
(0, no problem; 1, minor problem requiring no action; 2, mild problem but definitely present; 3, moderately severe problem; 4, severe to very severe problem)
Clinical Implications and Limitations
CLINICAL IMPLICATIONS
▪ For psychiatrists practising psychotherapy, Health of the Nation Outcome Scales (HoNOS) have restricted sensitivity as an instrument for sole use in routine practice.
▪ The restrictions imposed by the wording of item 8 of HoNOS can be ameliorated by allowing practitioners to rate the presence of additional symptoms and problems.
▪ Reliable and clinically significant change can be assessed, but only the total scores have enough variability to be used routinely.
LIMITATIONS
▪ No ratings of interrater reliability were made.
▪ High levels of data attrition occurred at follow-up.
▪ The use of integers on a 0-4 scale restricts the analyses possible at the item level.
ACKNOWLEDGEMENTS
Thanks are due to practitioners of the Society for Psychotherapy Research (Northern) UK Practice Research Network.