

The Journals of Gerontology Series A: Biological Sciences and Medical Sciences 59:M378-M384 (2004)
© 2004 The Gerontological Society of America
Case-Finding for Depression in Elderly People: Balancing Ease of Administration With Validity in Varied Treatment Settings
Karen Blank1,2,
Cynthia Gruman1,2 and
Julie T. Robison1,2
1 Braceland Center for Mental Health and Aging, Institute of Living, Hartford Hospital's Mental Health Network, Connecticut.
2 University of Connecticut School of Medicine, Farmington.
Address correspondence to Karen Blank, MD, MA, Braceland Center for Mental Health and Aging, Institute of Living, Hartford Hospital's Mental Health Network, 200 Retreat Ave., Hartford, CT 06106. E-mail: kblank@harthosp.org
Abstract
Background. Little is known about the performance of brief and ultrabrief (1- and 2-question) depression screens in older patients across varied treatment sites. This study (1) assesses their validity in clinics, hospitals, and nursing homes and (2) assesses cut-points for optimal clinical application.
Methods. 360 patients aged 60 years and older from 2 urban primary care practices (n = 125), 1 general hospital (n = 150), and 8 nursing homes (n = 85) were assessed using the Yale 1-question screen, the 2-question instrument derived from the Primary Care Evaluation of Mental Disorders, and long and short versions of the Center for Epidemiologic Studies Depression (CES-D) scale and Geriatric Depression Scale (GDS). Sensitivity and specificity were calculated for each screen compared with the criterion standard Diagnostic Interview Schedule (DIS) depression diagnosis and receiver operating characteristic curves generated.
Results. 9% of patients met DIS criteria for major depression and 7% for subsyndromal depression. Overall, the 10-item CES-D showed the best sensitivity/specificity for major depression in clinics (79%/81%) and hospitals (92%/77%), and the short GDS in nursing homes (86%/82%). Specificity of 1- and 2-question instruments was generally low. Established cut-points generally worked best for the short screens, while modifications were useful for longer versions.
Conclusions. Consideration of site of use is important in selecting brief case-finding instruments for late-life depression, with the 10-item CES-D working best in medical settings and the 15-item GDS in nursing homes.
THE need for improved detection and treatment of late-life depressive disorders by primary care practitioners has been well documented (1). Increasing time pressures and changes in the health care reimbursement system have increased interest in developing brief, easily administered depression case-finding instruments. While several instruments have been found to be sensitive and specific for detecting depression in general populations, less is known about their applicability to older persons in the variety of care settings in which they are treated (2). Still less is known about whether ultrabrief 1- and 2-question case-finding instruments can be useful when applied to the aging population (3-8). With the shorter case-finding instruments comes the challenge of identifying patients with depression while at the same time minimizing the number of misclassified cases (9).
Studies of the validity of brief depression case-finding instruments involving older patients have yielded inconsistent findings. Lyness and colleagues (10) found the Center for Epidemiologic Studies Depression Scale (CES-D) and Geriatric Depression Scale (GDS) (including a shortened GDS) useful for detecting major depression but not minor depression in older medical outpatients. Irwin (2) found that the 10-item CES-D showed excellent properties for detecting depression in an older community sample. However, the validity of the CES-D, specifically its ability to differentiate between depressive, anxiety, or other psychiatric disorders, has been called into question (11). In addition, the optimal CES-D cut-point has been a matter of debate (10,12). The GDS, in both its long and shortened form, has been recommended for use in the nursing home (13,14). Whooley found that the 2-question instrument, which assesses depressed mood and anhedonia (8), performed well for primary care patients (sensitivity/specificity 96%/57%, respectively), but this study included only 6 depressed patients older than 65 years (8). Likewise, the Yale 1-question screen, "Do you often feel sad or depressed?" yielded a 69% sensitivity and 90% specificity for depressive disorder, but the study's findings were limited by small sample size (6).
The specific aims of this study were 1) to compare the performance of brief depression case-finding instruments and to determine whether the 1- and 2-question instruments are useful in elderly patients; 2) to evaluate their performance in diverse sites including ambulatory medical clinics, hospitals, and nursing homes; 3) to determine whether modification of established cut-points improves case-finding for major depression and significant subsyndromal depression.
METHODS
Participants
Participants included patients in 2 primary care clinics, 1 general medical hospital, and residents in 8 nursing homes in Connecticut. Before participation, patients were fully informed of the protocol and provided written informed consent. Researchers approached all eligible people for study participation in each setting on the days of data collection (n = 407). Exclusion criteria included a Mini-Mental State Examination (MMSE) score of less than 24 (15), the presence of psychosis, and a high index of suspicion of alcoholism (CAGE) (16). Thirty-five patients were ineligible and 12 refused to participate. Three hundred sixty persons completed interviews: 125 clinic patients, 150 medically hospitalized patients, and 85 nursing home residents.
Measures
A structured interview of approximately 50 minutes' duration was administered to patients aged 60 years or older by 9 individuals who underwent standardized training in its administration (2 physicians, 2 psychologists, 2 social workers, and 3 advanced practice registered nurses). Patients began the protocol by writing their responses to the Yale 1-question ("Are you depressed?") (6) and the 2-question case-finding instrument: "During the past month, have you often been bothered by feeling down, depressed, or hopeless?" and "During the past month have you been bothered by little interest or pleasure in doing things?" (8). Interviewers were blinded to these results. Interviewers then administered the GDS (17) and the CES-D scale (18). Short versions of the GDS (15-item) and CES-D (10-item) (2) were extracted and scored during data analysis. The order of administration of case-finding instruments was rotated to eliminate ordering effects.
The mood sections of the Diagnostic Interview Schedule (DIS) were then administered to establish the presence of depression by the criterion standard diagnoses. The Quick-DIS (Q-DIS) was used to identify comorbid psychiatric illness (19). Medical illness burden was quantified upon medical chart review using the Cumulative Illness Rating Scale, Geriatric version (CIRS-G) (20). Patients rated their physical health and completed interviewer-administered Activities of Daily Living (ADL) and Instrumental Activities of Daily Living (IADL) measures (21,22). Demographic information included age, gender, marital status, race, years of education, type of housing, and living arrangement. The interview generated a global assessment of functioning score (GAF) (23) and contained a number of psychosocial measures, which are not presented in the current analyses. Patients who scored 21 or higher on the full-length CES-D were referred to their primary care provider for follow-up and further mental health evaluation. All data were coded and entered into Microsoft Access 97 (Redmond, WA). After cleaning and checking, the data were imported into and analyzed in SPSS 10.0 (SPSS, Inc., Chicago, IL) (24).
Analyses
The analytic approach of this study involved three stages. First, descriptive statistics characterized the study participants' sociodemographic, mental health, and health characteristics. Second, sensitivity and specificity for each of the 4 depression case-finding instruments (and the CES-D and GDS short versions) were calculated compared with DIS diagnoses of major depression. The entire analysis was repeated with depression broadened to encompass subsyndromal depression (DIS-diagnosed dysthymia and "symptoms falling short of major depression"). Third, each of the case-finding instruments was converted to its continuous or ordinal scale, and receiver operating characteristic (ROC) curves were generated to visualize the sensitivity and specificity of the depression scores (plotting sensitivity versus 1 - specificity). The area under the curve was measured to compare the diagnostic value of the depression case-finding instruments for each measure of depression. A test that correctly classifies all participants has an area of 1.0, and a test with no discriminatory value has an area of 0.5 or less. Optimal cut-point values determined by the ROC analysis for detection of major depression were derived using cumulative percentages of the scores for both the depressed and nondepressed groups.
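The calculations described above can be illustrated with a minimal Python sketch. This is not the authors' SPSS procedure: the function names, the rule that a screen is positive when the score is at or above the cut-point, and the ten toy scores are all assumptions made for illustration only and are not study data.

```python
def sensitivity_specificity(scores, depressed, cutpoint):
    """Return (sensitivity, specificity) for one cut-point.

    Assumption: a screen counts as positive when score >= cutpoint.
    `depressed` holds the criterion-standard labels (True = depressed).
    """
    tp = sum(1 for s, d in zip(scores, depressed) if d and s >= cutpoint)
    fn = sum(1 for s, d in zip(scores, depressed) if d and s < cutpoint)
    tn = sum(1 for s, d in zip(scores, depressed) if not d and s < cutpoint)
    fp = sum(1 for s, d in zip(scores, depressed) if not d and s >= cutpoint)
    return tp / (tp + fn), tn / (tn + fp)


def roc_points(scores, depressed):
    """Return sorted (1 - specificity, sensitivity) pairs, one per candidate cut-point."""
    # Every observed score, plus one value above the maximum so that
    # the "nobody screens positive" corner (0, 0) is included.
    cutpoints = sorted(set(scores)) + [max(scores) + 1]
    points = []
    for c in cutpoints:
        sens, spec = sensitivity_specificity(scores, depressed, c)
        points.append((1 - spec, sens))
    return sorted(points)


def area_under_curve(scores, depressed):
    """Area under the ROC curve by the trapezoidal rule."""
    pts = roc_points(scores, depressed)
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


# Hypothetical toy data (NOT from the study): 6 nondepressed, 4 depressed.
toy_scores = [2, 3, 4, 5, 8, 12, 9, 11, 14, 16]
toy_depressed = [False] * 6 + [True] * 4

sens, spec = sensitivity_specificity(toy_scores, toy_depressed, cutpoint=10)
auc = area_under_curve(toy_scores, toy_depressed)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc:.2f}")
```

Sweeping the cut-point over every observed score is what produces the ROC curve; an optimal cut-point can then be chosen from the resulting pairs according to the priorities of the setting.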
RESULTS
Sample Description
The 360 participants had a mean age of 77 years (range 60-96 years); 63% were female and 37% were male. The majority were widowed, divorced, or separated, lived with others, and were educated until 12th grade or beyond. Subjective health was rated as good to excellent by 59% and poor by 41%, and the CIRS-G severity index averaged 2 (range 0-4). Average functioning was generally fairly intact as indicated by an average GAF of 77 (SD [standard deviation] 14). The majority of patients (63%) were functional in all ADLs, while only one third (33%) were independent in all IADLs. Further demographic sample descriptions by site can be found in Table 1.
According to the criterion standard DIS, 9% of the total sample had a current major depressive disorder (MDD), and an additional 7% demonstrated subsyndromal depression. The rates for active major depression by site (single episode and recurrent) were 11% in the clinic, 8% in the hospital, and 9% in the nursing home. Subsyndromal depression was present in 10% of clinic patients, 3% of hospitalized patients, and 8% of nursing home residents (Table 2). Of the total sample, 161 participants (45%) met criteria for 1 or more additional psychiatric disorders, most often an anxiety disorder according to the Q-DIS.
Using recommended scoring cut-points for depression, the rates of positive case-finding instruments for the total sample ranged from a high of 45.6% (2-question instrument) to a low of 17.8% (CES-D 20-item) (data not shown). The sensitivity and specificity findings of each scale at each site compared to the DIS diagnosis are presented in Table 3 (clinic), Table 4 (hospital), and Table 5 (nursing home). The case-finding instruments' adequacy varied depending on the site. The CES-D and GDS showed equivalent sensitivity (79%) for detecting major depression in clinic populations, but the 10-item CES-D's specificity was greater (81%). The 10-item CES-D also performed better than the other instruments in the hospitalized population. In the nursing home, the shortened GDS performed best (86%/82% sensitivity/specificity).
The ultrabrief 1- and 2-question case-finding instruments also differed in case-finding properties depending on the site in which they were administered. Both ultrabrief case-finding instruments performed best with hospitalized patients with the Yale 1-question showing 83% sensitivity and specificity and the 2-question instrument showing 92% sensitivity and 54% specificity.
Cut-points were evaluated by ROC curve interpretation. Recommended cut-points for the detection of MDD by site are displayed in Tables 3-5. When the recommended cut-points were applied to the study population, results demonstrated that, among the clinic population, the long version of the GDS (standard cut-point of 10) incorrectly identified 37 of 125 patients as depressed. After adjusting the cut-point to 17, only 14 patients were incorrectly classified as depressed. Results were similar for the nursing home population. A modification of the cut-point from 17 to 13 on the GDS long version resulted in a decrease of misclassifications from 22 of 85 patients to 12 patients. Cut-point variations for the hospital population revealed the greatest variations for both the long GDS and the long CES-D, though in opposite directions. Shifting the cut-point on the GDS from 10 to 15 resulted in a decrease in misclassifications of depression from 31 to 10. In contrast, when modifications were applied to the CES-D long version (modifying the cut-point from 16 to 14), misclassifications of depression increased from 33 to 41 of 150 patients; however, correct classifications of depression improved from 9 to 12 patients. The shortened CES-D and GDS did not benefit from adjustments of their cut-points, with the exception of the 10-item CES-D in the clinic, where lowering the cut-point to 2 improved sensitivity (93%) but diminished specificity (68%).
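The hospital CES-D counts above can be turned into approximate sensitivity and specificity with a few lines of arithmetic. The sketch below rests on two assumptions that the text does not state outright: that the hospital sample contained 12 patients with MDD (the reported 8% of 150, rounded), and that "misclassifications of depression" are false positives while "correct classifications of depression" are true positives. The function name is illustrative.

```python
def rates_from_counts(true_pos, false_pos, n_depressed, n_total):
    """Convert classification counts into (sensitivity, specificity)."""
    n_nondepressed = n_total - n_depressed
    sensitivity = true_pos / n_depressed
    specificity = (n_nondepressed - false_pos) / n_nondepressed
    return sensitivity, specificity


N_HOSPITAL = 150
N_DEPRESSED = 12  # assumption: 8% of 150, rounded; not given directly in the text

# Long CES-D in the hospital, standard cut-point of 16 versus modified cut-point of 14.
at_16 = rates_from_counts(true_pos=9, false_pos=33, n_depressed=N_DEPRESSED, n_total=N_HOSPITAL)
at_14 = rates_from_counts(true_pos=12, false_pos=41, n_depressed=N_DEPRESSED, n_total=N_HOSPITAL)

for label, (sens, spec) in (("cut-point 16", at_16), ("cut-point 14", at_14)):
    print(f"{label}: sensitivity={sens:.0%} specificity={spec:.0%}")
```

Under these assumptions, lowering the cut-point gains 3 detected cases at the cost of 8 additional false positives, which is the trade-off the paragraph describes.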
When a broadened definition of depression was used, combining subsyndromal depression and MDD, the sensitivity of the instruments generally improved compared with their case detection for MDD alone. The instruments still showed different performance by site, but less so. The shortened CES-D and GDS were roughly equivalent for detection of combined subsyndromal depression and MDD in the clinic and hospital, and the GDS remained best in the nursing home (Table 6).
DISCUSSION
The two major findings of this study were that the validity of case-finding instruments varied according to the site in which they were applied, and that the 1- and 2-question screens' low specificities may limit their use. Selection of a case-finding instrument requires consideration of whether patients are in a medical setting (ambulatory or hospital) versus in a nursing home. A "one-size-fits-all" screen was not identified. This is likely due to high rates of psychiatric and medical comorbidities, the psychosocial/environmental effects of varied care settings, and other factors contributing to the enormous variability of older persons and their expression of depressive illness.
The shortened forms of the CES-D and the GDS showed the best test performance overall. Among the nursing home residents, the GDS 15-item showed the best combination of sensitivity and specificity. This supports Gerety and colleagues' findings of the superiority of the GDS over the CES-D in the nursing home population (13). Among the many possible explanations for the superiority of the GDS might be its de-emphasis of somatic symptoms and increased attention to subjective, psychological expressions (17). Perhaps because of nursing home residents' more chronic and disabling physical illnesses, weighing somatic complaints lacks sufficient uniqueness in distinguishing the nondepressed from depressed.
Despite the attractive brevity of the 1- and 2-question case-finding instruments, their general problems with specificity create drawbacks that limit their widespread application. While the 2-question instrument's sensitivity was comparable to most of the longer case-finding instruments, it was positive in nearly half of the total sample. The low specificity of both ultrabrief case-finding instruments (with the exception of the Yale 1-question in hospitalized patients) requires that more patients receive follow-up psychiatric evaluations. In those fortunate settings where resources are available, or in circumstances where rapid detection is essential, the 2-question tool may be a reasonable choice as a first-pass case-finding instrument. An argument could be made that either ultrashort screen would be a reasonable choice for hospitalized patients, where a 1- or 2-question screen might be most suitable for use at the bedside. The 2-question instrument showed superior sensitivity compared with the Yale 1-question, indicating that the addition of a question about anhedonia detects additional depressed patients, but at the cost of specificity.
Our findings demonstrate that setting is a central consideration in the selection of a screening instrument in a number of ways: the predicted prevalence of depression in the setting, the goals of screening, and the clinical resources of the setting all contribute to selection of scales and of their cut-points. Selecting a cut-off point results in a trade-off between sensitivity (making sure that depressed people are correctly identified, that false-negatives are reduced) and specificity (making sure that nondepressed patients score below the cut-off, that false-positives are reduced). Most clinical settings understandably seek to maximize sensitivity. From a clinical perspective, finding depressed persons and minimizing missed cases would appear to take priority, but it is often nearly as important to control the number of nondepressed persons who screen positive (minimize the false-positive rate) in environments where medical and psychiatric resources are strained. The base rate or prevalence of depression in a given population will influence the numbers of persons who score false-positive and false-negative: For example, in settings where depression is uncommon, fewer people will risk scoring as false-negative. The challenge particularly occurs in high prevalence settings, such as those in this study. Raising the cut-point has the desired effect of diminishing false-positives, thus diminishing resource demands, but comes at the cost of greater numbers of missed cases. The decision to choose a case-finding instrument cut-point for use in a given clinical setting requires that the goals of screening and the implications of missing clinical cases be balanced with resource considerations.
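The effect of prevalence on false-positive and false-negative counts can be made concrete with a short calculation. The sketch below applies the clinic figures reported for the 10-item CES-D (79% sensitivity, 81% specificity, 11% prevalence of major depression) to a notional 100 patients, and then repeats the calculation at a 5% prevalence; the 5% value, the cohort size of 100, and the function name are assumptions chosen only for comparison.

```python
def expected_counts(sensitivity, specificity, prevalence, n=100):
    """Expected screening outcomes for n patients at a given prevalence."""
    depressed = prevalence * n
    nondepressed = n - depressed
    true_pos = sensitivity * depressed
    false_neg = depressed - true_pos
    false_pos = (1 - specificity) * nondepressed
    # Positive predictive value: share of positive screens that are true cases.
    ppv = true_pos / (true_pos + false_pos)
    return {"true_pos": true_pos, "false_neg": false_neg, "false_pos": false_pos, "ppv": ppv}


# Reported clinic values for the 10-item CES-D.
clinic = expected_counts(sensitivity=0.79, specificity=0.81, prevalence=0.11)

# Hypothetical lower-prevalence setting (assumption, not from the study).
low_prev = expected_counts(sensitivity=0.79, specificity=0.81, prevalence=0.05)

for label, result in (("prevalence 11%", clinic), ("prevalence 5%", low_prev)):
    print(label, {k: round(v, 2) for k, v in result.items()})
```

With the same instrument and cut-point, the lower-prevalence setting yields fewer false-negatives but a smaller share of positive screens that are true cases, so most follow-up evaluations there would be spent on nondepressed patients.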
None of the case-finding instruments showed impressive concordance with the criterion standard DIS diagnosis of major depression. Even the case-finding instruments showing the best combinations of sensitivity and specificity generally did not show the desired sensitivity (i.e., 90%-100%) and would miss many depressed patients. There are many possible explanations for this finding. The limitations of case-finding instruments may be because they rely on self-report and lack clinician observational data, or because of flaws in the structure and/or choice of questions asked. Additionally, the low specificity may indicate that other psychopathology, especially anxiety, is driving the positive response rate (11,25). Our high rates of both psychiatric and medical comorbidity suggest that this may be contributing. In addition, the findings call into question some of the categorical divisions within Diagnostic and Statistical Manual (DSM) mood disorder diagnoses. Evidence increasingly suggests that MDD criteria may be too narrowly focused for elderly patients. Depressive symptoms not meeting criteria are prevalent in older populations and are associated with medical comorbidity, functional impairment, and responsiveness to pharmacologic treatment (12,26,27).
The study had several limitations. First, we narrowly defined subsyndromal depression by adhering to verifiable DIS DSM-IV diagnoses, recognizing that the research literature has not yet reached concordance on how to better define this clinical area. Another limitation is that we did not perform discrete tests of interrater reliability. However, all interviewers were trained using the same trainer and protocol, and the component parts of the structured interview have been shown to have good interrater reliability in the literature. Our finding regarding the superiority of the GDS for use in the nursing home may not be representative of the wider range of nursing home residents since we required participants to be relatively cognitively intact (MMSE ≥24). While this caveat applies to the total sample, the exclusion of cognitively impaired participants at nursing home sites produced a more highly selected, and thus possibly skewed, sample. Finally, the length of the interview, disclosed during the informed consent process, likely dissuaded some from participation, and this group may have included more depressed persons.
Case-finding instruments have generally proved useful and valid compared with formal diagnostic criteria and perform better than informal methods or clinical impressions (28). But it must be noted that case-finding tools were not developed to elicit clinical diagnoses but rather to identify patients with clinically significant symptoms requiring additional evaluation. It is unclear whether all older people in primary care settings should be routinely screened for depression; some research suggests that widespread screening does not improve outcome (29). The greatest benefit may result from case-finding among selected populations considered at risk for depressive disorder, but how to operationalize such a risk group remains to be investigated. Our data reveal that it is important to consider the site when selecting a case-finding instrument for geriatric patients. Additionally, questions are raised about the adequacy of current categorical diagnostic criteria against which these case-finding instruments are measured. Viewing depressive symptoms along a continuum may best benefit older patients in identifying clinically relevant symptoms. Diagnostic criteria for depressive illnesses, their derivative case-finding instruments, and their threshold cut-points must be further refined for optimal use in older patients.
Acknowledgments
Funded by a grant to Dr. Blank from the Institute of Living Department of Psychiatry (Grant #126027). An earlier version of this article was presented at the Gerontological Society of America's annual meetings in San Francisco, November 1999. The authors wish to thank Denise Fogel, MA, for her data collection efforts and meticulous data management and Julie Fenster, MPH, for statistical support.
Received January 16, 2003
Accepted February 21, 2003
References
1. Hirschfeld RM, Keller MB, Panico S, et al. The National Depressive and Manic-Depressive Association consensus statement on the undertreatment of depression. JAMA. 1997;277:333-340.
2. Irwin M, Artin KH, Oxman MN. Screening for depression in the older adult: criterion validity of the 10-item Center for Epidemiological Studies Depression Scale (CES-D). Arch Intern Med. 1999;159:1701-1704.
3. Brody DS, Hahn SR, Spitzer RL, et al. Identifying patients with depression in the primary care setting: a more efficient method. Arch Intern Med. 1998;158:2469-2475.
4. Chochinov HM, Wilson KG, Enns M, et al. "Are you depressed?" Screening for depression in the terminally ill. Am J Psychiatry. 1997;154:674-676.
5. Howe A, Bath P, Goudie F, et al. Getting the questions right: An example of loss of validity during transfer of a brief screening approach for depression in the elderly. Int J Geriatr Psychiatry. 2000;15:650-655.
6. Mahoney J, Drinka TJ, Abler R, et al. Screening for depression: single question versus GDS. J Am Geriatr Soc. 1994;42:1006-1008.
7. Rost K, Burnam MA, Smith GR. Development of screeners for depressive disorders and substance disorder history. Med Care. 1993;31:189-200.
8. Whooley MA, Avins AL, Miranda J, et al. Case-finding instruments for depression. Two questions are as good as many. J Gen Intern Med. 1997;12:439-445.
9. Leon AC, Portera L, Olfson M, et al. Diagnostic errors of primary care screens for depression and panic disorder. Int J Psychiatry Med. 1999;29:1-11.
10. Lyness JM, Noel TK, Cox C, et al. Screening for depression in elderly primary care patients. A comparison of the Center for Epidemiologic Studies-Depression Scale and the Geriatric Depression Scale. Arch Intern Med. 1997;157:449-454.
11. McQuaid JR, Stein MB, McCahill M, et al. Use of brief psychiatric screening measures in a primary care sample. Depress Anx. 2000;12:21-29.
12. Hybels CF, Blazer DG, Pieper CF. Toward a threshold for subthreshold depression: An analysis of correlates of depression by severity of symptoms using data from an elderly community sample. Gerontologist. 2001;41:357-365.
13. Gerety MB, Williams JW Jr, Mulrow CD, et al. Performance of case-finding tools for depression in the nursing home: influence of clinical and functional characteristics and selection of optimal threshold scores. J Am Geriatr Soc. 1994;42:1103-1109.
14. McGivney SA, Mulvihill M, Taylor B. Validating the GDS depression screen in the nursing home. J Am Geriatr Soc. 1994;42:490-492.
15. Folstein MF, Folstein SE, McHugh PR. "Mini-mental state". A practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res. 1975;12:189-198.
16. Ewing JA. Detecting alcoholism: the CAGE questionnaire. JAMA. 1984;252:1905-1907.
17. Yesavage JA, Brink TL, Rose TL, et al. Development and validation of a geriatric depression screening scale: a preliminary report. J Psychiatr Res. 1982;17:37-49.
18. Radloff LS. The Center for Epidemiological Studies Depression scale (CES-D): a self-report depression scale for research in the general population. Appl Psychol Meas. 1992;7:343-351.
19. Bucholz KK, Robins LN, Shayka JJ, et al. Performance of two forms of a computer psychiatric screening interview: version I of the DISSI. J Psychiatr Res. 1991;25:117-129.
20. Miller MD, Paradis CF, Houck PR, et al. Rating chronic medical illness burden in geropsychiatric practice and research: application of the Cumulative Illness Rating Scale. Psychiatry Res. 1992;41:237-248.
21. Katz S, Ford AB, Moskowitz RW, Jackson BA, et al. Studies of illness in the aged. The index of ADL: a standardized measure of biological and psychosocial function. JAMA. 1963;185:94-99.
22. Lawton MP, Brody EM. Assessment of older people: self-maintaining and instrumental activities of daily living. Gerontologist. 1969;9:179-186.
23. APA. Diagnostic and Statistical Manual of Mental Disorders: Fourth Edition (DSM-IV). Washington, DC: American Psychiatric Association; 1994.
24. SPSS for Windows. Chicago, IL: SPSS, Inc.; 1998.
25. Leon AC, Portera L, Olfson M, et al. False positive results: a challenge for psychiatric screening in primary care. Am J Psychiatry. 1997;154:1462-1464.
26. Lyness JM, King DA, Cox C, et al. The importance of subsyndromal depression in older primary care patients: Prevalence and associated functional disability. J Am Geriatr Soc. 1999;47:647-652.
27. Williams JW Jr, Barrett J, Oxman T, et al. Treatment of dysthymia and minor depression in primary care: a randomized controlled trial in older adults. JAMA. 2000;284:1519-1526.
28. Schade CP, Jones ER Jr, Wittlin BJ. A ten-year review of the validity and clinical utility of depression screening. Psychiatr Serv. 1998;49:55-61.
29. Whooley MA, Stone B, Soghikian K. Randomized trial of case-finding for depression in elderly primary care patients. J Gen Intern Med. 2000;15:293-300.