1 Health Informatics Research Group, Department of Information Studies
2 Centre for Health Information Management Research, University of Sheffield, United Kingdom.
Address correspondence to Dr. Peter Bath, Department of Information Studies, The University of Sheffield, Regent Court, 211 Portobello Street, Sheffield S1 4DP, U.K. E-mail: p.a.bath{at}shef.ac.uk
| Abstract |
|---|
Methods. Data were obtained from a nationally representative sample of 1042 community-dwelling people aged 65+. Data on cognitive impairment, physical health, physical activity, psychological well-being, social engagement, and physical capability resulted in 460 independent variables for analysis. Outcome was time from 1985 interview to death or censorship on February 29, 2000. CoRGA was used to select combinations of 1, 2, 4, 8, 12, and 16 variables as potential risk factors for 15-year mortality.
Results. CoRGA selected age in all six models; variables relating to handgrip strength were selected in five models; variables relating to reported chest pain were selected in four models; and pain in joints causing difficulty in carrying bags and self-rated activity compared to peers were both selected in three models. Other variables selected by CoRGA included time since last visited the dentist and optician, use of hypnotic drugs, and number of prescribed drugs being taken.
Conclusions. CoRGA confirmed current risk factors for long-term mortality among older people and identified new risk factors. Age was confirmed as the most important predictor of mortality in older people. Handgrip strength is an important marker of frailty in predicting mortality. Self-rated activity is an important predictor of long-term mortality.
The precise importance of specific risk factors is unclear for several reasons. First, different studies have measured different attributes as potential risk factors for mortality. Second, for reasons of statistical power and computational complexity, only a limited number of potential risk factors can be examined within a particular model or series of models, using conventional statistical techniques, e.g., Cox proportional hazards regression. Therefore, the researcher has to preselect, and to some extent prejudge, the variables that are included in statistical models (20). Potentially important variables and risk factors may be overlooked or ignored. Evolutionary data-mining techniques have been developed that may help overcome these problems.
Evolutionary data-mining tools, such as genetic algorithms (GAs) (21), are computational techniques that are based on the principles of Darwinian evolution, particularly those of reproduction, mutation, and selection (21–23). Evolutionary tools are used to search the high-dimensional space of possible solutions to a given problem to try to find an optimal solution. They are particularly useful in health and medicine due to the large numbers of variables and multivariate relationships, and they offer a means of considering all the variables in a data set (22,23). GAs have been used previously in health and medical research, including analyzing sleep patterns (24), developing prognostic systems for colorectal cancer (25), selecting features for recognizing skin tumors (26), predicting depression after mania (27) and survival after lung cancer (28), and identifying risk factors for falls among older people (20). GAs have been used as variable selection tools for predicting health outcomes in combination with Artificial Neural Networks (20,27,28), and with statistical techniques, e.g., for variable selection in logistic regression (29,30) and Cox regression (31). This article describes the use of CoRGA (Cox Regression with a Genetic Algorithm), which examines all the variables for inclusion in the model, and from these variables selects the combination that maximizes the goodness of fit in Cox regression. The aim of this study was to use CoRGA to identify combinations of risk factors for 15-year mortality within a nationally representative sample of community-dwelling older people.
| METHODS |
|---|
Survey Assessments
The structured questionnaire is described briefly below. Respondents were screened for cognitive impairment using the 12-item Information/Orientation scale from the Clifton Assessment Procedures for the Elderly (33). If the respondent failed to achieve a maximum score of 8, the interview was discontinued.
General physical health was assessed using questions covering the presence or absence of heart, abdominal, eyesight, sleep, or foot problems; giddiness, headaches, urinary incontinence, arthritis, and falls; long-term disabilities and walking aid use; and contact with medical services (34). Additional items covered smoking, mobility, and prescribed medications. Sleep problems and current prescribed psychotropic drug use were assessed. Global ratings of subjective health (35), activity, and memory were assessed.
"Customary" physical activity was assessed by dividing activities into five mutually exclusive functional categories: outdoor productive activities (e.g., gardening); indoor productive activities (e.g., housework); walking; shopping; and leisure activities (e.g., swimming). Each reported activity was scored as minutes per week (36). Noncontinuous activities likely to contribute to muscle strength (e.g., climbing stairs) and joint flexibility (e.g., reaching up high) were also included and scored in terms of frequency of performance on a 5-point scale.
Depression was assessed using the 14-item Symptoms of Anxiety and Depression scale, derived from the Delusions, Symptoms and States Inventory (37). Assessments of morale were provided by a modified version of the 13-item Life Satisfaction Index (38). The Brief Assessment of Social Engagement scale provided a measure of social participation (39). The variables comprising the depression, morale, and social engagement scales were included for consideration in the analyses, as well as the total scores.
Measurements of handgrip strength (using a specially designed strain-gauged isometric dynamometer) (40) and joint flexibility (glenohumeral abduction using a goniometer) were made to assess the individual's actual functional capabilities (32). For measurements of handgrip strength, alternate tests of right and left hands were made at intervals of at least 1 minute. A minimum of three attempts was allowed for each hand. Data for all three measurements for each hand recorded in kilograms were included in the analyses. The respondent's preferred (i.e., dominant) hand was recorded, and data for handgrip strength for dominant and nondominant hands were converted to Newtons for the analyses. Body weight and stature were measured. Measurements of demispan provided estimates of skeletal size (32).
Mortality
Information on mortality within the sample was provided by the U.K. National Health Service Central Register, where all U.K. deaths are recorded. During the 15-year period from the baseline interview in 1985 through February 29, 2000, the study received notification of 741 deaths (71%). The dependent variable used in the Cox regression part of CoRGA was time in days from 1985 interview to death or censorship on February 29, 2000.
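The construction of the dependent variable can be sketched as follows. This is a minimal illustration only: the function name `survival_record` and the respondent dates are hypothetical, and only the censoring date of February 29, 2000 comes from the text.

```python
from datetime import date

CENSOR_DATE = date(2000, 2, 29)  # end of follow-up stated in the text


def survival_record(interview, death=None, censor=CENSOR_DATE):
    """Return (time_in_days, event): event is 1 for death, 0 for censored."""
    if death is not None and death <= censor:
        return (death - interview).days, 1
    # No death notified by the censoring date: follow-up ends at censoring.
    return (censor - interview).days, 0


# Hypothetical respondents; the interview and death dates are made up.
died = survival_record(date(1985, 6, 1), death=date(1990, 6, 1))
alive = survival_record(date(1985, 6, 1))
```

Each respondent thus contributes a pair of values, the follow-up time in days and an indicator of whether death was observed, which is the form of outcome that Cox regression expects.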
Analyses
The development, testing (41), and operation (31) of CoRGA are described in detail elsewhere. Briefly, CoRGA was implemented by custom-writing software in MATLAB (The Mathworks Ltd., Natick, MA). MATLAB programming is based on matrix computation and runs in the MATLAB workspace, which can be downloaded and installed from www.mathworks.com. The genetic algorithm component of CoRGA was developed using functions available in the MATLAB toolbox. The Cox regression function of CoRGA was custom written in the MATLAB workspace. Following extensive testing of CoRGA, data from the NLSAA were transformed as described below prior to execution of CoRGA on the data set.
The actual NLSAA data consist of continuous (e.g., number of prescribed medications, prescribed psychotropic drug use), nominal (e.g., mobility), ordinal (e.g., global ratings of subjective health, activity, memory, frequency of sleep problems), and binary (e.g., presence of sleep problems) variables. As CoRGA only supports continuous and logical data, nominal variables were transformed into binary variables and ordinal variables were treated as continuous variables. Ordinal variables selected by CoRGA were subsequently checked using Statistical Package for the Social Sciences (SPSS, version 11; SPSS, Inc., Chicago, IL) to determine the hazard ratios (HRs), etc., for individual categories. Following transformation of the variables, a total of 460 independent variables derived from the questionnaire and interview were included in the CoRGA analyses; these included continuous and categorical variables.
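The two transformations described above can be illustrated with a short sketch. It is written in Python rather than MATLAB, the helper names `one_hot` and `ordinal_to_numeric` are hypothetical, and the category labels are invented for the example; the actual NLSAA coding is not given in the text.

```python
def one_hot(values):
    """Turn a nominal variable into one binary (0/1) column per category."""
    categories = sorted(set(values))
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}


def ordinal_to_numeric(values, ordered_levels):
    """Map ordered categories to ranks 0, 1, 2, ... treated as continuous."""
    rank = {level: i for i, level in enumerate(ordered_levels)}
    return [float(rank[v]) for v in values]


# Hypothetical category labels, for illustration only.
mobility = ["unaided", "stick", "unaided", "frame"]   # nominal
health = ["good", "fair", "poor", "good"]             # ordinal

mobility_columns = one_hot(mobility)
health_score = ordinal_to_numeric(health, ["poor", "fair", "good"])
```

Expanding each nominal variable into several binary columns is one way in which a questionnaire of moderate length can yield a much larger number of candidate variables.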
At the commencement of each execution by CoRGA, an initial population of 50 chromosomes containing x genes was constructed at random, each gene representing one variable. The initial chromosomes were decoded into actual variables before each set of variables was entered into separate Cox regression models.
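A sketch of how such an initial population might be built and decoded is given below. The population size of 50 and the 460 candidate variables come from the text; the function names, the use of integer indices as genes, and the requirement that the genes within a chromosome be distinct are assumptions made for the illustration.

```python
import random

N_VARIABLES = 460      # candidate variables, from the text
POPULATION_SIZE = 50   # chromosomes per generation, from the text


def random_chromosome(n_genes, rng):
    """One chromosome: n_genes distinct variable indices (assumed distinct)."""
    return sorted(rng.sample(range(N_VARIABLES), n_genes))


def initial_population(n_genes, rng):
    """Build the random starting population for chromosomes of a given size."""
    return [random_chromosome(n_genes, rng) for _ in range(POPULATION_SIZE)]


def decode(chromosome, variable_names):
    """Translate gene indices back into the variables they stand for."""
    return [variable_names[g] for g in chromosome]


rng = random.Random(1985)                          # fixed seed for repeatability
names = [f"var_{i}" for i in range(N_VARIABLES)]   # placeholder variable names
population = initial_population(4, rng)
first_model_variables = decode(population[0], names)
```

Each decoded list of variables would then be passed to a separate Cox regression model.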
The Cox regression function of CoRGA regresses survival data and produces statistical descriptors, including a measure of the goodness of fit of the model, Akaike's Information Criterion (AIC) (42), which is derived from the minus two log likelihood ratio and adjusted to take account of the number of variables in the model (43). The AIC takes positive values, and the lower the AIC value for a given set of independent variables, the better the goodness of fit of the model (42,43).
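The fitness calculation can be sketched as follows, assuming the commonly used form of the criterion, AIC = −2 log L + 2k, where log L is the Cox partial log likelihood and k is the number of variables in the model. The sketch evaluates the partial log likelihood at coefficients that are supplied by the caller; it does not fit the model, and the Breslow treatment of tied event times is an assumption, since the text does not say how CoRGA handles ties.

```python
import math


def cox_partial_loglik(times, events, X, beta):
    """Cox partial log likelihood at given coefficients (Breslow ties)."""
    # Linear predictor for each subject.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i == 1:
            # Risk set: everyone still under observation at time t_i.
            risk = sum(math.exp(eta[j]) for j, t_j in enumerate(times) if t_j >= t_i)
            loglik += eta[i] - math.log(risk)
    return loglik


def aic(loglik, n_params):
    """Akaike's Information Criterion; lower values indicate better fit."""
    return -2.0 * loglik + 2.0 * n_params


# Toy data: three subjects, all deaths, one covariate, coefficient zero.
toy_loglik = cox_partial_loglik([1, 2, 3], [1, 1, 1], [[0.5], [1.0], [1.5]], [0.0])
toy_aic = aic(toy_loglik, n_params=1)
```

Because the penalty term grows with the number of variables, the criterion is most naturally compared between models of the same size, which matches the way CoRGA was run separately for each chromosome length.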
Each initial chromosome was ranked in order of increasing AIC value (decreasing goodness of fit). The best chromosome was retained to be compared with that from succeeding generations and replaced if it had a lower AIC value, i.e., better goodness of fit of the model. Other chromosomes with low AIC values, i.e., fit chromosomes, were also selected for survival and passed to the next generation. These selected chromosomes underwent replication, and genes were exchanged between chromosomes using random crossover to produce new chromosomes, i.e., combinations of variables. New genetic material was introduced using random single point mutation of genes. The new chromosomes and fittest chromosomes from the previous generation were then decoded into variables before being entered into separate Cox regression models for estimation of the fitness function as before, and the fittest chromosome of the new generation replaced the previous fittest chromosome, if its AIC value was better. This process was iterated until convergence was reached, i.e., no further improvement in the fitness of the Cox regression model (i.e., no decrease in the AIC value) was achieved over a large number of generations. CoRGA was executed six times with different sized chromosomes to identify the combinations of 1, 2, 4, 8, 12, and 16 variables as risk factors for 15-year mortality.
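The loop described above can be sketched as follows. This is not the CoRGA implementation: the function `evolve`, its parameter values, the way crossover draws genes from the two parents, and the stopping rule of 30 generations without improvement are all assumptions, and a toy function `toy_aic` stands in for fitting a Cox regression model and returning its AIC.

```python
import random


def evolve(fitness, n_variables, n_genes, pop_size=50, n_keep=10,
           mutation_rate=0.1, patience=30, seed=0):
    """Minimise `fitness` over sets of n_genes variable indices."""
    rng = random.Random(seed)

    def new_chromosome():
        return tuple(sorted(rng.sample(range(n_variables), n_genes)))

    def crossover(a, b):
        # Child draws its genes at random from the genes of both parents.
        pool = list(set(a) | set(b))
        return tuple(sorted(rng.sample(pool, n_genes)))

    def mutate(c):
        # Single-point mutation: swap one gene for an unused variable.
        if rng.random() < mutation_rate:
            genes = list(c)
            unused = [v for v in range(n_variables) if v not in c]
            genes[rng.randrange(n_genes)] = rng.choice(unused)
            return tuple(sorted(genes))
        return c

    population = [new_chromosome() for _ in range(pop_size)]
    best, best_score, stale = None, float("inf"), 0
    while stale < patience:                      # stop when fitness has converged
        ranked = sorted(population, key=fitness)  # increasing AIC
        top_score = fitness(ranked[0])
        if top_score < best_score:
            best, best_score, stale = ranked[0], top_score, 0
        else:
            stale += 1
        survivors = ranked[:n_keep]              # fit chromosomes pass on
        children = [mutate(crossover(*rng.sample(survivors, 2)))
                    for _ in range(pop_size - n_keep)]
        population = survivors + children
    return best, best_score


INFORMATIVE = {3, 17, 42}  # hypothetical "true" risk factors for the toy problem


def toy_aic(chromosome):
    """Stand-in for a Cox model's AIC: lower when informative genes are present."""
    return 100.0 - 10.0 * len(INFORMATIVE & set(chromosome))


best, score = evolve(toy_aic, n_variables=60, n_genes=4)
```

As with any GA, the result depends on the random seed, and the sketch makes no guarantee that the returned chromosome is the global optimum.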
The final models were checked by including the combinations of selected variables in adjusted and unadjusted Cox regression models and calculating HRs and 95% confidence intervals (CIs) using SPSS.
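For reference, an HR and its 95% CI follow from a Cox regression coefficient and its standard error as exp(β) and exp(β ± 1.96 × SE). The sketch below uses a hypothetical helper, `hazard_ratio_ci`, and a coefficient and standard error chosen purely for illustration so that the result resembles the HR for age reported in the one-variable model; these two input values are not taken from the study output.

```python
import math


def hazard_ratio_ci(beta, se, z=1.96):
    """Return (HR, lower, upper) for a Cox coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Illustrative inputs only, back-calculated to resemble a per-year age effect.
hr, lower, upper = hazard_ratio_ci(beta=0.0788, se=0.00566)
```

An HR above 1 with a CI that excludes 1 indicates increased mortality risk per unit increase in the variable; an HR below 1 indicates decreased risk.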
| RESULTS |
|---|
Table 3 shows the HRs and 95% CIs (column 4) for each variable (and for each category relative to the reference category for ordinal variables) and their level of significance (column 5) in the adjusted models. The HRs and 95% CIs (column 6) for each variable and their level of significance (column 7) in the unadjusted models are also shown in Table 3.
Factors Associated With Increased Mortality in Adjusted Models
Age was independently predictive of increased 15-year mortality in all of the adjusted models, with the HRs varying from 1.082 (95% CI, 1.070–1.094; p =.000) (age in years; 1 variable in model) to 1.117 (95% CI, 1.097–1.136; p =.000) (exact age; 16 variables in model). In the adjusted models, self-rated activity compared to others in the same age group (p =.000), pain in the anterior left chest (p =.013), the ability to raise £200 in an emergency (p =.042), the number of floors in accommodation (p =.016), and the total number of prescribed drugs (including hypnotics) (p =.000) were significant risk factors for increased mortality in at least one model, independent of the other variables in the model.
Factors Associated With Decreased Mortality in Adjusted Models
The following selected variables were significant independent risk factors for decreased mortality in at least one model: time since last visited the dentist (p =.047), reported stomach problems (p =.003), and permission to access Office for Census and Population Studies files (p =.002).
Factors Associated With Increased Mortality in Unadjusted Models
The following variables selected one or more times by CoRGA were significant risk factors for increased 15-year mortality on their own: age (p =.000), self-rated activity compared to others in same age group (p =.000), ability to raise £200 during an emergency (p =.005), time since last visited by social worker (p =.007), number of living great-grandchildren (p =.001), joint pain and stiffness causing difficulty in walking (p =.015), and total number of prescribed drugs including hypnotics (p =.000).
Factors Associated With Decreased Mortality in Unadjusted Models
The following variables selected one or more times by CoRGA were significant risk factors for decreased 15-year mortality on their own: maximum handgrip strength (p =.003), maximum handgrip strength for dominant hand (p =.001), handgrip strength of left hand at third measurement (p =.000), time since last visited the dentist (p =.003), maximum handgrip strength for nondominant hand (p =.005), handgrip strength of right hand at third measurement (p =.000), handgrip strength of right hand at first measurement (p =.000), and permission to access OCPS files (p =.000).
| DISCUSSION |
|---|
Validating the risk profiles developed by CoRGA using adjusted and unadjusted models in SPSS enabled the importance of selected variables in predicting mortality to be checked, individually and in combination, with other variables. GAs are probabilistic, rather than deterministic, and do not guarantee that the combinations of variables selected in the final models are necessarily the best risk factors for predicting mortality. However, the aim of this study was not to identify the best individual risk factors, but to examine all variables as possible risk factors and then select variables contributing to mortality risk.
The combinations of risk factors selected by CoRGA confirm current knowledge and provide additional information on mortality risk in older people. Age (13), handgrip strength (5), and prescribed medications (1) have been associated with mortality risk in older people. The inclusion of age as a risk factor in all models provides further evidence that this may be the single most important factor affecting mortality in older people (13).
The selection of handgrip strength in five models confirms its importance as a predictor of mortality (5). The association of increased handgrip strength with reduced mortality was probably indicative of frailty. Frailty appears important in its own right as a predictor of mortality, but also in combination with other risk factors. Whether the different variables for handgrip strength are differentially predictive of mortality is unclear: They may be measuring the same component of frailty, but the selection of handgrip strength at both first and third measurement in the models containing 12 and 16 variables suggests that initial and repeatable strength are important components of frailty risk.
Variables for reported pain were selected in most models, and although not significant in unadjusted models, pain in the anterior chest, presumably acting as a marker for cardiovascular problems, was a significant predictor in two adjusted models, suggesting that combined with other risk factors, it increases mortality risk. The association of stomach problems with reduced mortality was surprising, although this may suggest that people with stomach problems may receive additional care that is somehow protective.
CoRGA's value in identifying new potential risk factors for mortality was emphasized in selecting self-rated activity, i.e., relative to peers, in four models. Although self-rated health is an important predictor of mortality (6) and other outcomes (35), the association of self-rated activity with 15-year mortality has not previously been reported. Other selected variables not identified previously in the research literature may have been acting as markers for frailty, mobility, or poor health. For example, times since last visited the optician or the dentist may be acting as surrogates for general mobility or for being housebound as they require a person to leave home and travel some distance. In contrast, the increased mortality associated with being visited by a social worker may be a marker for poor psychological or physical health. The ability to raise £200 (approximately $350) in an emergency was probably a marker for poverty. Its significance in the presence of other risk factors suggests that older people with problems that increase mortality risk are particularly vulnerable if they have limited financial means and are not able to afford support unavailable through the free U.K. National Health Service. Granting permission to access the OCPS file was protective in reducing mortality risk: This may be a proxy for interviews completed by the respondent's caregiver, who may be less willing to allow access to files than respondents themselves. The association of number of floors in the accommodation with increased mortality in one adjusted model may indicate increased risk for frail older people who have to use stairs: They may be susceptible to falls, particularly if handgrip strength is poor, and falls involving stairs may have particularly serious outcomes. The association of the number of living great-grandchildren in the unadjusted model may be because this variable is acting as a proxy for very old age in this cohort.
Further research will help to understand the precise importance of these factors as predictors of mortality. CoRGA has enhanced current understanding of mortality risk by confirming the importance of known risk factors and by identifying new attributes that appear to be important in influencing the risk of mortality in older people.
| Acknowledgments |
|---|
| Footnotes |
|---|
Received May 13, 2004
Accepted July 13, 2004
| References |
|---|