Department of
Computer Science
 

Evaluating the Contribution of ML and DM to Health Services Research: Analysis of Hospital Episode Statistics

Advances in information and communication technology have allowed a dramatic increase in the use of electronic databases to record information on health care. Very large databases now exist documenting aspects of secondary and primary health care in the UK. One such dataset is the Hospital Episode Statistics (HES) database, which currently holds information on 70 million episodes of care in English hospitals from 1991 to 1997. The scale and complexity of these databases challenge traditional approaches to analysis and have limited the contribution such data can make to research and the understanding of health care. This project seeks to evaluate the application of developments in computer science, including machine learning and database mining, to health services research (HSR).

The objectives of this work are:

The project will follow two major lines of enquiry in order to achieve the study objectives: the application of knowledge discovery in databases (KDD) to the question of the 'quality' of HES data, and a case study combining KDD and traditional analytical approaches to examine a more focused research question: 'Which factors influence variation in the use of elective surgery in older people?'

A significant obstacle to the use of HES data in HSR has been concern over the validity of the data it contains. Whilst routine checks are performed on the completeness of some elements of the dataset, problems with data quality are often only apparent following ad hoc research. For example, attempts by the NHS Executive (NHSE) to calculate hospital-based death rates using HES data revealed errors in the classification of gender and problems with the capture of in-hospital mortality in a number of Trusts. KDD methods such as decision trees, clustering techniques and neural networks will be applied to HES in order to identify potentially anomalous patterns in the data, such as discontinuities in time trends or outlier hospitals. Previously identified errors in HES data will be used both as learning sets and to test the sensitivity and specificity of any predictive algorithms developed.
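As an illustration of the evaluation idea only, the sketch below flags outlier hospitals and scores the flags against known errors. It is not one of the KDD methods named above: a simple robust z-score (median and median absolute deviation) stands in for them. The hospital labels, death rates, threshold and function names are all hypothetical and made up for the example.

```python
from statistics import median


def robust_z_scores(values):
    """Return a robust z-score per value, based on the median and MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [0.0 for _ in values]
    # 1.4826 rescales the MAD to match the standard deviation under normality.
    return [(v - med) / (1.4826 * mad) for v in values]


def flag_outliers(rate_by_hospital, threshold=3.5):
    """Return the set of hospitals whose rate is an outlier (assumed threshold)."""
    names = list(rate_by_hospital)
    scores = robust_z_scores([rate_by_hospital[n] for n in names])
    return {n for n, z in zip(names, scores) if abs(z) > threshold}


def sensitivity_specificity(flagged, known_errors, all_units):
    """Score flagged units against units with previously identified errors."""
    tp = len(flagged & known_errors)
    fn = len(known_errors - flagged)
    fp = len(flagged - known_errors)
    tn = len(all_units - flagged - known_errors)
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec


# Made-up in-hospital death rates (%), for illustration only.
death_rates = {"A": 2.1, "B": 2.3, "C": 1.9, "D": 2.2, "E": 0.1, "F": 2.0, "G": 2.4}
known_errors = {"E"}  # hypothetical hospital with a known mortality-capture problem

flagged = flag_outliers(death_rates)
sens, spec = sensitivity_specificity(flagged, known_errors, set(death_rates))
print(flagged, sens, spec)
```

The same scoring function could be applied to the output of any detector, so methods can be compared on the same set of previously identified errors.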

The second element of the project will examine the utility of KDD in investigating a more focused research question. As life expectancy in the UK has increased, parallel technical advances have made it possible to offer elective surgery, such as total joint replacement (TJR) and cataract surgery, to older people. Whilst there is anecdotal evidence of variability in access to elective surgery for older people, this question has not been widely investigated in the UK, and the HES database has not previously been used as the basis for research in this area. Initial analyses will be restricted to a sample of elective operations, chosen on the basis of a strong evidence base for effectiveness in the elderly and to include common procedures such as TJR and cataract surgery. The primary outcome measure will be age-specific procedure rates by hospital. HES data will be used to establish procedure-specific catchment areas at the level of electoral wards for all hospitals included in analyses; these will be combined with census data to calculate population-based procedure rates. Variation in rates will be described, and potential explanatory variables such as the overall age profile of the population, the presence of specialist rehabilitation services or the size of hospital will be explored.
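The rate calculation described above can be sketched as follows. The text does not say how catchment areas will be derived, so this example assumes a proportional-flow approach: each ward's census population is shared among hospitals according to the share of that ward's episodes each hospital treated. Ward codes, hospital codes, counts and function names are hypothetical.

```python
from collections import defaultdict


def catchment_populations(episodes, ward_population):
    """Allocate each ward's population to hospitals in proportion to the
    share of that ward's episodes treated at each hospital (an assumption)."""
    ward_totals = defaultdict(int)
    for (ward, _hospital), count in episodes.items():
        ward_totals[ward] += count

    catchment = defaultdict(float)
    for (ward, hospital), count in episodes.items():
        catchment[hospital] += ward_population[ward] * count / ward_totals[ward]
    return dict(catchment)


def procedure_rates(procedures, catchment, per=10_000):
    """Procedures per `per` people in each hospital's catchment population."""
    return {h: per * procedures[h] / catchment[h] for h in procedures}


# Made-up episode counts for one procedure and one age band, keyed by (ward, hospital).
episodes = {("W1", "H1"): 30, ("W1", "H2"): 10, ("W2", "H2"): 20}
# Made-up census counts for the same age band in each ward.
ward_population = {"W1": 4000, "W2": 2000}
# Made-up annual procedure counts per hospital in that age band.
procedures = {"H1": 30, "H2": 15}

catchment = catchment_populations(episodes, ward_population)
rates = procedure_rates(procedures, catchment)
print(catchment, rates)
```

Running the calculation separately for each age band and procedure would give the age-specific rates by hospital that the explanatory variables could then be compared against.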

Findings will be brought together in a critique of the contribution of KDD to HSR in large health care databases. Criteria used in this assessment will include: evidence of the detection through KDD of unanticipated but valid patterns in data; a comparison of the human and computer resources required by different analytical approaches; the views of users of HES data; and the extent to which the methodology generalises to other health care databases.

Staff and Students

Christophe Giraud-Carrier
Timothy Langford

Publications

Collaborators

Paul Dieppe, Steven Oliver.

Support

This work is supported by an MRC HSRC grant.
© 1995-2012 University of Bristol