CHARTing Health Information for Texas: Common Data Problems

As you view these links to various sources of health data, keep in mind that data does not tell the whole story. There may be a story behind the data. As you will see, data and data gathering are not perfect: if you see an anomaly or a large deviation in the data, find out why. Don't assume. However, once you have looked over the caveats detailed on this page, you will find that the data are still comparable as long as you understand the anomalies. Whenever possible, take a look at the technical notes for more insight into the data.

Here are some examples of how data can mislead.

1. Do you know what data is really gathered?

Be certain you understand the rules of data gathering before you try to interpret the data! Below are 3 examples of how data gathering can mislead.

A. Births: Changes to the Texas Birth Certificate
(From Texas Health Data Births to Texas Residents)

In this example, there is a fairly substantial jump in the number of babies born to unmarried mothers in 1994 (over 5,300) and a corresponding decrease (over 6,700) in the number born to married mothers. What happened? It turns out that the shift was not in behavior per se, but in what data was gathered by the state of Texas. Prior to 1994, birth certificates did not indicate marital status; when the statistics were compiled, the mother was assumed to be married if the father's name was listed. Once the certificate asked about marital status specifically, a truer picture emerged. (Thanks go to Dr. Bill Spears for ferreting that out and sharing it with me.)

B. Deaths: Changes to the Texas Death Certificate
(From Texas VitalWeb)

In this example, we see a fairly substantial jump in the number of deaths attributed to diabetes from 1988 to 1989, then another jump from 1989 to 1990. Were there really that many more deaths from diabetes? Most likely not. If anything, deaths from diabetes were probably underreported prior to 1989. The Texas Death Certificate was changed in 1989 to include an example on the back, with diabetes used in the example. Immediately, the death rate for diabetes increased.

Unfortunately for Texas, deaths due to diabetes continue to rise. Nearly one third of the deaths in Texas in 2002 were attributed to Diabetes Mellitus (Texas Health Data Deaths of Texas Residents), although some of the increase could have been a result of the change from ICD-9 to ICD-10. The coding change by itself raised the number of deaths attributed to diabetes only slightly (less than 1%, or a comparability ratio of 1.0082). (See below for more discussion on ICD-9 vs. ICD-10.) (Thanks go to Daniel Goldman for ferreting out the death certificate change and sharing it with me.)

C. Infectious Diseases: Changes to the list of notifiable diseases

In 1994 there were 49 infectious diseases notifiable at the national level. In 1998, there were 52. Not knowing when an infectious disease became notifiable can lead to a misinterpretation of the data. Looking at chlamydia, for example, we see the following data for Harris County.
(From the Landscape Project)

There is what appears to be a chlamydia epidemic in Harris County. Between 1990 and 1999, the number of chlamydia cases nearly tripled. All of Texas is like that: an epidemic appears to have swept the state. But wait: when did chlamydia become a notifiable disease? Based on the data, we might guess that it was 1995, as that is when we see a large increase in both count and rate. Indeed, prior to 1995, chlamydia was only voluntarily reported; it became a notifiable disease in 1995. Learn more about data reporting for chlamydia.

The CDC lists other concerns when interpreting data:

"Incidence data in the Summary are presented by the date of report to CDC as determined by the MMWR week and year assigned by the state or territorial health department. ... Thus, surveillance data reported by other CDC programs may vary from data reported in the Summary because of differences in 1) the date used to aggregate data (e.g., date of report, date of disease occurrence), 2) the timing of reports, 3) the source of the data, 4) surveillance case definitions, and 5) policies regarding case jurisdiction (i.e., which state should report the case to CDC). The data reported in the Summary are useful for analyzing disease trends and determining relative disease burdens. However, these data must be interpreted in light of reporting practices. Some diseases that cause severe clinical illness (e.g., plague and rabies) are most likely reported accurately if they were diagnosed by a clinician. However, persons who have diseases that are clinically mild and infrequently associated with serious consequences (e.g., salmonellosis) might not seek medical care from a health-care provider. Even if these less severe diseases are diagnosed, they are less likely to be reported.
The degree of completeness of data reporting also is influenced by the diagnostic facilities available; the control measures in effect; public awareness of a specific disease; and interests, resources, and priorities of state and local officials responsible for disease control and public health surveillance. Finally, factors such as changes in the case definitions for public health surveillance, introduction of new diagnostic tests, or discovery of new disease entities can cause changes in disease reporting that are independent of the true incidence of disease."

Take a look at a list of notifiable diseases as well as statistics from the MMWR Summary of Notifiable Diseases for 1993 through 2004. The most current list is available from the Division of Public Health Surveillance and Informatics.

2. How have standards changed in the reporting or collection of data?

A. Has an age adjustment been made on the data? If so, which standard was used?
B. Which International Classification of Diseases (ICD) revision was used to report the data?
C. Is the mortality data measuring underlying cause of death or multiple causes of death?

3. What is the unit of measure of the data?

Be sure you understand the unit of measure so that you can compare apples to apples. You cannot compare a non-adjusted rate with an age-adjusted rate. And quite honestly, you probably do not want to compare crude rates if there are several years separating them (e.g., a decade), especially in areas that are rapidly changing. It is possible to calculate an age-adjusted rate fairly easily. A general epidemiology book will explain how.

4. What has changed in medicine to affect the data?

One example is infant mortality. Rates increased in the United States in 2002, from 6.8 deaths per 1,000 births in 2001 to 7.0 deaths per 1,000 births in 2002. What happened? The causes are not fully known yet, but the CDC has some thoughts on the reasons why. Take a look at the "Supplemental Analyses of Recent Trends in Infant Mortality." Medical technology could have influenced infant mortality: what was once counted as a miscarriage is now counted as a preterm delivery.

5. How has the population changed?

6. Is the data reporting self-reported behaviors?

7. Is the frequency of events or the population (or both) a very small number?

For example, in McMullen County, 3 white males are reported to have died from cancers of the brain and nervous system between 1970 and 1994. This resulted in a mortality rate of 19.50, but a confidence interval of 0.00 to 44.17. (From Cancer Mortality Maps and Graphs, National Cancer Institute) The total population of McMullen County was only 817 in 1990, according to the US Census Bureau. Compare this to the 11,946 lung cancer deaths among white men in Harris County between 1970 and 1994. This corresponds to a mortality rate of 78.78 and a confidence interval of 77.32 to 80.25. Because the number of deaths (the numerator) is high, the confidence interval is much narrower.

8. Other issues raised by Texas Department of State Health Services
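Two of the questions above, age adjustment (item 3) and small numbers (item 7), lend themselves to a short worked sketch. The Python below is only an illustration under stated assumptions: the function names, the two counties, and the standard-population weights are hypothetical, and the interval uses a rough normal approximation (standard error taken as the rate divided by the square root of the death count). It will not reproduce the National Cancer Institute's published intervals exactly, since those come from the agency's own method, but it shows the same pattern.

```python
from math import sqrt


def crude_rate(deaths, population, per=100_000):
    """Total deaths divided by total population, scaled to `per` people."""
    return sum(deaths) / sum(population) * per


def age_adjusted_rate(deaths, population, standard_weights, per=100_000):
    """Direct standardization: weight each age-specific rate by the share
    of that age group in a chosen standard population."""
    if not (len(deaths) == len(population) == len(standard_weights)):
        raise ValueError("all inputs need one entry per age group")
    total_weight = sum(standard_weights)
    adjusted = sum(
        (d / p) * (w / total_weight)
        for d, p, w in zip(deaths, population, standard_weights)
    )
    return adjusted * per


def approximate_interval(rate, death_count, z=1.96):
    """Rough 95% interval, assuming SE ~ rate / sqrt(deaths).
    The lower bound is floored at zero because a rate cannot be negative."""
    if death_count <= 0:
        raise ValueError("need at least one death to estimate an interval")
    half_width = z * rate / sqrt(death_count)
    return max(0.0, rate - half_width), rate + half_width


# Hypothetical counties with identical age-specific death rates but
# different age structures (three age groups, youngest first).
young_county = {"deaths": [50, 150, 100], "population": [50_000, 30_000, 5_000]}
older_county = {"deaths": [20, 150, 600], "population": [20_000, 30_000, 30_000]}
standard_weights = [40, 40, 20]  # hypothetical standard population shares

crude_young = crude_rate(**young_county)
crude_older = crude_rate(**older_county)
adjusted_young = age_adjusted_rate(
    young_county["deaths"], young_county["population"], standard_weights
)
adjusted_older = age_adjusted_rate(
    older_county["deaths"], older_county["population"], standard_weights
)

# Rates and death counts quoted in item 7 above.
mcmullen_interval = approximate_interval(19.50, 3)
harris_interval = approximate_interval(78.78, 11_946)

print(f"crude rates:    {crude_young:.1f} vs {crude_older:.1f}")
print(f"adjusted rates: {adjusted_young:.1f} vs {adjusted_older:.1f}")
print(f"McMullen interval: {mcmullen_interval[0]:.2f} to {mcmullen_interval[1]:.2f}")
print(f"Harris interval:   {harris_interval[0]:.2f} to {harris_interval[1]:.2f}")
```

In this made-up example the crude rates differ sharply only because one county is older, while the age-adjusted rates agree. The interval for 3 deaths stretches down to zero, while the interval for 11,946 deaths is only a few points wide. Which standard population the weights come from matters, so two adjusted rates are comparable only if they use the same standard.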
Copyright © 2004 The University of Texas Health Science Center at Houston
Contact: sphwebmaster@uth.tmc.edu