Guidance for Health Outcome Data Review and Analysis Relating to NYSDEC Environmental Justice and Permitting
Appendix A - Updated Examples (Draft 7/21/08)
Example 1
For the first example, a COC in an urban area was chosen. The COC is ZIP Code X, which meets the urban environmental justice criteria and is located in City Y and County Z. This COC is not in New York City.
Selection of comparison areas for the COC. Four comparison areas were selected to evaluate the disease rates in the COC in the context of a number of different settings.
- A comparison area made up of ZIP Codes in County Z that are similar to ZIP Code X in population density. This area will be similar to ZIP Code X in land use and urban/rural characteristics.
- A comparison area that is the remainder of the city that contains ZIP Code X (i.e., City Y minus ZIP Code X). This comparison area is in the local area and was chosen because the community will be familiar with the area and the type of land use in this area.
- County Z, which is the county that contains ZIP Code X. This is a larger area and represents average health status, but it has a smaller area than New York State excluding New York City and will be familiar to the community because it is in close proximity to the COC.
- New York State excluding New York City is a large area chosen to represent the average health status.
To determine the ZIP Codes that approximate the remainder of City Y, a map of ZIP Codes was overlaid onto a map of the city using commercial mapping software. Some ZIP Codes were clearly within the city. Other ZIP Codes lay partially within the city and partially outside the city. For each of these ZIP Codes, the proportion of the population that lay within the city boundaries was estimated by summing the population of the 2000 census blocks in the ZIP Code whose geographic centers (centroids) fell within the city boundaries and dividing by the total population of the ZIP Code. ZIP Codes with greater than 50% of the population within the city boundaries were included in the remainder of City Y comparison area; six ZIP Codes were included. ZIP Code X was not included in this comparison area because the population of ZIP Code X was about 10% of the population of City Y, which may be great enough to affect the results of the analysis.
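The inclusion rule for ZIP Codes that straddle the city boundary can be sketched as follows. This is a minimal illustration, not the mapping software's procedure: the `CensusBlock` type, the function names, and the block populations are hypothetical, and the sketch assumes each block is counted as inside the city when its centroid falls within the city boundary.

```python
from dataclasses import dataclass


@dataclass
class CensusBlock:
    population: int
    centroid_in_city: bool  # assumption: True when the block centroid lies inside the city


def share_in_city(blocks):
    """Fraction of a ZIP Code's population in blocks whose centroid is inside the city."""
    total = sum(b.population for b in blocks)
    if total == 0:
        return 0.0
    inside = sum(b.population for b in blocks if b.centroid_in_city)
    return inside / total


def in_remainder_of_city(blocks, threshold=0.5):
    """Include the ZIP Code only if more than `threshold` of its population is in the city."""
    return share_in_city(blocks) > threshold


# Hypothetical ZIP Code that lies partly inside and partly outside the city
straddling_zip = [
    CensusBlock(1200, True),
    CensusBlock(800, True),
    CensusBlock(1500, False),
]
print(round(share_in_city(straddling_zip), 3))
print(in_remainder_of_city(straddling_zip))
```

A ZIP Code with exactly half of its population inside the city would be left out under this rule, because the text requires "greater than 50%".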
For the population density comparison area, the ZIP Codes in County Z with population density similar to ZIP Code X were selected using the methods described in Section III.A.4 of the HOD Work Group Report. Six population density groupings were created. ZIP Code X was categorized by population density into Group II (2,816 - 7,467 people per square mile), which has the second highest population density of the six groups in the county. The comparison area is made up of the six ZIP Codes in Group II with population density similar to that of ZIP Code X, but does not include ZIP Code X. The population density comparison area is similar to the remainder of City Y comparison area but not identical; the two comparison areas had five ZIP Codes in common.
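Once the density groupings exist, picking the comparison area amounts to a filter on the Group II range that leaves out the COC. The sketch below assumes the Group II bounds quoted above; the ZIP Code labels and densities are hypothetical, and the way the six groupings themselves are derived (Section III.A.4 of the HOD Work Group Report) is not reproduced here.

```python
def density_comparison_area(zip_density, coc_zips, low, high):
    """Return ZIP Codes whose density lies in [low, high], leaving out the COC itself."""
    return sorted(
        z for z, density in zip_density.items()
        if low <= density <= high and z not in coc_zips
    )


# Hypothetical ZIP Code labels and densities (people per square mile)
county_densities = {"X": 5100, "P": 3000, "Q": 7400, "R": 1200, "S": 9800}

# Group II bounds as given in the text: 2,816 - 7,467 people per square mile
group_ii = density_comparison_area(county_densities, {"X"}, 2816, 7467)
print(group_ii)  # ['P', 'Q']
```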
Demographic data for ZIP Code X and the four comparison areas are shown in Table I. The remainder of City Y and the population density comparison areas are more similar to the COC than the larger comparison areas (County Z and New York State excluding New York City) in characteristics such as median household income and percentage minority.
Tabulation of asthma hospitalization data. Data for many health outcomes are made available by age group. The specific age breakdowns vary by the health outcome and by the agency reporting the data, but the categories generally include an age group for children, one for adults, and one for the elderly. These three groups are sometimes broken down into smaller age groups. Young children are an important focus when asthma hospitalizations are reviewed since asthma hospitalization rates are known to be higher in young children. The age groups available for asthma hospitalizations when the 2004 report was being prepared were 0-4, 5-14, 15-24, 25-44, 45-64, and 65+ years (data for 1998-2000). At the present time, asthma hospitalization data by ZIP Code are displayed on the NYSDOH web site for three age groups that cover the population age distribution (0-17 years, 18-64 years, and 65+ years) plus two age groups for children (0-4 years and 0-14 years).
The NYSDOH data used for this example included the number of asthma hospitalizations during 2003-2005, the 2004 population, and the average annual rate of asthma hospitalizations per 100,000 population. Asthma hospitalization data for ZIP Code X and the four comparison areas are shown in Table II. As an example, the average annual rate of asthma hospitalizations per 100,000 population is calculated as follows for the 0-17 year old group in ZIP Code X (second row): the number of hospitalizations during 2003-2005 is divided by 3 (the number of years) and by the 2004 population, and the result is multiplied by 100,000.
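The rate calculation can be checked against figures that do appear in this appendix. The sketch below uses the 0-17 year row for County Z from Table II, part 4 (368 hospitalizations, 2004 population 63,770), because the ZIP Code X counts are not reproduced in this section. The function name is an assumption made for illustration.

```python
def average_annual_rate(hospitalizations, population, years=3, per=100_000):
    """Average annual hospitalizations per `per` population over a multi-year period."""
    return hospitalizations / years / population * per


# County Z, ages 0-17, 2003-2005 (Table II, part 4)
rate = average_annual_rate(368, 63_770)
print(round(rate, 1))  # 192.4
```

The result agrees with the rate of 192.4 shown for that row in Table II.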
The data in Table II are shown for four age groups and the total population (0-65+ years) for ZIP Code X and the four comparison areas (parts 1-5). Table II parts 2-5 have an additional column marked "rate ratio (95% CI)." This column compares the data for ZIP Code X to that for the specified comparison area. For each age group, the rate ratio is the hospitalization rate for ZIP Code X divided by the hospitalization rate for the comparison area. As an example, for the 0-17 year old age group, the rate ratio comparing ZIP Code X to the remainder of City Y (part 3) is the asthma hospitalization rate for 0-17 year olds in ZIP Code X divided by the asthma hospitalization rate for 0-17 year olds in the remainder of City Y.
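As a check on the rate ratio definition, the sketch below uses the 0-4 year age group, for which this appendix gives both the ZIP Code X rate (759.5 per 100,000) and the comparison rates in Table II, parts 4 and 5. The function name is illustrative only.

```python
def rate_ratio(coc_rate, comparison_rate):
    """Rate in the COC (numerator) divided by rate in the comparison area (denominator)."""
    return coc_rate / comparison_rate


zip_x_rate_0_4 = 759.5      # ZIP Code X, ages 0-4
county_z_rate_0_4 = 445.9   # Table II, part 4
nys_rate_0_4 = 389.3        # Table II, part 5

print(f"{rate_ratio(zip_x_rate_0_4, county_z_rate_0_4):.2f}")  # 1.70
print(f"{rate_ratio(zip_x_rate_0_4, nys_rate_0_4):.2f}")       # 1.95
```

Both values match the rate ratios listed for the 0-4 year row in Table II.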
The last row of the table shows information for the total population (0-65+ years combined). In parts 2-5, the different age distributions in the COC and the comparison areas have been taken into account in calculating a ratio called the age-adjusted standardized incidence ratio (SIR). In this last row, the SIRs are all greater than 1, which indicates that the asthma hospitalization rate for the total population in ZIP Code X is higher than the respective rates in each of the four comparison areas. As described previously, the confidence interval (CI) provides a range around the ratio in which the true measurement lies with a certain degree of confidence. The 95% CI was calculated by the method in Appendix B. In this case the 95% CI indicates that there is a 95% chance that the true value of the ratio is included in the range. In the last row of Table II, the CIs around the SIRs all exclude the numeral 1 and appear in bold type; for these comparisons, there is 95% confidence that the asthma hospitalization rate in ZIP Code X is greater than the rate in the comparison area and that the difference is not due to chance.
In Table II, we can also look at the results for the individual age groups. For the age groups 0-17, 18-64, and 65+ years, the rate ratios for each comparison are all greater than 1, indicating that the rate of asthma hospitalization is greater in ZIP Code X than in the respective comparison area. The CIs all exclude the numeral 1 and appear in bold type, indicating the difference is not likely due to random variability.
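An age-adjusted SIR of this kind is commonly obtained by indirect standardization: the comparison area's age-specific rates are applied to the COC's population in each age group to get an expected count, and the observed count is divided by that expected count. The sketch below is an assumption about the general approach, not a reproduction of Appendix B. The reference rates are the County Z rates from Table II, part 4; the COC populations and the observed count are hypothetical.

```python
def expected_hospitalizations(coc_population, reference_rate_per_100k, years=3):
    """Expected count if the COC had the reference area's age-specific annual rates."""
    return sum(
        coc_population[age] * reference_rate_per_100k[age] / 100_000 * years
        for age in coc_population
    )


# Average annual rates per 100,000 for County Z (Table II, part 4)
county_z_rates = {"0-17": 192.4, "18-64": 96.1, "65+": 193.4}

# Hypothetical COC population by age group and hypothetical observed count
coc_population = {"0-17": 2_500, "18-64": 6_000, "65+": 1_500}
observed = 100

expected = expected_hospitalizations(coc_population, county_z_rates)
sir = observed / expected
print(round(expected, 3), round(sir, 2))
```

An SIR above 1 means more hospitalizations were observed than would be expected from the comparison area's rates, given the COC's own age distribution.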
As discussed previously, children 0-4 years old are known to have the highest rates of asthma hospitalization. The rate for children 0-4 years old in ZIP Code X (759.5 hospitalizations/100,000 population) was higher than the rate for the same age group in each of the four comparison areas, but the differences did not reach the 95% confidence level for the comparisons with the ZIP Codes in County Z with similar population density (part 2) and with the remainder of City Y (part 3). The asthma hospitalization rates in young children in these two comparison areas were also relatively high and were similar to the rate in the COC.
Of interest in Table II is that the asthma hospitalization rates are higher for all age groups in the remainder of City Y than in County Z, indicating that rates in the county outside of City Y are lower than in City Y. This is a good example of how differences in disease rates among portions of a larger area such as a county might not be seen if one looks only at the rate for the larger area.
The conclusion that can be drawn from the data in Table II is that asthma hospitalization rates in the COC for 2003-2005 were higher than those in the city surrounding the COC, in the county in which the COC is located, and in New York State excluding New York City. Because there are similar results with multiple comparisons, there is added confidence that the asthma hospitalization rate is elevated in the COC.
Although review of asthma hospitalization is important in asthma surveillance, people who are hospitalized with asthma are only a subset of people with asthma. Asthma hospitalization rates differ from region to region and are known to be higher among poor and minority populations. National data from the U.S. CDC for 2004 showed that asthma hospitalization rates were about three times higher among blacks than whites (U.S. CDC, 2007). Poverty appears to be an important contributing factor to asthma disability (Healthy People 2010). ZIP Code X, the remainder of City Y, and the population density comparison areas have higher percentages of minorities and persons living in poverty than County Z and New York State excluding New York City (Table I). Research is currently being conducted on factors that may contribute to higher asthma hospitalization rates among poor and minority populations, including differences in access to medical care, source of care, medical management including type of medications, and provider-patient interactions.
Tabulation of cancer incidence data. Tables showing cancer incidence data by ZIP Code for female breast, lung, colorectal and prostate cancers for 1999-2003 are available at the NYSDOH web site. Cancer data that are available from NYSDOH are tabulated in a different way from the asthma hospitalization data discussed in the previous section. The asthma data are presented by age group, including number of hospitalizations, population, and hospitalization rate. For the cancer data, the number of cases of the specific type of cancer in the total population of the ZIP Code (all ages) is presented as the observed number, but the rate and number of people by age group are not provided. Instead, for each ZIP Code an expected number of cases for all ages is shown. The cancer rate for the entire state of New York and the number of people in a ZIP Code are used to estimate the number of people in each ZIP Code who would be expected to develop cancer within the five-year period 1999-2003 if the ZIP Code had the same rate of cancer as the state (for more information, see "Frequently Asked Questions"). Age and population size are taken into consideration when determining the expected number; this process is called age-adjustment (see Appendix F of the HOD Work Group Report).
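One way to write the expected-number calculation described above is shown below. The symbols are introduced here only for illustration and do not appear in the NYSDOH tables: $N_a$ is the number of people in age group $a$ in the ZIP Code, $r_a$ is the statewide annual cancer rate for that age group, and $T = 5$ is the number of years (1999-2003). The exact age groups and population estimates used by NYSDOH are not specified in this section.

```latex
E = \sum_{a} N_a \, r_a \, T
```

Because the statewide rates are applied separately within each age group, a ZIP Code with an older population receives a larger expected number than a younger ZIP Code of the same size.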
For some ZIP Codes in the cancer tables, there were too few cases to be shown for confidentiality reasons. These ZIP Codes are combined with neighboring ZIP Codes, and data are provided for the combined groups of ZIP Codes.
In Table III data on female breast and male and female colorectal cancers are shown for ZIP Code X and the two local comparison areas. This table is different from the table of asthma hospitalizations (Table II) in that the cancer data for ZIP Code X are not compared directly to the cancer data for the comparison areas; instead, since the expected number of cases is based on the cancer rate for New York State, the state is the comparison area for ZIP Code X and for the two local comparison areas. In this table, the ratio represents the ratio of the observed number of cases to the expected number of cases. Because the data have been age adjusted, the ratios are standardized incidence ratios (SIRs). (The SIRs in the table should not be compared with each other because of the different age distributions of the populations of each area.) If the observed number were equal to the expected number, the SIR would be equal to 1. As discussed previously, the CI is a range around the ratio in which the true measurement lies with a certain degree of confidence, in this case, 95%. When the CI excludes 1, there is a 95% probability that the difference is not due to random variation. The 95% confidence interval is generally chosen by convention; however, additional confidence intervals could be displayed. The confidence intervals were calculated by the method in Appendix B.
The SIRs in Table III for ZIP Code X compared with New York State show that the breast cancer rate in females in ZIP Code X over this time period was lower than that for New York State. The 95% CI around the SIR excluded 1, indicating that the difference between the observed and expected numbers is not likely due to random variation. The number of cases of colorectal cancer in males was slightly higher and the number of cases of colorectal cancer in females was slightly lower than the numbers expected. However, for colorectal cancer in males and females, the 95% CIs around the SIRs include 1, which indicates that the difference in the observed and expected numbers may be due to random variation.
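The observed-to-expected comparison and the "does the CI exclude 1" check can be sketched as follows. Appendix B is not reproduced in this section, so the interval here uses Byar's approximation for a Poisson count, which is a swapped-in assumption and may differ slightly from the NYSDOH method. The observed and expected counts are hypothetical.

```python
import math


def sir_with_ci(observed, expected, z=1.96):
    """SIR with an approximate 95% CI (Byar's approximation; assumed, not Appendix B)."""
    sir = observed / expected
    if observed == 0:
        lower = 0.0
    else:
        lower = observed * (
            1 - 1 / (9 * observed) - z / (3 * math.sqrt(observed))
        ) ** 3 / expected
    o1 = observed + 1
    upper = o1 * (1 - 1 / (9 * o1) + z / (3 * math.sqrt(o1))) ** 3 / expected
    return sir, lower, upper


def ci_excludes_one(lower, upper):
    """True when the whole interval lies below 1 or above 1."""
    return upper < 1 or lower > 1


# Hypothetical counts: 40 cases observed where 60 would be expected
sir, lower, upper = sir_with_ci(40, 60.0)
print(f"SIR {sir:.2f} ({lower:.2f}, {upper:.2f})", ci_excludes_one(lower, upper))
```

With equal observed and expected counts the SIR is 1 and the interval necessarily includes 1, which corresponds to the "could be due to random variation" reading in the text.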
When the observed numbers of female breast and male and female colorectal cancers in the two comparison areas were compared to the numbers expected based on the rates in New York State, the SIRs were all close to 1. The 95% CIs all include 1, indicating that the differences could be due to random variation.
The conclusion that can be drawn from the data in Table III is that the breast cancer rate in females for 1999-2003 in the COC was lower than the rate in New York State; the analysis showed that the difference in rates was not likely due to random variation. Since the analyses showed that the differences in the colorectal cancer rates in males and females may have been due to random variability, we conclude that the colorectal cancer rates in males and females for 1999-2003 in the COC were not significantly different from the rates in New York State. (An alternative statement would be that the colorectal cancer rates in males and females for 1999-2003 in the COC were similar to those in New York State.) In addition, female breast and male and female colorectal cancer rates in the two local comparison areas, which are both mostly located in the city surrounding the COC, were similar to those in New York State.
For the asthma hospitalization example, there was another large comparison area, which was the county containing the COC. We have not included this comparison area in the cancer example because cancer data for counties are not expressed in data tables at the NYSDOH web site as observed and expected numbers of cases in the same way as ZIP Code cancer data are expressed. However, cancer incidence rates by county and region are provided at the NYSDOH web site.
Example 2
For the second example, a COC in a rural area in western New York State was chosen. The COC comprises two ZIP Codes (A and B) and meets the rural environmental justice criteria for poverty.
Selection of comparison areas for the COC. Three comparison areas were selected to have enough information to evaluate the disease rates in the COC in the context of a number of different settings. One comparison area is made up of ZIP Codes in County C that are similar to ZIP Codes A and B in population density and are thus likely to be similar in urban/rural characteristics. This comparison area is located in the same county as the COC; the community is likely to be familiar with the type of land use in this area. Two comparison areas (County C and New York State excluding New York City) are larger areas and represent the average health status. Of these two areas, however, County C is smaller and is in closer proximity to the COC than New York State excluding New York City.
For the population density comparison area, the ZIP Codes in County C with population density similar to ZIP Codes A and B were selected using the methods described in Section III.A.4 of the HOD Work Group Report. Four population density groupings were created, but the entire county is sparsely populated except for one village. ZIP Codes A and B were categorized by population density into Group III (38-68 people per square mile), which has the third highest population density of the four groups. The comparison area is made up of eight ZIP Codes in Group III with population density similar to that of ZIP Codes A and B, but does not include ZIP Codes A and B. Demographic data for ZIP Codes A and B and three comparison areas are shown in Table IV. The population density comparison area and County C are more similar to the COC than New York State excluding New York City in characteristics such as median household income and percentage minority.
Tabulation of asthma hospitalization data. ZIP Codes A and B are sparsely populated, with only 12 asthma hospitalizations among residents of all ages during the three-year period 2003-2005. In each age group, there were fewer than 10 hospitalizations. Therefore, asthma hospitalizations are not shown for age groups in Table V for ZIP Codes A and B. (In part C of Section II of the HOD Work Group Report, the problem of unstable rates due to small numbers of cases is discussed.) When health outcomes are displayed for small areas that are sparsely populated, or when the health outcome is rare, there are also confidentiality concerns. It is NYSDOH's policy not to show counts of 1 or 2 health outcomes in data tables for confidentiality reasons; the number is replaced with an asterisk in the tables. Therefore, it may be difficult to sum the number of cases of a particular outcome in a COC (study area) or comparison area made up of ZIP Codes because some of the numbers may be replaced by an asterisk. If this were to happen, to estimate the number of cases, we would replace the asterisks in the COC with the number 2 and the asterisks in the comparison area with the number 1. Doing so increases the chances of detecting a difference between the rate in the COC and the rate in the comparison area by maximizing the difference between the two areas. A consequence of this practice that we accept is that the rate of the health outcome in the COC may be overestimated.
In Table V asthma hospitalizations are presented for the total population of ZIP Codes A and B as well as for the three comparison areas. In order to compute a total number of asthma hospitalizations for the population density comparison area, we replaced the asterisks with the number 1. As can be seen in Table V, the asthma hospitalization rate in ZIP Codes A and B is lower than that in all three comparison areas. The CIs all contain the numeral 1, indicating that the difference between the rates may be due to random fluctuation. (See Example 1 and Appendix B for the methods used to calculate rates, ratios, CIs, and age-adjusted ratios.)
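The asterisk-replacement rule can be sketched as a small summation helper. The representation of a suppressed count as the string `"*"`, the function name, and the example counts are assumptions made for illustration.

```python
SUPPRESSED = "*"  # assumed marker for a suppressed count of 1 or 2


def total_cases(counts, is_study_area):
    """Sum ZIP Code counts, filling suppressed values with 2 (COC) or 1 (comparison area)."""
    fill = 2 if is_study_area else 1
    return sum(fill if c == SUPPRESSED else c for c in counts)


# Hypothetical per-ZIP-Code counts
coc_counts = [5, SUPPRESSED, 3]
comparison_counts = [12, SUPPRESSED, SUPPRESSED, 7]

print(total_cases(coc_counts, is_study_area=True))          # 10
print(total_cases(comparison_counts, is_study_area=False))  # 21
```

Filling the COC with the largest possible value and the comparison area with the smallest gives the greatest possible difference between the two areas, which is why the COC rate may be overestimated.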
The conclusion that can be drawn from Table V is that the asthma hospitalization rate for 2003-2005 in the COC did not differ significantly from the rate in an area in the county with similar population density, from the rate in the county as a whole, or from the rate in New York State excluding New York City.
Tabulation of cancer incidence data. As discussed in Example 1, observed and expected numbers of cancer cases are tabulated by ZIP Code for certain cancer sites at the NYSDOH web site. When there are too few cases to be shown for confidentiality reasons, the New York State Cancer Registry combines ZIP Codes with neighboring ZIP Codes and provides the data in maps and tables for the combined groups of ZIP Codes. In sparsely populated areas, ZIP Codes are frequently combined. The following convention will be followed when ZIP Codes have been combined. When a ZIP Code that has been designated for the study or comparison area includes an additional ZIP Code, data for that additional ZIP Code will also be included in the observed and expected tabulations. However, when a ZIP Code that has been designated for the study or comparison area has been incorporated into a ZIP Code that is out of the study or comparison area, then data from that ZIP Code will be excluded from the tabulations.
Information on female breast and male and female colorectal cancers is shown in Table VI. The study area and comparison areas presented in Table VI do not match exactly the areas for asthma hospitalizations in Table V. In the NYSDOH data tables, an additional ZIP Code has been combined with ZIP Codes A and B; therefore, the study area includes the COC (ZIP Codes A and B) and a third ZIP Code. Eight ZIP Codes were originally specified for the population density comparison area. One was dropped because it was combined in the NYSDOH cancer data tables with a ZIP Code not originally selected for the comparison area, and four were added because they were combined with a ZIP Code that was selected for the comparison area. There are a total of 11 ZIP Codes in the population density comparison area in Table VI.
As discussed in the previous example, Table VI is different from the table of asthma hospitalizations in that the cancer data for the study area (which includes ZIP Codes A and B) are not compared directly to the cancer data for the comparison areas. Instead, the expected number of cases is based on the cancer rate for New York State, i.e., the state becomes the comparison area for the study area and for the comparison areas. In Table VI, the ratio represents the ratio of the observed number of cases to the expected number of cases. Because the data have been age-adjusted, the ratios are SIRs. If the observed number were equal to the expected number, the SIR would be equal to 1. As discussed previously, the CI is a range around the ratio in which the true measurement lies with a certain degree of confidence, in this case 95%. When the CI excludes 1, there is a 95% chance that the difference between the rates is not due to random variation. See Appendix B for calculation of confidence intervals.
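The combination convention can be sketched as a function that resolves a designated list of ZIP Codes against the registry's combined groups. The data model is an assumption: each combined group is represented as a primary ZIP Code mapped to the ZIP Codes folded into it. All ZIP Code labels are hypothetical, chosen so that eight designated ZIP Codes lose one and gain four.

```python
def resolve_area(designated, combined_groups):
    """Apply the inclusion/exclusion convention for combined ZIP Codes.

    `combined_groups` maps a primary ZIP Code to the ZIP Codes reported with it
    (an assumed representation of the registry's combined groups).
    """
    absorbed_into = {
        member: primary
        for primary, members in combined_groups.items()
        for member in members
    }
    area = set()
    for zip_code in designated:
        primary = absorbed_into.get(zip_code)
        if primary is None:
            # Reported under its own ZIP Code: keep it plus anything folded into it
            area.add(zip_code)
            area.update(combined_groups.get(zip_code, []))
        # If folded into a designated primary, it is added with that primary.
        # If folded into a ZIP Code outside the area, it is excluded.
    return area


designated = {"Z1", "Z2", "Z3", "Z4", "Z5", "Z6", "Z7", "Z8"}
combined_groups = {
    "Z1": ["N1", "N2"],  # two outside ZIP Codes reported with Z1
    "Z2": ["N3", "N4"],  # two outside ZIP Codes reported with Z2
    "Q1": ["Z8"],        # Z8 is reported under Q1, which is outside the area
}
area = resolve_area(designated, combined_groups)
print(len(area), sorted(area))
```

With these hypothetical groups the resolved area contains 11 ZIP Codes (8 - 1 + 4).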
The SIRs in Table VI for the study area (ZIP Codes A and B plus another ZIP Code) compared with New York State show that the breast cancer rate in females was similar to that in New York State. The CI for this comparison includes 1, indicating that the difference could be due to random variation. There were very few colorectal cancers in males (4 cases) and females (3 cases) in the COC during 1999-2003. With such small numbers of cases, an increase or decrease of one or two cases per year can cause a dramatic fluctuation in the rate; thus, the rates are unstable and should be interpreted with caution. The CIs for these comparisons include the numeral 1, indicating that the differences could be due to random variability.
In the second part of Table VI, the observed numbers of female breast and male and female colorectal cancers in the population density comparison area are compared to the numbers expected based on the rate in New York State. For breast cancer in females, the rate in the population density comparison area was lower than the rate in New York State; the CI excludes 1, indicating that the difference is not likely due to random variation. For colorectal cancer in males and females, the rate ratios are close to 1 and the CIs include 1, indicating that the differences could be due to random variation.
Because of the small numbers of cancer cases in the study area, conclusions must be drawn with caution. The breast cancer rate in females for 1999-2003 in an area approximating the COC was not significantly different from that in New York State. There were too few cases of colorectal cancer in the COC to draw reliable conclusions. In an area of the county with similar population density, the breast cancer rate in females was lower than that in New York State, while the colorectal cancer rates in males and females were not significantly different from those in New York State.
References
- U.S. Centers for Disease Control and Prevention. National Surveillance for Asthma - United States, 1980-2004. MMWR Surveillance Summaries 56(8):1-54, 2007.
- Healthy People 2010. Respiratory Diseases.
^{*}Minority includes Hispanics, African-Americans, Asian-Americans, Pacific Islanders and Native Americans, and others.
^{1} U.S. Bureau of the Census. 2000 Census of population and housing summary file 1 (SF1). U.S. Department of Commerce. 2001.
^{2} U.S. Bureau of the Census. 2000 Census of population and housing summary file 3 (SF3). U.S. Department of Commerce. 2002.
Age group (years) | Part 4 (County Z): Hospitalizations | Part 4: 2004 Pop. | Part 4: Rate* | Part 4: Rate ratio^{T} (95% CI) | Part 5 (New York State excluding New York City): Hospitalizations | Part 5: 2004 Pop. | Part 5: Rate* | Part 5: Rate ratio^{T} (95% CI)
---|---|---|---|---|---|---|---|---
0 - 4 | 215 | 16,072 | 445.9 | 1.70 (1.01, 2.69) | 7,728 | 661,788 | 389.3 | 1.95 (1.16, 3.08)
0 - 17 | 368 | 63,770 | 192.4 | 2.54 (1.81, 3.47) | 13,581 | 2,638,173 | 171.6 | 2.85 (2.03, 3.89)
18 - 64 | 557 | 193,303 | 96.1 | 3.49 (2.67, 4.47) | 19,792 | 6,964,631 | 94.7 | 3.54 (2.71, 4.54)
65+ | 240 | 41,359 | 193.4 | 4.23 (2.68, 6.34) | 8,727 | 1,520,205 | 191.4 | 4.27 (2.71, 6.41)
TOTAL (0-65+ years) | 1,165 | 298,432 | 130.1 | 3.22 (2.67, 3.83)^{TT} | 42,100 | 11,123,009 | 126.2 | 3.39 (2.82, 4.04)^{TT}
Source: SPARCS, 2003-2005.
CI = confidence interval.
*Average annual rate of asthma hospitalizations per 100,000 population.
^{T}Rate in ZIP Code X is numerator; rate in comparison area is denominator.
^{TT}Age-adjusted standardized incidence ratio, using 3 age groups (0-17, 18-64, 65+ years).
Source: New York State Cancer Registry's Cancer Surveillance Improvement Initiative, 1999-2003.
CI = confidence interval.
*The cancer rate for the entire state of New York and the number of people in a ZIP Code are used to estimate the number of people in each ZIP Code that would be expected to develop cancer within the five-year period 1999-2003 if the ZIP Code had the same rate of cancer as the state.
^{T}Ratio of the number of cases observed to the number of cases expected. Because the data have been age-adjusted, the ratios are standardized incidence ratios (SIRs).
* Minority includes Hispanics, African-Americans, Asian-Americans, Pacific Islanders and Native Americans, and others.
^{1} U.S. Bureau of the Census. 2000 Census of population and housing summary file 1 (SF1). U.S. Department of Commerce. 2001.
^{2} U.S. Bureau of the Census. 2000 Census of population and housing summary file 3 (SF3). U.S. Department of Commerce. 2002.
NOTE. If there are 1 or 2 cases in a ZIP Code, NYSDOH suppresses the number of cases in data tables for confidentiality reasons and displays an asterisk. To compute the rate, for the study area each asterisk was replaced with 2 cases. For the population density comparison area, each asterisk was replaced with one case. This represents the greatest possible difference between the 2 areas.
Source: SPARCS, 2003-2005. 2004 Population from Claritas Corp.
CI = confidence interval.
*Average annual rate of asthma hospitalizations per 100,000 population.
^{T}Not age-adjusted. As compared with study area including ZIP Codes A and B.
Source: New York State Cancer Registry's Cancer Surveillance Improvement Initiative 1999-2003.
CI = confidence interval. If a ZIP code in the study or comparison area includes an additional ZIP code, then data from the additional ZIP code are also included in the observed and expected tabulations. However, if a study or comparison area ZIP code is included in another ZIP code that is out of the study or comparison area, then data from that ZIP code are excluded from the tabulations. See Guidance Document.
*The cancer rate for the entire state of New York and the number of people in a ZIP Code are used to estimate the number of people in each ZIP Code that would be expected to develop cancer within the five-year period 1999-2003 if the ZIP Code had the same rate of cancer as the state.
^{T}Ratio of the number of cases observed to the number of cases expected. Because the data have been age-adjusted, the ratios are standardized incidence ratios (SIRs).
^{TT}Ratios based on fewer than 10 cases; rates may be unstable and should be interpreted with caution.