Using capture‐recapture methods to estimate influenza hospitalization incidence rates

1 INTRODUCTION

Policy makers and planners need accurate estimates of the incidence or prevalence of diseases and health conditions to anticipate, prevent, and mitigate the effects of those diseases. The decentralized nature of U.S. healthcare makes overall population burden estimates difficult to calculate. Thus, policy makers must rely on population sampling for estimates, leaving true population burden unknown. The accuracy of population estimates depends largely upon the quality of sampling, which in turn is dependent upon many factors including consistency of the population being sampled across all capture occasions,1 and affected individuals having contact with the healthcare system to allow for enumeration. Cases of unreported disease are more difficult to account for.

Statistical methods to improve population disease incidence and prevalence detection include the capture–recapture (C-R) method. The Lincoln–Petersen (Petersen) method was the earliest C-R method for estimating population size. It was developed for field studies of animals in which only a sample of the population can be caught and marked (captured). Frequently, animals are re-caught (recaptured), and this method allows for recaptured animals to improve the population estimate.2 In health-related research, C-R uses the overlap of subjects from two or more data sources and log-linear methods to more accurately estimate true population disease burden.3 C-R has the advantage of using both research and clinical databases to measure the same data, thereby creating a fuller perspective; however, the best method to adjust for complex denominators in urban areas with competing health systems is not straightforward within C-R.

The Centers for Disease Control and Prevention (CDC) has developed methods to estimate population influenza burden (L Kim, personal communication, 2020) that account for some of the complexities of multicenter study design and the incomplete nature of surveillance data of the Hospitalized Adult Influenza Vaccine Effectiveness Network (HAIVEN) study. The current study used available data from a single health system, HAIVEN methods and adjustments, and C-R calculations, to estimate adult influenza hospitalization burden in Allegheny County in Southwestern Pennsylvania.

2 METHODS

The study was approved by the University of Pittsburgh IRB and consists of a three-phase analytic plan. The phases are as follows: (1) C-R to more accurately estimate influenza cases; (2) statistical analyses for population burden based on the CDC HAIVEN network methods (Lindsay Kim, MD, personal communication); and (3) an adjustment to this resultant population burden using C-R incidence estimates. The HAIVEN population burden methods may not account for the richness of the clinical virology data available in our particular locale. Our adjustment to the HAIVEN methods was intended to capitalize on both the robust nature of the HAIVEN methods and the richness of the local virology data.

2.1 Phase 1: Statistical analyses for C-R

Data used for this analysis were collected from two sources: (1) the local health system's clinical surveillance software system (Theradoc®), which extracts virology test results from the electronic medical record (EMR); and (2) research data from selected hospitals participating in the HAIVEN study.

An IRB-approved honest broker extracted a data list from Theradoc® of Allegheny County residents who received an inpatient clinical respiratory viral panel (RVP) test at two to five (depending upon the season) general acute care hospitals in the health system during the study period that included the 2015–2016 through 2018–2019 influenza seasons. This list also contained basic demographic data of race, sex, and age and is henceforth called the “clinical” database. The clinical database was prepared for analysis by limiting it to data from hospitals in which research enrollments were taking place for each influenza season. For example, in 2015–2016, two hospitals were enrolling participants, whereas in 2018–2019, there were five participating hospitals. Secondly, data were limited to the periods during which research enrollments were taking place. Thirdly, patients <18 years of age were eliminated. Finally, patients were separated into influenza cases and non-cases. The “research” database was derived from adults who were recruited from the hospitals during the 2015–2016 through 2018–2019 influenza seasons for the HAIVEN study that only included inpatients ≥18 years of age. Detailed study methods for the HAIVEN study have been described elsewhere.4 Briefly, patients aged ≥18 years admitted with an acute respiratory infection (ARI) including cough or worsening symptoms of a respiratory illness beginning within 10 days were enrolled. Patients who had been enrolled in the prior 14 days were ineligible. Following informed consent, study staff collected respiratory specimens (nasal and throat swabs from patients) for influenza virus testing (including virus type and subtype) by reverse-transcription polymerase chain reaction (RT-PCR) or used results from a clinical RVP test, if available. Demographic data were obtained from interview. 
Vaccination status was based on documented receipt of each year's influenza vaccine from the local electronic health record and/or the Pennsylvania Statewide Immunization Information System (PA-SIIS).

The following variables were used in the calculations:

M = number of cases identified in the clinical database;

n = number of cases identified in the research database;

m = number of cases identified in both databases (matched);

N1 = number of cases reported only in the clinical database;

N2 = number of cases reported only in the research database;

X = number of cases missing/not captured in either database.

Summary statistics of the demographic and clinical characteristics were determined for the patients found in the matched database. The number of observed influenza cases in the two databases and Petersen's C-R method were used to estimate influenza incidence ($\hat{N}$; Equation 1).5 The variance and 95% confidence intervals (CIs) of the C-R estimates were calculated using Equations 2 and 3.

The C-R calculations were made under several assumptions. First, the population is assumed to be closed; that is, there was no outmigration or loss to follow-up, because the capture and recapture would usually have occurred during the same hospitalization. Completeness of reporting by the two sources of the C-R method was determined by calculating the number of missing cases, X:

$$X = \hat{N} - (m + N_1 + N_2) \tag{4}$$

An example of a C-R estimate is shown in Table S1.
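As a rough illustration of the two-source calculation, the sketch below applies the standard Lincoln–Petersen estimator with a normal-approximation confidence interval. The source formulas are not reproduced in this text, so the estimator, the variance expression, and the function names (`petersen_estimate`, `chapman_estimate`) are assumptions, not a transcription of the authors' Equations 1–3. Chapman's bias-corrected variant is included for comparison only.

```python
import math


def petersen_estimate(M, n, m, z=1.96):
    """Lincoln-Petersen estimate of total cases from two overlapping sources.

    M: cases in the clinical database, n: cases in the research database,
    m: cases found in both (matched). The variance formula is the usual
    large-sample approximation and is an assumption, not the source's.
    """
    if m <= 0:
        raise ValueError("at least one matched case is required")
    n_hat = M * n / m
    variance = M * n * (M - m) * (n - m) / m**3
    half_width = z * math.sqrt(variance)
    observed = M + n - m  # same as m + N1 + N2
    return {
        "estimate": n_hat,
        "variance": variance,
        "ci": (n_hat - half_width, n_hat + half_width),
        "observed": observed,
        "missing": n_hat - observed,  # X, cases captured by neither source
    }


def chapman_estimate(M, n, m):
    """Chapman's bias-corrected variant, shown only for comparison."""
    return (M + 1) * (n + 1) / (m + 1) - 1


if __name__ == "__main__":
    # 3-year totals from Table 2: M = 308, n = 313, m = 284
    result = petersen_estimate(M=308, n=313, m=284)
    print(result)
    print(chapman_estimate(308, 313, 284))
```

With the 3-year totals from Table 2, both variants give a point estimate that rounds to 339, matching the tabulated value, so the table alone does not reveal which form the authors used.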

Secondly, it is assumed that the populations are homogeneous; that is, each hospitalized patient has the same and constant probability of being captured by any combination of the databases. Thirdly, it is assumed that the clinical database and the HAIVEN research database are independent of each other. That is, the population estimate assumes that the probability of being captured by one source does not affect the probability of being captured by the other source.6 Independence can be tested by calculating the probability of influenza-positive patients being captured by both databases. If that probability is equal to the product of the marginal probabilities of influenza-positive patients being captured by the clinical and research databases, then the samples are independent:

$$P(\text{clinical} \cap \text{research}) = P(\text{clinical}) \times P(\text{research}) \tag{5}$$
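The independence condition described above can be checked numerically, as in the minimal sketch below. The text does not state which denominator the probabilities were computed over, so `n_total` is a hypothetical parameter, and the counts in the example are illustrative values rather than study data.

```python
def independence_gap(n_both, n_clinical, n_research, n_total):
    """Compare the joint capture probability with the product of the marginals.

    n_total is the reference population over which probabilities are taken;
    its definition is an assumption, since the source does not specify it.
    Returns (joint probability, product of marginals, difference).
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    p_both = n_both / n_total
    p_clinical = n_clinical / n_total
    p_research = n_research / n_total
    expected_if_independent = p_clinical * p_research
    return p_both, expected_if_independent, p_both - expected_if_independent


if __name__ == "__main__":
    # Illustrative counts only (not taken from the study)
    joint, expected, gap = independence_gap(
        n_both=20, n_clinical=200, n_research=100, n_total=1000
    )
    print(joint, expected, gap)
```

A difference near zero is what independence would predict; a large positive difference would suggest that capture by one source makes capture by the other more likely.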

Independence was tested for the 3-year total samples and the 15 subpopulations derived by stratifying on demographic factors (age, sex, and race), influenza season, vaccination status, and prior vaccination status (Table S2).

2.2 Phase 2: Statistical analyses for population burden based on HAIVEN methods

The Pennsylvania Health Care Cost Containment Council (PHC4) provided data for ARI-specific hospitalizations based on CDC ARI ICD codes in all county hospitals for all 4 years of the study.

The HAIVEN methods were used to calculate disease burden estimates (Equation 6).

The proportion of all ARI cases in all county hospitals represented by the study-specific hospitals was determined using quarterly data from PHC4.

r = cases identified through research enrollment (research cases) among Allegheny County residents.

V1 = number of ARI hospitalizations among county residents who were enrolled in the research database.

V2 = number of ARI hospitalizations in study-specific hospitals during influenza months among Allegheny County residents in PHC4 database.

V3 = number of ARI hospitalizations in all county hospitals during the same time period among Allegheny County residents in PHC4 database; rationale is that both V2 and V3 should come from the same database.

V4 = number of influenza cases in study-specific hospitals during research enrollment period from clinical database.

V5 = number of influenza cases in study-specific hospitals over the entire year from clinical database; rationale is that both V4 and V5 should come from the same database.

$$\text{Influenza burden per 100,000} = \frac{r \times \dfrac{V_2}{V_1} \times \dfrac{V_3}{V_2} \times \dfrac{V_5}{V_4}}{\text{county adult population}} \times 100{,}000 \tag{7}$$

2.3 Phase 3: Combination of C-R and HAIVEN methods for adjusted population burden

To incorporate the C-R method into the HAIVEN methods, accounting for cases estimated by C-R but not due to the enrollment fraction, a modification of Equation 7 was used (Equation 8).

For influenza burden calculations by race, the county population used as the denominator was the total adult population of the county multiplied by 0.78 for Whites and 0.13 for Blacks, representing their relative proportions of the population. Data were analyzed using SAS version 9.4 (SAS Institute, Cary, NC, USA).
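The burden calculation can be sketched as follows, assuming Equation 7 scales the research case count r by the ratios V2/V1, V3/V2, and V5/V4 and divides by the county adult population. That form is inferred from the variable definitions and is not a confirmed transcription. The `extra_cases` argument is one hypothetical reading of the Phase 3 adjustment (adding the C-R missing cases X to r), and the population value in the example is a placeholder, not the county's actual figure.

```python
def extrapolated_cases(r, v1, v2, v3, v4, v5, extra_cases=0.0):
    """Scale research cases up to county-wide, full-year influenza cases.

    v2 / v1: enrolled ARI -> all ARI in study hospitals
    v3 / v2: study hospitals -> all county hospitals
    v5 / v4: enrollment period -> entire year
    extra_cases is a hypothetical hook for C-R missing cases (assumption).
    """
    return (r + extra_cases) * (v2 / v1) * (v3 / v2) * (v5 / v4)


def rate_per_100k(cases, adult_population, population_share=1.0):
    """Cases per 100,000 adults; population_share is e.g. 0.78 or 0.13 by race."""
    return cases / (adult_population * population_share) * 100_000


if __name__ == "__main__":
    ASSUMED_ADULT_POPULATION = 1_000_000  # placeholder, not the study's value

    # 2016-2017 overall row of Table 3
    overall = extrapolated_cases(r=107, v1=665, v2=4439, v3=16171, v4=289, v5=316)
    print(overall, rate_per_100k(overall, ASSUMED_ADULT_POPULATION))

    # 2016-2017 White row of Table 3, using the 0.78 population share
    white = extrapolated_cases(r=77, v1=481, v2=3323, v3=14071, v4=215, v5=236)
    print(white, rate_per_100k(white, ASSUMED_ADULT_POPULATION, population_share=0.78))
```

Because V2 appears in both the numerator and the denominator, the product reduces to r × (V3/V1) × (V5/V4); the expanded form is kept here so that each adjustment factor stays visible.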

3 RESULTS

The viral test result analytic databases are shown in Figure 1. The clinical database consisted of 8,994 patients, of whom 7,684 were unmatched; the research database consisted of 2,154 patients, of whom 844 were unmatched; and 1,310 patients were found in both databases (matched). Demographic characteristics of the patients found in the matched database are shown in Table 1. The largest age group was 50–64 years (34.3%), with just under one quarter each aged 65–74 years and ≥75 years, and about one fifth aged 18–49 years. The patients were predominantly White (64.7%), female (62.9%), and vaccinated ≥14 days prior to illness onset (58.5%). Half of them had been vaccinated in the previous season, and 25.7% were influenza cases.

FIGURE 1. Flow chart for clinical and research databases in study-specific hospitals resulting in the final analytic database, including influenza status

TABLE 1. Demographic characteristics of patients identified in the matched database (N = 1,310)

Variable                                        n (%)
Age group
  18–49 years                                   249 (19.0)
  50–64 years                                   449 (34.3)
  65–74 years                                   311 (23.7)
  75+ years                                     301 (23.0)
Race
  White                                         848 (64.7)
  Black                                         421 (32.1)
  Other, unknown                                41 (3.2)
Sex
  Female                                        824 (62.9)
  Male                                          486 (37.1)
Season
  2016–2017                                     356 (27.2)
  2017–2018                                     449 (34.3)
  2018–2019                                     505 (38.5)
Vaccination status
  Unvaccinated                                  447 (34.1)
  Vaccinated ≥14 days prior to illness onset    766 (58.5)
  Vaccinated <14 days prior to illness onset    97 (7.4)
Prior year vaccination (total for all seasons)
  No                                            645 (49.2)
  Yes                                           665 (50.8)
Influenza status
  No                                            973 (74.3)
  Yes                                           337 (25.7)

Table 2 shows the observed cases among persons hospitalized with a cough illness in the clinical, research and matched databases, and C-R estimated influenza hospitalizations over all seasons, by season and by other factors.

TABLE 2. Estimated population influenza hospitalizations using the capture-recapture method

                        Observed influenza cases                                       C-R estimated influenza
Variable                Clinical (M)   Research (n)   Matched (m)   Total (m + N1 + N2)   cases (95% CI)a
3-year total            308            313            284           337                   339 (336, 342)
Age group
  18–49 years           55             54             50            59                    59 (58, 60)
  50–64 years           94             95             85            104                   105 (103, 107)
  65–74 years           82             82             75            89                    90 (88, 92)
  75+ years             77             82             74            85                    85 (84, 86)
Raceb
  White                 191            199            179           211                   212 (210, 214)
  Black                 103            100            92            111                   112 (110, 114)
Sex
  Female                191            195            175           211                   213 (210, 216)
  Male                  117            118            109           126                   127 (125, 129)
Season
  2016–2017             84             82             77            89                    89 (88, 90)
  2017–2018             139            138            127           150                   151 (149, 153)
  2018–2019             85             93             80            98                    99 (97, 101)
Vaccination statusb
  Unvaccinated          124            124            111           137                   139 (136, 142)
  Vaccinated            165            169            155           179                   180 (178, 182)
Prior vaccination
  No                    167            167            153           181                   182 (180, 184)
  Yes                   141            146            131           156                   157 (155, 159)

Note. N1 = number of cases reported only in the clinical database; N2 = number of cases reported only in the research database.
a Calculated using Equations 1–3.
b Sum ≠ 337 due to missing data.

The HAIVEN population influenza burden estimates (using Equation 7) and the HAIVEN + C-R population influenza burden estimates (using Equation 8) were calculated using the values shown in Table 2. Over all three influenza seasons, the average annual incidence rates for hospitalized influenza in the research hospitals were 307/100,000 (HAIVEN) and 309/100,000 (HAIVEN + C-R) (Table 3). The lowest rates over all seasons were 17/100,000 (for both HAIVEN and HAIVEN + C-R) among 18–49 year olds. The highest seasonal rates for the entire adult population were 494/100,000 (HAIVEN) and 497/100,000 (HAIVEN + C-R), in 2017–2018, an especially severe influenza season. Although there were two to three times as many cases among Whites as among Blacks, the influenza burden per 100,000 population was 10%–30% higher among Whites than among Blacks.

TABLE 3. Influenza hospitalization estimates per 100,000 Allegheny County adult population using CDC HAIVEN and capture-recapture methods

Adjustment factors:
r = influenza cases in study-specific hospitals among Allegheny County residents (source = research database)
V1 = ARI cases in study-specific hospitals among Allegheny County residents (source = research database)
V2 = ARI cases in study-specific hospitals during influenza months among Allegheny County residents (source = PHC4)
V3 = ARI cases in all county hospitals during influenza months among Allegheny County residents (source = PHC4)
V4 = influenza cases in study-specific hospitals during research enrollment period (source = clinical database)
V5 = influenza cases in study-specific hospitals over the entire year (source = clinical database)

Influenza burden:
Eq. 7 = cases/100 K Allegheny County adult population (based on Equation 7)
Eq. 8 = cases/100 K Allegheny County adult population (based on Equation 8 and using C-R estimates)

Variable           r      V1      V2       V3       V4      V5      Eq. 7   Eq. 8
3 years total      385    2,154   13,863   46,450   1,345   1,487   921     927
Annual average                                                      307     309
2016–2017
  Overall          107    665     4,439    16,171   289     316     286     286
  18–49 years      18     125     656      1,950    63      65      29      29
  50–64 years      27     235     1,154    3,851    58      66      51      51
  65–74 years      31     153     1,016    3,458    59      68      81      81
  75+ years        31     152     1,613    6,912    109     117     152     152
  White            77     481     3,323    14,071   215     236     318     318
  Black            26     166     1,082    2,066    68      75      276     276
2017–2018
  Overall          168    656     4,135    16,353   539     633     494     497
  18–49 years      35     142     612      1,955    126     143     55      55
  50–64 years      48     213     1,039    3,838    121     153     110     110
  65–74 years      42     157     997      3,644    108     124     112     110
  75+ years        43     144     1,487    6,916    184     213     240     240
  White            103    398     3,051    13,899   361     426     546     552
  Black            57     241     1,084    2,324    159     189     504     504
2018–2019
  Overall          110    833     5,289    13,926   517     538     192     194
