Accuracy and economic evaluation of screening tests for undiagnosed COPD among hypertensive individuals in Brazil

Study design and population

We conducted a cross-sectional screening test accuracy study to assess six screening tests and their combinations for detecting COPD in Brazil. Study assessments were performed in nine basic health units (BHUs), eight urban and one rural, in the city of São Bernardo do Campo, São Paulo, Brazil.

Between February and October 2019, eligible patients aged ≥40 years with clinician-diagnosed hypertension who attended routine consultations at their registered basic health unit were invited to attend a separate study assessment. Patients were excluded if they were unable to perform spirometry (because of dementia, lack of teeth, lack of coordination or inability to form a good oral seal); had contraindications for spirometry (respiratory infection, bloody cough in the last month, severe angina, systolic blood pressure ≥220 mmHg or diastolic blood pressure ≥120 mmHg); had a history of tuberculosis, cardiac infarction, retinal detachment or surgery on the chest, abdomen, brain, ears or eyes in the last 3 months; or had a prior adverse reaction to salbutamol.

Study assessment

Participants provided informed consent before height and weight were measured and before any index or reference test was performed. They also completed a study questionnaire (Supplementary Methods 1) to provide information about demographics, smoking status, medical diagnoses, quality of life and respiratory symptoms. The questionnaire was developed in English, translated into Portuguese by a professional translator, and checked by Portuguese-speaking clinicians to ensure that the meaning of the questions was preserved. Where an existing validated Portuguese version of a questionnaire was available, as for CAPTURE, that version was used. Case report forms were used to capture study assessment data and all data were entered on a secure online REDCap database36,37. We used a paired design, whereby all participants performed the index and reference tests during a single study assessment visit.

Index tests

The index tests included peak flow (QVAR Mini Wright®, cut-point <350 l/min men, <250 l/min women)21, pre-bronchodilator microspirometry (Vitalograph-COPD6®, cut-point FEV1/FEV6 < 0.78)38, and four screening questionnaires, including COPD Diagnostic Questionnaire (CDQ, cut-point ≥20)22,39, CAPTURE (cut-point ≥ 2)21, COPD Screening Questionnaire (COPD-SQ, cut-point ≥ 16)27, and the symptom-based questionnaire (SBQ, cut-point ≥ 17)28. The selection of questionnaires maximized symptoms being assessed and minimized duplication of items, while allowing comparison of the most relevant questionnaires. The CDQ, CAPTURE, COPD-SQ, and SBQ items were included in the study questionnaire without repetition (Supplementary Methods 2); the full list of all four tools is available in Supplementary Methods 3.
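The cut-points above can be expressed as simple positivity rules. The following is a minimal sketch rather than the study's own code: the cut-points are taken from the text, whereas the function names, argument names and the "male"/"female" coding are hypothetical.

```python
# Cut-points are taken from the text; all names below are hypothetical.
QUESTIONNAIRE_CUTPOINTS = {"CDQ": 20, "CAPTURE": 2, "COPD-SQ": 16, "SBQ": 17}
PEAK_FLOW_CUTPOINTS_L_MIN = {"male": 350, "female": 250}
MICROSPIROMETRY_RATIO_CUTPOINT = 0.78


def peak_flow_positive(pef_l_min: float, sex: str) -> bool:
    """Positive when peak flow is below the sex-specific cut-point."""
    return pef_l_min < PEAK_FLOW_CUTPOINTS_L_MIN[sex]


def microspirometry_positive(fev1_l: float, fev6_l: float) -> bool:
    """Positive when pre-bronchodilator FEV1/FEV6 is below 0.78."""
    return fev1_l / fev6_l < MICROSPIROMETRY_RATIO_CUTPOINT


def questionnaire_positive(name: str, score: float) -> bool:
    """Positive when the score reaches the questionnaire's cut-point."""
    return score >= QUESTIONNAIRE_CUTPOINTS[name]
```

For example, a peak flow of 300 l/min would count as a positive screen for a man but not for a woman.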

Peak flow and microspirometry were conducted before questionnaires, with the order of the airflow measurement device alternating at each assessment to reduce learning bias. Trained researchers explained how to use both lung function tests, and participants performed three pre-bronchodilator maneuvers on each device. For each test, the best values from any of the three maneuvers were used for analysis.

Participants completed the questionnaires after receiving 400 µg of salbutamol. Questionnaires were intended to be self-completed, but researchers could assist if required.

Reference test

The reference test comprised post-bronchodilator quality diagnostic spirometry (ndd Easy On-PC) with clinical review to confirm COPD. Spirometry was administered by a second trained researcher who was unaware of the prior airflow measurement test results, between 20 and 60 min after bronchodilation, aiming for repeatability within 100 ml or 5%, within six efforts.

We assessed lung function with a spirometer that displayed and printed out the flow volume curve. The curves were classified according to the criteria of the ATS/ERS task force on standardization of lung function testing40. Tests with at least three curves meeting these criteria were graded “good”. “Usable” tests contained at least one curve that met the criteria, allowing accurate assessment of FEV1. If accurate assessment was not possible, the curves were classified as “unacceptable” and the test was excluded from analysis. All traces were over-read for quality by independent respiratory experts and graded according to standard criteria40, without knowledge of the index test results. Airflow obstruction was defined by the lower limit of normal (LLN) using Global Lung Function Initiative (GLI) equations.
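As an illustration of the grading rule, the sketch below maps the number of curves meeting the ATS/ERS criteria to a grade. It assumes that the count of acceptable curves is the only input; the over-reading by respiratory experts involved judgement that this does not capture, and the function name is hypothetical.

```python
def grade_spirometry(acceptable_curves: int) -> str:
    """Grade a test from the number of curves meeting the ATS/ERS criteria.

    Assumption: the grade depends only on this count, as described in the text.
    """
    if acceptable_curves >= 3:
        return "good"
    if acceptable_curves >= 1:
        return "usable"
    # No acceptable curve: the test is excluded from analysis.
    return "unacceptable"
```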

A pulmonologist conducted clinical reviews with all participants whose diagnostic spirometry was below the LLN. Patients whose post-bronchodilator FEV1 reversibility was ≥12% and >400 ml were classed as having asthma and were defined as reference test negative. Those with FEV1 reversibility ≥12% and between 200 and 400 ml, together with a history of asthma or allergies, were classed as having asthma/COPD overlap. All others reviewed by the pulmonologist were classed as having COPD alone. The latter two groups were defined as reference test positive. These thresholds were based on local clinical guidelines in Brazil and are in line with international diagnostic recommendations.
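The reference standard thresholds can be summarized as a decision rule. This sketch encodes only the numerical thresholds stated above; in the study the classification was made by a pulmonologist at clinical review. Treating the 200 and 400 ml boundaries as inclusive is an assumption, and all names are hypothetical.

```python
from typing import Tuple


def classify_reference(
    below_lln: bool,
    reversibility_pct: float,
    reversibility_ml: float,
    history_asthma_or_allergy: bool,
) -> Tuple[str, bool]:
    """Return (label, reference_test_positive) from the stated thresholds."""
    if not below_lln:
        # No airflow obstruction, so no clinical review and reference negative.
        return ("no obstruction", False)
    if reversibility_pct >= 12 and reversibility_ml > 400:
        return ("asthma", False)
    if (
        reversibility_pct >= 12
        and 200 <= reversibility_ml <= 400  # inclusive bounds are an assumption
        and history_asthma_or_allergy
    ):
        return ("asthma/COPD overlap", True)
    return ("COPD", True)
```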

Sample size

A pragmatic target of recruiting 120 patients per BHU was set, to obtain a total sample of approximately 1080 participants. Using the Alonzo method for paired test accuracy studies41, assuming independence of tests and a prevalence of 16%, we would have 85% power to detect a difference in sensitivity of 10% (95% vs. 85%) with 1040 participants. If the sensitivity of tests was slightly lower in this population (91% vs. 80%) we would have 80% power to detect this difference with the same sample size.

Statistical analysis

The diagnostic performance of each index test was investigated by presenting 2 × 2 tables and calculating the sensitivity, specificity, positive predictive value and negative predictive value with 95% confidence intervals. Comparative test accuracy was assessed by calculating the difference in sensitivity and specificity, presenting 95% confidence intervals and using McNemar’s test.
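To make these calculations concrete, the sketch below derives the four accuracy measures from a 2 × 2 table and computes an exact two-sided McNemar p-value from the discordant pairs. It is an illustration in Python, not the Stata code used in the study. The Wilson interval is an assumption, since the text does not state which confidence interval method was used, and all function names are hypothetical.

```python
from math import comb, sqrt


def wilson_ci(successes: int, n: int, z: float = 1.959964):
    """Wilson score 95% interval for a proportion (interval method assumed)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (centre - half, centre + half)


def accuracy(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Point estimate and interval for each measure from a 2 x 2 table."""
    return {
        "sensitivity": (tp / (tp + fn), wilson_ci(tp, tp + fn)),
        "specificity": (tn / (tn + fp), wilson_ci(tn, tn + fp)),
        "ppv": (tp / (tp + fp), wilson_ci(tp, tp + fp)),
        "npv": (tn / (tn + fn), wilson_ci(tn, tn + fn)),
    }


def mcnemar_exact(b: int, c: int) -> float:
    """Exact two-sided McNemar p-value.

    b and c are the discordant pair counts: test A positive with test B
    negative, and test A negative with test B positive.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

When two index tests are compared on sensitivity, the discordant pairs are counted among reference-positive participants only; for specificity, they are counted among reference-negative participants.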

The primary analysis compared the sensitivity and specificity of the CAPTURE screening questionnaire and the peak flow meter, as this combination was previously developed to be more relevant for low-resource settings. Secondary analyses compared the performance of all other individual index tests, as well as likely test combinations. The test combinations aimed to maximize specificity (positive result on both index tests), thereby limiting the number of people requiring diagnostic spirometry and improving future efficiency. Where it was necessary for our health economic analyses, test accuracy assessment was based on balancing test sensitivity and specificity, although within the context of the low-resource setting in Brazil, specificity was prioritized where the balance was not clear cut. All analyses were conducted in Stata v16 (Windows Stata—Stata Corp LLC™).
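The effect of requiring a positive result on both index tests can be illustrated as follows. The data are invented toy values, not study results, and the function names are hypothetical.

```python
def both_positive(test_a, test_b):
    """Combined result is positive only when both index tests are positive."""
    return [a and b for a, b in zip(test_a, test_b)]


def sensitivity_specificity(index, reference):
    """Sensitivity and specificity of an index test against the reference."""
    tp = sum(i and r for i, r in zip(index, reference))
    fn = sum((not i) and r for i, r in zip(index, reference))
    tn = sum((not i) and (not r) for i, r in zip(index, reference))
    fp = sum(i and (not r) for i, r in zip(index, reference))
    return tp / (tp + fn), tn / (tn + fp)


# Invented toy data for eight participants (three reference positive).
reference = [True, True, True, False, False, False, False, False]
capture = [True, True, False, True, True, False, False, False]
peak_flow = [True, False, True, True, False, True, False, False]

combined = both_positive(capture, peak_flow)
single = sensitivity_specificity(capture, reference)
paired = sensitivity_specificity(combined, reference)
```

In this toy example the combination has higher specificity than the questionnaire alone, at the cost of lower sensitivity, which is the trade-off the combinations were designed to make.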

Economic analysis

We conducted a cost-effectiveness analysis to calculate the cost per additional true case detected. The combinations were ordered by the number of true cases detected, from least to greatest, and the principle of dominance was applied to eliminate redundant combinations from the analysis (those that were more costly and less effective). Each remaining combination was then compared with the next best alternative. For the purpose of this paper, we compared test combinations rather than individual screening tests.
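The ordering, dominance and incremental comparison steps can be sketched as below. This is a simplified illustration that applies only the dominance rule stated above (more costly and less effective); it does not consider extended dominance, and the class, field and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Strategy:
    name: str
    total_cost: float
    true_cases: float


def remove_dominated(strategies: List[Strategy]) -> List[Strategy]:
    """Drop strategies that are more costly and less effective than another,
    then order the rest by true cases detected, from least to greatest."""
    kept = [
        s
        for s in strategies
        if not any(
            o.total_cost < s.total_cost and o.true_cases > s.true_cases
            for o in strategies
        )
    ]
    return sorted(kept, key=lambda s: (s.true_cases, s.total_cost))


def incremental_cost_per_case(
    strategies: List[Strategy],
) -> List[Tuple[str, str, Optional[float]]]:
    """Cost per additional true case versus the next best alternative."""
    frontier = remove_dominated(strategies)
    results = []
    for prev, cur in zip(frontier, frontier[1:]):
        extra_cases = cur.true_cases - prev.true_cases
        extra_cost = cur.total_cost - prev.total_cost
        # Ratio is undefined when no additional case is detected.
        ratio = extra_cost / extra_cases if extra_cases > 0 else None
        results.append((cur.name, prev.name, ratio))
    return results
```

For instance, with invented strategies A (cost 100, 10 cases), B (cost 150, 8 cases) and C (cost 200, 15 cases), B is dominated by A, and C costs 20 per additional true case detected compared with A.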

The unit costs and quantity of any equipment, medication and consumables required, staff time (and salary costs) to deliver each individual test and use of facilities were determined to calculate the health-care costs of delivering each screening combination. Each individual test was timed at a sample of assessment clinics to estimate an overall mean time and range for each test. Equipment costs were depreciated (at 3.5% a year) over the estimated lifespan of the equipment (ranging from 1 to 5 years). Cost per patient visit was calculated assuming the equipment would be used for 750 patients per clinic per year. It was also assumed that true and false positive cases would require GP reassessment, confirmation with quality diagnostic spirometry (assuming 1,000 patients/year) and a clinical review with a pulmonologist (for true positives). Costs were calculated in UK£ for a price year of 2019 and converted to Brazilian Real (R$) using Purchasing Power Parities (PPP)42 with a conversion rate of 3.29 (Supplementary Methods 4).
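The equipment costing and currency conversion can be sketched as follows. The 3.5% rate, the 750 patients per year and the PPP rate of 3.29 come from the text. The annuity formula for spreading the purchase price over the lifespan is an assumption, since the exact depreciation formula is not given here, and the function names are hypothetical.

```python
def equivalent_annual_cost(
    purchase_price: float, lifespan_years: int, rate: float = 0.035
) -> float:
    """Spread a purchase price over its lifespan with a standard annuity
    factor (assumed formula; payments at the end of each year)."""
    annuity_factor = (1 - (1 + rate) ** -lifespan_years) / rate
    return purchase_price / annuity_factor


def equipment_cost_per_patient(
    purchase_price: float,
    lifespan_years: int,
    patients_per_year: int = 750,
    rate: float = 0.035,
) -> float:
    """Annualized equipment cost divided by yearly patient throughput."""
    return equivalent_annual_cost(purchase_price, lifespan_years, rate) / patients_per_year


def gbp_to_brl_ppp(cost_gbp: float, ppp_rate: float = 3.29) -> float:
    """Convert a 2019 UK£ cost to R$ with the stated PPP conversion rate."""
    return cost_gbp * ppp_rate
```

Under these assumptions, a device with a 1-year lifespan has an equivalent annual cost 3.5% above its purchase price, and a cost of £10 converts to R$32.90.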

The paper was written according to the STARD guidance43 for reporting studies of diagnostic accuracy.

Ethics

This study was carried out in accordance with good clinical practice and was approved by the Ethics Committees of the ABC Medical School, São Paulo, Brazil on February 4, 2019 (no. 3.131.048) and the University of Birmingham, Birmingham, UK (ERN_18-1185).

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.
