Stability of healthcare quality measures for maternal and child services: Analysis of the continuous service provision assessment of health facilities in Senegal, 2012–2018

INTRODUCTION

The premise of universal health coverage (UHC), enshrined in the Sustainable Development Goals and adopted as national policy by many countries, is essential health service delivery with financial protection. Health systems must be high quality for service coverage to yield better population health outcomes [1, 2]. Given the many demands on health systems even before the ongoing COVID-19 pandemic, measurement of quality to monitor service delivery and inform health system interventions should be streamlined to optimise utility for policy while minimising cost and burden [3]. For instance, information intended to prompt immediate action, such as a medication stock-out, should be assessed and transmitted in real time; data intended for periodic benchmarking and comparison should reliably distinguish between levels of quality for the timespan of intended use [4]. Unfortunately, quality measurement at present is fragmented, with assessment tools that are poorly fit for purpose [5, 6].

Health facility assessments are infrequent but typically detailed surveys of service availability, readiness and quality of care that can provide unique value beyond more frequent methods such as routine health information systems [7]. In low- and middle-income countries, such assessments have been used to benchmark health service availability and readiness [8], to compare quality of care across countries [9], to identify better and worse performers [10] and, increasingly, to assess effective coverage (the “fraction of potential health gain actually delivered through the health system to the population in need” [11]) as a composite metric of health system performance [12-14]. These uses often extend to years past the date of data collection. In particular, estimates of effective coverage may be calculated from health facility assessments and population data measured at different points in time; limited empirical evidence is available on the longevity of information from health facility assessments to support this practice [15]. Use of quality measures for description and benchmarking, on their own or as part of effective coverage measures, requires stability in the actual value of the measure from the time of measurement to when inference is made; use for identification of better or worse performers relies on stability in classifications over time. If such information is relatively consistent over time, health facility assessments conducted sporadically would remain useful. In contrast, instability in such measures would imply that the data must be used rapidly to retain value. Stability assessments testing the consistency of a measure over repeated application have been conducted for individual and facility-level quality of care metrics in the United States [16-19]. A recent study in South Africa found that laboratory measures of HIV care outcomes showed fairly high reliability year to year across all facility types [20]. Similar methods have yet to be applied to nationally representative health facility assessments.

Measurement stability may vary by type of quality measure. Health facility assessments typically produce measures of structure, or inputs to care, and those with clinical observations also generate measures of processes of care [21, 22]. While the process of care is recognised as more proximal to patient outcomes than inputs alone [5, 23, 24], measuring processes is more complex than itemising inputs. Direct observation requires selection and observation of patients, resulting in sampling error and potentially observer bias [25, 26]. Given calls to increase the use of process measures [5, 6, 24], understanding the stability of both types of measures can inform assessment approaches.

The state of health system measurement in Senegal provides a unique opportunity to assess stability of quality measures. A health facility assessment has been conducted annually in Senegal since late 2012, capturing standardised information on a representative sample of the entire health system. At the same time, the government of Senegal has embedded the steps towards UHC in the Plan National de Développement Sanitaire et Social (PNDSS) 2019–2028 [27, 28] and is actively working to take a health systems approach, including a shift away from donor-supported community campaigns towards facility-based services [29]. Maternal and child health services are a priority, with efforts to improve newborn and under-5 survival emphasised given limited gains in these metrics since 2013 [27].

The aims of this analysis are to define structural and process measures relevant to quality of care for maternal and child health across the continuous Service Provision Assessment (SPA) surveys, to assess the stability of such measures at the facility and sub-national levels based on assessments conducted 2 years apart, and to quantify the effect of any instability on effective coverage measures.

METHODS

Data sources

We used data from the Service Provision Assessment (SPA), a standardised cross-sectional survey designed to measure health service availability, readiness and quality [30]. The SPA is implemented in Senegal by the National Statistics and Demographics Agency (ANSD in French) and the Demographic and Health Surveys (DHS) Program [31]. We pooled data from 6 surveys conducted from 2012/2013 to 2018 (record checklist available in Appendix 1).

We used the household surveys conducted annually by ANSD and the DHS Program from 2015 to 2019 to define the populations in need and utilisation of care as part of the calculations of quality-adjusted coverage [32].

Sampling

Health services in Senegal are organised within the 14 regions and are offered by hospitals, health centres, health posts staffed by 2 salaried providers and health huts staffed by volunteer community health workers [29]. For the continuous SPA, a master facility list including 1578 hospitals, health centres and health posts was used to select a nationally representative sample of facilities stratified by region and facility type for each of the first 4 years of the survey. Sampling fractions were 50% for hospitals and health centres and 20% for health posts each year; hospitals and health centres not sampled in years 1 and 3 would be sampled in years 2 and 4. Survey implementers thus aimed to resample 100% of hospitals and health centres as well as 30% of the health posts every other year. In 2017, the sampling frame was updated, and subsequent samples were drawn independently. ANSD provided the sampling frames with facility IDs to identify facilities assessed more than once. For 2017, only facility name was available: some facilities assessed in this year may not have been linked due to differences in recorded facility name. Our analytic dataset consists of all pairs of assessments matched for the same facility 2 years apart. We used the classification from the sampling frame in 2 cases where facility tier conflicted in the survey data.
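To illustrate how such an analytic dataset of paired assessments can be constructed, a minimal R sketch is shown below (analyses for this study were conducted in R and Stata). The data frame and variable names (facility_id, survey_year, readiness) are hypothetical and do not correspond to actual SPA variables; the real linkage used the facility IDs from the sampling frames as described above.

```r
# Minimal sketch (not the authors' code): pair assessments of the same facility
# conducted exactly 2 years apart. All variable names and values are illustrative.
assessments <- data.frame(
  facility_id = c(1, 1, 2, 2, 3, 4, 4),
  survey_year = c(2014, 2016, 2015, 2017, 2016, 2013, 2015),
  readiness   = c(0.62, 0.70, 0.55, 0.58, 0.80, 0.45, 0.50)
)

# Self-join on the facility identifier, then keep only pairs exactly 2 years apart
pairs <- merge(assessments, assessments, by = "facility_id",
               suffixes = c("_wave1", "_wave2"))
pairs <- pairs[pairs$survey_year_wave2 - pairs$survey_year_wave1 == 2, ]
pairs
```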

Patients were selected for observation using systematic random sampling from a list of those presenting for services on the day of the visit. Survey data include sampling weights for each facility, provider and client visit.

The DHS household surveys used multistage sampling, first selecting census districts by probability proportional to size and then households with systematic random sampling. Women 15 to 49 years old were approached and interviewed [33]. These surveys are intended to provide representative estimates within 4 major sub-national regions: West (Dakar and Thiès regions), Centre (Diourbel, Fatick, Kaffrine, Kaolack regions), North (Louga, Matam, and Saint-Louis regions) and South (Kédougou, Kolda, Sédhiou, Tambacounda, and Ziguinchor regions).

Measures

We included SPA data from the facility audit of service availability and readiness, from provider interviews, and from direct observations of curative consultations for children under 5 years (all survey waves) as well as of antenatal care (ANC) visits (conducted only during the 2014, 2016, and 2018 surveys). We extracted covariates to characterise the sample and consider continuity of providers within facilities over time: location, facility tier, ownership type, number and tenure of staff providing care for the maternal and child health (MCH) services of interest (ANC, delivery and/or curative care for sick children) and number of observations conducted during the assessment. Provider interview data were weighted using within-facility sampling weights when summarised to the facility level.

We analysed measures of structural and process quality for maternal and child health services. For structural quality, we defined service readiness scores based on the WHO Service Availability and Readiness Assessment [34] for general services (48 items) and for ANC, basic obstetric care and care for children (9, 19 and 18 items respectively; Table S1). These scores cover core domains of readiness: infrastructure, basic equipment, infection prevention, diagnostics and essential medications for general service readiness, and staff and guidelines, basic equipment, diagnostics (as applicable) and medications for each service.
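As an illustration of this type of scoring, the R sketch below computes a readiness score as the share of tracer items present, overall and by domain. The item list here is invented for the example; the actual items follow the WHO SARA tool and are listed in Table S1.

```r
# Minimal sketch: a SARA-style readiness score as the proportion of tracer items present.
# Items and domains are illustrative, not the actual SPA item list.
items <- data.frame(
  item      = c("water", "electricity", "gloves", "scale", "amoxicillin", "ors"),
  domain    = c("infrastructure", "infrastructure", "infection_prevention",
                "equipment", "medications", "medications"),
  available = c(1, 0, 1, 1, 0, 1)
)

# Overall readiness: proportion of items available (score from 0 to 1)
readiness_overall <- mean(items$available)

# Domain-level readiness, useful for inspecting which domains drive the score
readiness_by_domain <- tapply(items$available, items$domain, mean)

readiness_overall
readiness_by_domain
```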

For process quality, we defined adherence to global guidelines: the 2016 WHO-focused ANC model [35] and the 2014 chart book for Integrated Management of Childhood Illness [36]. Visits were scored as the proportion of actions completed out of items assessed (27 for ANC and 25 for child visits, Table S1; actions in ANC follow-up visits were weighted 1/3, 2/3 or 1 based on recommended frequency for ANC visits 2 through 4). Facilities were assigned the average score of visits observed for each service, weighted by the within-facility sampling weight. For delivery care, we used proportion of the 7 signal functions for Basic Emergency Obstetric and Newborn Care (BEmONC) reported as performed in the prior 3 months. We considered each quality measure as a continuous score from 0 to 1. To assess stability of classifications of better, average and worse performers, we classified scores into tertiles of the observed range separately for each year of the paired assessments of the same facilities.
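A minimal R sketch of this scoring, using hypothetical visit-level data, is shown below. For brevity it omits the down-weighting of ANC follow-up items and reads "tertiles of the observed range" as equal-width thirds between the minimum and maximum observed score.

```r
# Minimal sketch: score each observed visit as the proportion of recommended actions
# completed, average to the facility with within-facility sampling weights, and
# classify facilities into tertiles of the observed range. Names and values are illustrative.
visits <- data.frame(
  facility_id  = c(1, 1, 2, 2, 3),
  actions_done = c(18, 22, 10, 12, 25),   # actions observed as completed
  actions_due  = c(27, 27, 25, 27, 27),   # actions assessed for that visit
  visit_weight = c(1.0, 1.2, 0.8, 1.0, 1.1)
)
visits$score <- visits$actions_done / visits$actions_due

# Facility score: weighted mean of the visit scores
facility_scores <- sapply(split(visits, visits$facility_id), function(d) {
  weighted.mean(d$score, d$visit_weight)
})

# Tertiles of the observed range: equal-width thirds between the minimum and maximum score
breaks  <- seq(min(facility_scores), max(facility_scores), length.out = 4)
tertile <- cut(facility_scores, breaks = breaks, include.lowest = TRUE,
               labels = c("worse", "average", "better"))

data.frame(facility = names(facility_scores), score = facility_scores, tertile = tertile)
```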

To assess stability at the subnational level, we used the 4 major regions defined by the DHS survey (West, Centre, North, South). We averaged scores based on all facilities or all direct observations of care in each of the major regions within the analytic sample of paired assessments of the same facilities, yielding 4 sets of regional measures (2012/2013 with 2015, 2014 with 2016, 2015 with 2017 and 2016 with 2018). Observations were weighted based on the facility or client weight when averaging.
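As an illustration of this weighted averaging, the R sketch below computes facility-weighted regional means; the use of the survey package here is an assumption made for the example, and the data are invented.

```r
# Minimal sketch: facility-weighted regional means of a quality score using the survey package.
# Region labels, scores and weights are illustrative.
library(survey)

facilities <- data.frame(
  region          = c("West", "West", "Centre", "Centre", "North", "South"),
  readiness       = c(0.70, 0.62, 0.55, 0.60, 0.58, 0.65),
  facility_weight = c(2.0, 1.5, 3.0, 2.5, 2.0, 1.8)
)

des <- svydesign(ids = ~1, weights = ~facility_weight, data = facilities)
svyby(~readiness, ~region, des, svymean)  # weighted mean readiness per major region
```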

To test the impact of instability in quality measures on effective coverage, we considered 2 steps of the effective coverage cascade: input-adjusted coverage (“the proportion of the population in need who come into contact with a health service that is ready to provide care”) based on structural quality measures and quality-adjusted coverage (“the proportion of the population in need who come into contact with a service that is ready and that receives the service according to quality-of-care standards” [24]) based on process quality measures. We first defined contact coverage from the DHS surveys for ANC 1 and 4, delivery and child health. For maternal health, the population in need was women with a live birth in the past 2 years, while for child health, this was children under 5 with fever, diarrhoea or respiratory symptoms in the past 2 weeks. Use of care was reported use of a hospital, health centre or health post (minimum 1 and 4 visits for ANC 1 and 4, respectively). We grouped reported facility type as public hospital, public health centre, public health post and any private facility to enable linkage to SPA data; women reporting multiple sources of ANC were assigned the highest reported facility type in the order listed. Because the analytic sample includes only these facility types, use of health huts or unclassified facilities was defined as not receiving services. The resulting estimates are intended only to inform consideration of measurement stability over time.

We calculated adjusted coverage as the product of contact coverage and quality. We defined the reference period for health facility assessments as 1 year prior for maternal health (women interviewed in 2018 regarding pregnancies in the past 2 years would be matched to the 2017 SPA data) and the same year for child health. We calculated adjusted coverage nationally and within major region, accounting for type of facility and incorporating stratified sampling and survey weights (women's weight in DHS data, facility weight in SPA data) following established procedures [37, 38]. We estimated adjusted coverage measures twice: first using the sample facilities assessed during the reference period, and again using the same facilities as assessed 2 years earlier. We limited analysis to estimates for regions that had at least one facility of each type that DHS respondents reported using.
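A stylised R example of this product, with invented numbers and ignoring the survey-design details (stratification and weights) described above, is shown below.

```r
# Minimal sketch: quality-adjusted coverage as contact coverage times the mean quality
# experienced by users, with users distributed across facility types. Numbers are illustrative.
contact_coverage <- 0.80   # share of the population in need who used an eligible facility

# Share of users by reported facility type (from the household survey)
use_share <- c(public_hospital = 0.15, public_health_centre = 0.25,
               public_health_post = 0.50, private = 0.10)

# Mean process-quality score by facility type (from the facility assessment)
quality <- c(public_hospital = 0.55, public_health_centre = 0.45,
             public_health_post = 0.35, private = 0.50)

mean_quality_among_users  <- sum(use_share * quality)
quality_adjusted_coverage <- contact_coverage * mean_quality_among_users
quality_adjusted_coverage
```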

Analysis

In this analysis, we (1) describe the analytic sample of paired assessments to explain the basis of analysis and how it differs from a full SPA and (2) assess stability over 2 years for facilities, regions and in adjusted coverage estimation. We used descriptive statistics to summarise the pooled sample of hospitals, health centres and health posts assessed between 2012/2013 and 2018. We compared facilities included in the repeated assessment to those excluded using descriptive statistics and, for continuous characteristics, ANOVA tests incorporating clustering due to repeated inclusion of facilities assessed three times. This comparison is unweighted; pooled data and the analytic sample of repeated facilities are not representative of a predefined target population.
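One way to compare groups while accounting for clustering on facility is a linear model with cluster-robust standard errors, sketched below with hypothetical data; this is an illustrative implementation rather than the exact test used.

```r
# Minimal sketch: compare a continuous characteristic between one-time and repeated
# facilities, clustering on facility to reflect repeated inclusion. Data are illustrative.
library(sandwich)  # vcovCL for cluster-robust variance
library(lmtest)    # coeftest for tests using a supplied variance matrix

dat <- data.frame(
  facility_id = c(1, 1, 2, 3, 3, 4, 5, 6),
  group       = c("repeated", "repeated", "one-time", "repeated", "repeated",
                  "one-time", "one-time", "one-time"),
  mch_staff   = c(5, 6, 3, 4, 4, 2, 3, 3)
)

fit <- lm(mch_staff ~ group, data = dat)
coeftest(fit, vcov = vcovCL(fit, cluster = dat$facility_id))
```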

To define stability of quality measures for description and benchmarking, we assessed continuous quality measures. We quantified the magnitude of change in quality measures as the absolute value of the difference between time points for each facility; to capture overall (net) difference, we averaged observed difference by facility tier and by region. We calculated Pearson correlation coefficients for quality scores for facilities and regions measured 2 years apart (negligible <0.10, weak 0.10–0.39, moderate 0.40–0.69, strong 0.70–0.89 and very strong 0.90–1.00) [39]. Facility-level correlations are plotted, with random jitter added for illustration for ANC readiness and the proportion of BEmONC signal functions. An overlaid ellipse outlines where 80% of the data would lie assuming bivariate normality and signals the degree of correlation: a circle indicates no correlation, while a tighter diagonal ellipse indicates high correlation. Facility-level correlation statistics are weighted by the facility sampling weight from the earlier wave of each pair; we conducted a sensitivity analysis using the weights from the later wave.
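A minimal R sketch of the weighted correlation and the strength labels above, using simulated paired scores, is shown below; cov.wt is one way to obtain a weighted Pearson correlation.

```r
# Minimal sketch: weighted Pearson correlation between scores measured 2 years apart,
# weighting by the earlier wave's facility weight, and the strength labels used in the text.
set.seed(1)
wave1  <- runif(50, 0.3, 0.9)                           # simulated scores, wave 1
wave2  <- pmin(pmax(wave1 + rnorm(50, 0, 0.15), 0), 1)  # simulated scores, wave 2
weight <- runif(50, 0.5, 3)                             # earlier-wave facility weights

# Weighted correlation from the weighted covariance matrix
r <- cov.wt(cbind(wave1, wave2), wt = weight, cor = TRUE)$cor[1, 2]

strength <- cut(abs(r), breaks = c(0, 0.10, 0.40, 0.70, 0.90, 1),
                labels = c("negligible", "weak", "moderate", "strong", "very strong"),
                right = FALSE, include.lowest = TRUE)
c(correlation = round(r, 2), strength = as.character(strength))
```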

To quantify the stability of quality measures used to identify relatively better or worse performing facilities, we calculated percent agreement and Cohen's kappa for categorical measures.
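The R sketch below computes both statistics by hand for illustrative tertile classifications assigned 2 years apart; packaged implementations of Cohen's kappa could be used instead.

```r
# Minimal sketch: percent agreement and Cohen's kappa for tertile classifications
# of the same facilities at two time points. The classifications are illustrative.
tertile_levels <- c("worse", "average", "better")
t1 <- factor(c("worse", "average", "better", "better", "average", "worse", "average", "better"),
             levels = tertile_levels)
t2 <- factor(c("worse", "worse", "better", "average", "average", "worse", "better", "better"),
             levels = tertile_levels)

tab <- table(t1, t2)
p_observed <- sum(diag(tab)) / sum(tab)                       # percent agreement
p_expected <- sum(rowSums(tab) * colSums(tab)) / sum(tab)^2   # agreement expected by chance
kappa      <- (p_observed - p_expected) / (1 - p_expected)    # Cohen's kappa

c(percent_agreement = p_observed, kappa = kappa)
```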

To assess the impact of instability in quality measures on adjusted coverage of health services, we calculated differences for each adjusted coverage metric (detailed in Table S2). We plotted and summarised differences for national and sub-national estimates resulting from using measures of the same facilities 2 years before the reference period.

Analyses were conducted in R and Stata.

Ethics statement

The Harvard University Human Research Protection Program approved this secondary analysis as exempt; the original survey implementers obtained ethical approvals for data collection.

Patient and public involvement

It was not appropriate to involve patients or the public in the design, conduct, reporting, or dissemination plans of this research.

RESULTS

Analytic sample

From the 2012/2013 survey through 2018, 16% to 24% of hospitals, health centres, and health posts on the master facility list were successfully assessed, leading to 2208 completed assessments (Figure 1, Table S3). We identified 628 paired assessments of the same facilities that form the analytic sample.

Figure 1. Schematic of SPA sampling and creation of analytic dataset

Table 1 compares the characteristics of the 1120 assessments that took place in facilities sampled once to the 1088 assessments in the 460 facilities sampled more than once. The overrepresentation of higher-level facilities in the resampling approach results in a sample of facilities that are larger and more urban, with more providers and visits observed for ANC and curative care for children, and higher service readiness and performance of BEmONC signal functions. The average tenure of providers is comparable and the proportion of providers assigned directly is higher, suggesting that retention of providers over 2 years should be at least as high in the repeated sample as in the facilities assessed once.

TABLE 1. Characteristics of assessments conducted in facilities assessed one time and those assessed repeatedly

Characteristic | One-time (N = 1120) | Repeated (N = 1088) | Total (N = 2208)
Facility location
  Rural | 622 (56%) | 423 (39%) | 1045 (47%)
  Urban | 498 (44%) | 665 (61%) | 1163 (53%)
Facility type
  Hospital | 34 (3%) | 173 (16%) | 207 (9%)
  Health centre | 53 (5%) | 335 (31%) | 388 (18%)
  Health post | 1033 (92%) | 580 (53%) | 1613 (73%)
Facility managing authority
  Public | 922 (82%) | 848 (78%) | 1770 (80%)
  Private | 198 (18%) | 240 (22%) | 438 (20%)
4 major regions
  West | 273 (24%) | 350 (32%) | 623 (28%)
  Centre | 294 (26%) | 231 (21%) | 525 (24%)
  North | 224 (20%) | 185 (17%) | 409 (19%)
  South | 329 (29%) | 322 (30%) | 651 (29%)

Characteristic | One-time, mean ± SD | Repeated, mean ± SD | Total, mean ± SD | p-value
Facility characteristics
  MCH staff, N | 3.07 ± 2.02 | 4.39 ± 3.85 | 3.72 ± 3.13 | <0.001
  Tenure in facility | 7.55 ± 5.75 | 7.31 ± 5.54 | 7.43 ± 5.65 | 0.597
  Proportion with tenure >2 years | 0.66 ± 0.34 | 0.64 ± 0.32 | 0.65 ± 0.33 | 0.195
  Proportion assigned directly to facility | 0.58 ± 0.35 | 0.63 ± 0.34 | 0.61 ± 0.34 | 0.013
  Observed in ANC visits (N = 796 facility assessments) | 1.02 ± 0.15 | 1.10 ± 0.34 | 1.06 ± 0.27 | <0.001
  Observed in child visits (N = 1827 facility assessments) | 1.03 ± 0.18 | 1.08 ± 0.31 | 1.06 ± 0.26 | <0.001
Observations
  ANC visits (N = 796 facility assessments) | 2.94 ± 1.76 | 3.59 ± 2.12 | 3.29 ± 1.99 | <0.001
  Child visits (N = 1827 facility assessments) | 3.32 ± 1.74 | 3.89 ± 1.97 | 3.60 ± 1.88 | <0.001
Quality measures
  Service readiness: general | 0.64 ± 0.11 | 0.67 ± 0.12 | 0.66 ± 0.11 | <0.001
  Service readiness: antenatal care | 0.72 ± 0.14 | 0.73 ± 0.15 | 0.72 ± 0.15 | 0.036
  Service readiness: basic obstetric care | 0.60 ± 0.17 | 0.63 ± 0.17 | 0.61 ± 0.17 | <0.001
  Service readiness: preventive & curative care for children | 0.65 ± 0.17 | 0.65 ± 0.19 | 0.65 ± 0.18 | 0.410
  Adherence to guidelines, antenatal care | 0.61 ± 0.13 | 0.60 ± 0.12 | 0.60 ± 0.12 | 0.809
  Proportion of BEmONC signal functions (out of 7) | 0.70 ± 0.20 | 0.76 ± 0.21 | 0.73 ± 0.21 | <0.001
  Adherence to guidelines, curative care for children <5 | 0.37 ± 0.12 | 0.36 ± 0.12 | 0.36 ± 0.12 | 0.342

Abbreviations: BEmONC, Basic Emergency Obstetric and Newborn Care; MCH, maternal and child health; SD, standard deviation.

Table 2 details the analytic sample of the 628 paired assessments by facility type and quality measure. The sample size is smallest for process quality of ANC (228 paired assessments). A median of 5 child visits and 3 to 4 ANC visits were observed across the sample; fewer visits were observed in health posts than in health centres and hospitals, and the number of visits declined in the second wave of each paired assessment. A median of 1 provider was observed for each service across all facility types (data not shown).

TABLE 2. Number of facilities and number of visits in analytic sample

Measure | Hospital (N = 104) | Health centre (N = 211) | Health post (N = 313) | Total (N = 628)
Facilities with matched assessments, by quality measure
  Service readiness: general | 104 | 211 | 313 | 628
  Service readiness: antenatal care | 83 | 164 | 275 | 522
  Service readiness: basic obstetric care | 85 | 147 | 251 | 483
  Service readiness: preventive & curative care for children | 87 | 198 | 298 | 583
  Adherence to guidelines, antenatal care | 33 | 82 | 113 | 228
  Proportion of BEmONC signal functions | 85 | 147 | 251 | 483
  Adherence to guidelines, curative care for children | 66 | 175 | 252 | 493
Visits observed, median (IQR)
  Antenatal care, wave 1 | 5.0 (3.0, 5.0) | 5.0 (4.0, 5.0) | 3.0 (2.0, 5.0) | 4.0 (2.0, 5.0)
  Antenatal care, wave 2 | 3.0 (2.0, 5.0) | 4.0 (2.0, 5.0) | 2.0 (1.0, 5.0) | 3.0 (2.0, 5.0)
  Curative care for children, wave 1 | 5.0 (5.0, 5.0) | 5.0 (4.0, 5.0) | 4.0 (2.0, 5.0) | 5.0 (3.0, 5.0)
  Curative care for children, wave 2 | 5.0 (4.0, 5.0) | 5.0 (3.0, 5.0) | 3.0 (2.0, 5.0) | 5.0 (2.0, 5.0)

Stability of quality measures within facilities

Individual facilities changed substantially for each measure over 2 years: the absolute value of the difference for a given facility over 2 years ranged from 0.08 (general service readiness) to 0.18 (BEmONC signal functions) on average, with some facilities changing as much as ±0.50 out of 1.0 (Figure 2). The magnitude of difference was generally similar across health facility types (Table S4). Net change was generally positive and modest: 0.01 to 0.05 linear difference on average.

Figure 2. Difference in quality measures between time points

Structural quality measures were weakly to moderately correlated over 2 years, more strongly so in higher-level facilities and particularly for general readiness and readiness for child care services (Figure 3). Across all facilities, correlation was moderate for these measures but weak for ANC readiness and basic obstetric readiness (Table 3A). Correlation for general service readiness was driven largely by the domains of basic amenities (correlation 0.68) and medications (correlation 0.56 overall, 0.70 in hospitals), while the basic equipment and infection prevention domains were less correlated (Table S5). Within each domain, readiness was less correlated for ANC and basic obstetrics than for child services; the largest difference was for the equipment domain (correlation of 0.0 for ANC compared to over 0.50 in the other services), notably a domain with only 1 item for ANC.

Figure 3. Correlation of structural quality measures for individual facilities measured 2 years apart

TABLE 3. Correlation of quality of care over 2 years

A. Facilities

Measure | Total observations | Correlation | Percent agreement (tertiles) | Cohen's kappa (tertiles)
Structural quality measures
  General | 628 | 0.60 | 55% | 0.33
  Antenatal care | 522 | 0.34 | 46% | 0.18
  Basic obstetrics | 483 | 0.28 | 45% | 0.18
  Care for children | 583 | 0.68 | 51% | 0.26
Process quality measures
  Antenatal care | 228 | 0.24 | 37% | 0.06
  Basic obstetrics | 483 | 0.35 | 53% | 0.25
  Care for children | 493 | 0.05 | 36% | 0.04

B. Regions

Measure | N | Correlation | Facilities per region, median | Visits observed per region, median
Structural quality measures
  General | 16 | 0.52 | 36.5 | NA
  Antenatal care | 16 | 0.57 | 32.5 | NA
  Basic obstetrics | 16 | 0.67 | 31.0 | NA
  Care for children | 16 | 0.86 | 34.0 | NA
Process quality measures
  Antenatal care
