Long-term probabilistic forecasts of activity mitigation in English hospitals: a national elicitation exercise providing an outside view based on judgements of experts in support of the New Hospital Programme

STRENGTHS AND LIMITATIONS OF THIS STUDY

Long-term forecasts of hospital activity often use point estimates leading to common pitfalls such as flaw-of-averages, base-rate neglect and overoptimism, typically stemming from an inside view that emphasises the specific details of a particular project as seen by local hospital teams.

To address these limitations, we sought to obtain an outside view by eliciting probabilistic forecasts for England, rather than focusing on any single hospital.

We adapted an evidence-based elicitation protocol to conduct a rapid online elicitation exercise involving 17 experts.

These experts provided forecasts for 77 types of hospital activities that could potentially be mitigated, which were aggregated to provide an outside view.

How stakeholders in the New Hospital Programme in England use these aggregated outside views alongside inside views needs further study.

Introduction

Hospitals are a cornerstone of healthcare systems worldwide. In the English National Health Service (NHS), there are approximately 515 hospitals,1 many of which face structural challenges due to ageing infrastructure.2 This has prompted the government to launch the New Hospital Programme (NHP), which aims to build 40 new hospitals across England. Additionally, seven hospitals compromised by the use of reinforced autoclaved aerated concrete have been included in the NHP.3

Building a new hospital is a major investment involving substantial (public) funds. Therefore, accurately forecasting future demand for hospital services is essential. A hospital that is too small risks being overwhelmed and unable to meet the needs of its population, while one that is too large may be underutilised. Either scenario represents poor value for money. Unfortunately, long-term forecasts of hospital activity frequently appear as point estimates, which are prone to pitfalls such as flaw-of-averages and base-rate neglect. The tendency to underweight or ignore distributional information is seen as a major source of error in forecasting.4–6

To support the NHP design process, the Strategy Unit’s analytics team has developed an advanced demand and capacity model, known as the NHP model. This model provides probabilistic forecasts of hospital activity over a 20-year period (see online supplemental file for details). It considers various factors, including demographic shifts, innovations in healthcare delivery (such as virtual clinics and wards) and potential mitigations of hospital activity. While local administrators, clinicians and stakeholders—who are well acquainted with the specific context of each new hospital—set assumptions about future activity mitigation for a given hospital, these inside views can be prone to biases, such as optimism bias, base-rate neglect and strategic misrepresentation. To counteract this, we undertook an elicitation exercise to obtain an outside view4–6 that was framed around England, rather than any specific hospital.

There are several approaches to forecasting future healthcare activity including statistical methods, expert judgements and scenario building.7–9 For long-term forecasts, expert judgement is a widely used approach, which we have adopted here. Our objective was to gather outside perspectives on mitigation by eliciting probabilistic forecasts from human subject matter experts (SMEs). The elicitation process is informed by literature on cognitive biases, project planning and decision analysis as summarised by Hemming et al.10 ‘Expert judgement can be remarkably useful when data are absent or incomplete. However, experts can also make mistakes. This is often due to a range of cognitive biases such as anchoring, availability and representativeness, groupthink, overconfidence and difficulties associated with communicating knowledge in numbers and probabilities. Inappropriate and ill-informed methods for elicitation can amplify these biases. Well-designed, structured elicitation protocols can enhance the quality of expert judgements. These protocols treat each step of the elicitation as a process of formal data acquisition, and incorporate research from mathematics, psychology and decision theory to help reduce the influence of biases and to enhance the transparency, accuracy and defensibility of the resulting judgements’.10

This paper describes an elicitation exercise undertaken to obtain probabilistic forecasts about mitigation, from an outside perspective, of hospital activity across England over a 20-year time horizon. We asked SMEs to provide a probabilistic forecast for 77 types of hospital activity that might be mitigated in the future. The aggregated forecasts from SMEs represent an outside view of mitigation in the form of low to high probabilistic forecasts with an 80% degree of belief, where low and high equate to the 10th and 90th percentiles of an assumed normal distribution. We denote this as the P10–P90 prediction interval or P10–P90 interval.

Methods

Patient and public involvement

Patients and the public were not involved in the design, conduct, reporting or dissemination plans of our research.

Participants

We envisaged that SMEs taking part in this study (which had two rounds of data collection described later) would have some of the following characteristics.

Clinical or non-clinical expertise in the subgroup domain.

Expertise by experience of working in that domain.

Typically be based in the NHS, academia/research or government.

Consent to participate on a voluntary basis.

Agree to follow the elicitation protocol.

Have an appetite for making probabilistic forecasts.

Participants were invited to take part in the exercise via emails sent in September 2023 by the NHP lead for Engagement to about 700 people working in the NHS. Participation was voluntary and confidential, and required consent with the option to withdraw at any time. Of 136 people who initially expressed interest, 87 consented to take part, 29 completed round 1 of the exercise (see the ‘Acknowledgements’) and 18 completed round 2. Given the tight timelines involved in this potentially demanding exercise alongside workload pressures in the NHS, it was anticipated that most SMEs would not have adequate time to complete the exercise. The main reasons for drop-out were loss to follow-up and lack of time. Several SMEs (only 1 of the 18 in round 2) indicated that they were employed by an NHS hospital Trust which was part of the NHP. This was not deemed to be a conflict of interest that ruled out participation in the study, especially as this elicitation exercise was at the aggregate England level and not focused on any specific hospital. Our primary analysis is based on round 2 data from 18 SMEs.

NHP activity model overview

The NHP activity model has five major input categories to predict activity over 20 years (see online supplemental file for an overview). One of these categories relates to mitigation of future activity which is the focus of this exercise and is described below.

Future activity mitigation

This elicitation exercise is focused on ‘future activity mitigation’, which consists of 77 parameters as summarised in table 1, which shows groups of hospital activity by five types of mitigation. The number in each cell of the table is the number of parameters that require a probabilistic forecast.

The specific types of activity amenable to mitigation (n=77) included in the NHP model were identified over a prolonged period within the Strategy Unit drawing on experience and knowledge of strategies and interventions that have commonly been implemented in healthcare. The interventions included were those that were intended to reduce, primarily through avoidance, prevention, displacement, quality improvement or de-adoption, the volume of a given type of activity delivered within acute hospital settings. Algorithms were developed to identify specific cohorts of patients from Hospital Episode Statistics datasets that were the focus of such interventions drawing on published evidence and other relevant documents where available.11 12 For each type of activity, a historic trend graph was provided at the England level (not for any specific hospital), with age-sex standardisation where appropriate, with 2019 as the baseline year in a specifically designed online app for this national elicitation exercise (NEE) (see online supplemental file).

Elicitation protocol

We based our elicitation protocol on the Stanford Research Institute protocol which has five broad steps: motivate, structure, condition, encode and verify, as described in the Handbook of Decision Analysis13 and the Investigate, Discuss, Estimate, Aggregate protocol as described by Hemming et al10 adapted to better suit our needs. The elicitation protocol is designed to mitigate well-known cognitive biases that usually lead to overconfidence in the judgements of experts. The timelines and scale of the NHP led us to undertake an online elicitation exercise with supporting materials that did not involve interviews with SMEs.

To support SMEs in their task, we developed three short online training videos (https://vimeo.com/showcase/nhpnee), viz:

Part 1 (~3 min) provided an overview of the NHP model and explained why we needed the support of experts to make forecasts.

Part 2 (~9 min) provided SMEs with training on probabilistic forecasting.

Part 3 (~4 min) showed SMEs how to use the online data collection tool.

SMEs who consented were assigned a unique code, were sent the links to the videos and given 3 weeks to complete round 1 of the elicitation exercise and then a further 2 weeks to complete round 2.

SMEs were required to use a slider to provide P10 and P90 probability interval forecasts with an 80% degree of belief. The slider was designed to go from 0% (no reduction in activity) to 100% (total reduction in activity). The concept of the P10 and P90 interval was explained by using the qualitative terms ‘surprisingly low’ and ‘surprisingly high’ to denote the P10 and P90, respectively, in the second online video. The time horizon for all judgements was 20 years from the baseline year of 2019. This ensures that the elicitation frame predates the start of the COVID-19 pandemic (31 December 2019). The online tool enabled SMEs to see the potential impact of their P10 and P90 intervals on the parameter of interest from previous activity (as shown over time) using a linear trajectory. The above steps were repeated for each parameter. SMEs could contact the study lead at any time via email.

In round 1, SMEs were given up to 2 weeks (extended by a further week because of time constraints) to complete their assignment for their selected areas of expertise. In round 2, SMEs were given up to 2 weeks to compare their forecasts with those of their peers and make any changes before the final closing date. The instructions on the home page of the app are shown in an online supplemental file.

Data processing and analysis

All SMEs were de-identified and assigned a unique code which could not be linked back to the SME. Since the default P10 and P90 values in the app were 0% and 100%, respectively, we excluded all such values from the data set. We also excluded values where the P10 and P90 were equal because they were deemed to be point estimates.
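The two exclusion rules can be illustrated with a minimal Python sketch. The record layout, field names and example values below are hypothetical and are not taken from the study data or the app; only the two rules themselves come from the text.

```python
from typing import NamedTuple


class Forecast(NamedTuple):
    # Hypothetical record layout, for illustration only.
    sme: str        # de-identified SME code
    parameter: str  # label of the activity parameter
    p10: float      # 'surprisingly low' value, in percent (0-100)
    p90: float      # 'surprisingly high' value, in percent (0-100)


def keep(f: Forecast) -> bool:
    """Return True if the forecast survives both exclusion rules."""
    untouched_default = f.p10 == 0.0 and f.p90 == 100.0  # app default left as-is
    point_estimate = f.p10 == f.p90                      # zero-width interval
    return not (untouched_default or point_estimate)


# Made-up example values, not study data.
forecasts = [
    Forecast("A01", "param_x", 5.0, 40.0),   # retained
    Forecast("A02", "param_x", 0.0, 100.0),  # excluded: default 0%-100%
    Forecast("A03", "param_x", 20.0, 20.0),  # excluded: P10 equals P90
]
retained = [f for f in forecasts if keep(f)]
```

In this sketch only the first record is retained; the other two match the default-value and point-estimate rules, respectively.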

We derived the mean and SD for each SME from their P10 and P90 values by assuming a parent normal distribution. The SME mean was taken as the midpoint of the interval, mean=(P10+P90)/2, and the SME sigma was obtained from the 90th percentile of the standard normal distribution (mu=0, SD=1): sigma=(P90−mean)/qnorm(0.9, mu=0, SD=1). The mean and sigma values for each SME were then input parameters into a (child) truncated normal distribution with a minimum value of 0% (no activity avoided) and maximum value of 100% (all activity avoided).
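As a worked illustration of this conversion, the following Python sketch derives the mean and sigma from a single P10–P90 interval. It uses `statistics.NormalDist().inv_cdf(0.9)` as a stand-in for the `qnorm` call named in the text; the function name `mean_sigma` and the example interval are assumptions made for illustration, not part of the study code.

```python
from statistics import NormalDist

# 90th percentile of the standard normal distribution, roughly 1.2816.
Z90 = NormalDist(mu=0.0, sigma=1.0).inv_cdf(0.9)


def mean_sigma(p10: float, p90: float) -> tuple[float, float]:
    """Convert a P10-P90 interval (in percent) to a normal mean and sigma."""
    mean = (p10 + p90) / 2.0       # midpoint of the interval
    sigma = (p90 - mean) / Z90     # distance to P90 in standard-normal units
    return mean, sigma


# Hypothetical interval of 10% to 50%.
mean, sigma = mean_sigma(10.0, 50.0)   # mean is 30.0, sigma is roughly 15.6

# Sanity check: the fitted normal reproduces the elicited P10.
recovered_p10 = NormalDist(mean, sigma).inv_cdf(0.1)
```

Because the normal distribution is symmetric, fixing the mean at the midpoint means the fitted P10 and P90 match the elicited values before truncation is applied.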

These individual truncated distributions were aggregated for each of the 77 hospital activities by bootstrapping. If n experts provided forecasts for a specific activity, then (100 000/n) values were sampled at random from each expert’s truncated Normal distribution. The resulting 100 000 values described the aggregated expert view distribution with each expert’s view carrying equal weight. These aggregated expert view distributions are presented as mean and percentile plots (P10–P90) in the results. The individual and aggregate forecasts under each group of hospital activity are presented in tabular and graphical format in the order indicated in table 1, with a summary of the rationales supplied by SMEs. Aggregate forecast intervals are summarised by means and P10–P90 intervals. The means reflect the optimism or pessimism of forecasters. The greater the mean the greater the optimism, the lower the mean the more pessimism. The wider the P10–P90 interval the greater the degree of uncertainty. Aggregate results are reported below. Parameter-specific results are shown in more detail in an online supplemental file.
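The equal-weight pooling step described above might be sketched in Python as follows. This is an illustration under stated assumptions rather than the study's implementation: the truncated normal is sampled by simple rejection, the function names, seed and example intervals are hypothetical, and a smaller total than 100 000 is used to keep the example quick.

```python
import random
from statistics import NormalDist, mean, quantiles

Z90 = NormalDist().inv_cdf(0.9)  # 90th percentile of the standard normal


def sample_truncated(mu, sigma, rng, lo=0.0, hi=100.0):
    """Draw one value from Normal(mu, sigma) truncated to [lo, hi].

    Rejection sampling: redraw until the value falls inside the bounds.
    This is efficient here because mu always lies inside [lo, hi].
    """
    while True:
        x = rng.gauss(mu, sigma)
        if lo <= x <= hi:
            return x


def aggregate(intervals, total=100_000, seed=1):
    """Pool equal numbers of draws from each expert's truncated normal.

    intervals: list of (P10, P90) pairs in percent, one per expert.
    Returns (mean, P10, P90) of the pooled sample.
    """
    rng = random.Random(seed)
    per_expert = total // len(intervals)   # equal weight for every expert
    pooled = []
    for p10, p90 in intervals:
        mu = (p10 + p90) / 2
        sigma = (p90 - mu) / Z90
        pooled.extend(sample_truncated(mu, sigma, rng) for _ in range(per_expert))
    cuts = quantiles(pooled, n=10)         # nine cut points: P10, P20, ..., P90
    return mean(pooled), cuts[0], cuts[-1]


# Three hypothetical experts forecasting the same parameter.
agg_mean, agg_p10, agg_p90 = aggregate([(5, 30), (10, 50), (20, 60)], total=30_000)
```

Sampling the same number of values from every expert is what gives each view equal weight in the pooled distribution; the aggregate P10–P90 interval is typically wider than any single expert's interval when the experts disagree.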

Table 1

Groups of hospital activity and types of mitigation

Results

Our primary analysis is based on round 2 data, where 18 SMEs provided a total of 736 P10–P90 forecasts based on their selected areas of interest. Most of the SMEs had a clinical background (see list of acknowledgements).

About 10% (=79/736) of forecasts were excluded because they had a zero range (where the P10=P90=0%, n=5) or the range was 100% (where P10=0, P90=100%, n=74). After exclusions, we had 657 P10–P90 forecasts from 17 SMEs, where the median number of forecasts per SME was 42 (min 2, lower quartile 16, upper quartile 61, max 77).

Aggregate forecasts

Figure 1 shows an overview of the aggregate forecast for each of the 77 types of hospital activity across 5 types of mitigation. SMEs highlighted mainly four types of mitigation mechanisms—prevention, displacement, quality improvement and de-adoption.

Figure 1

Overview of aggregate probabilistic forecasts, ranging from the 10th percentile to the 90th percentile, for 77 types of hospital activity in 8 strategy groups (see legend) across 5 types of activity mitigation (last column of graphic). The black dot in each interval is the mean. The vertical grey line is a 50% reference indicator. The colours are to aid the visualisation of the eight different hospital activity groups shown in table 1.

Aggregated ‘surprisingly low’ P10 values (n=77) ranged from 0.43% to 19.82% (mean=4.77%), 90% of which were below 10.27%. Aggregated ‘surprisingly high’ P90 values (n=77) ranged from 16.40% to 80.61% (mean=54.85%), 90% of which were above 39.61%. The average width of the forecast intervals was 50.08%.

The most pessimistic forecast was for inpatient avoidance of frail elderly admissions (5.71%, P10=0.43%, P90=16.40%). The most optimistic forecast was for inpatient admission avoidance for vascular surgery (48.27%, P10=19.82%, P90=78.57%).

The overall (n=77) aggregate means ranged from a low of 5.71% to a high of 48.27%. The aggregate means varied by type of mitigation: outpatient attendance avoidance (from 16.42% to 34.12%, n=8); inpatient admission avoidance (from 5.71% to 48.27%, n=31); Accident & Emergency (A&E) attendance avoidance (from 18.71% to 32.91%, n=12); outpatient delivery mode (from 16.42% to 34.13%, n=4); inpatient length of stay reduction (from 15.41% to 45.91%, n=22). The aggregate means also varied across the types of hospital activity: hospital activity amenable to psychiatric liaison and community psychiatry (from 13.33% to 18.33%, n=4); hospital activity amenable to public health interventions (from 14.09% to 23.41%, n=6); hospital activity amenable to medicines management (from 12.74% to 29.45%, n=5); hospital activity amenable to primary care and community interventions (from 5.71% to 27.91%, n=14); planned paediatric activity (from 16.42% to 27.86%, n=6); emergency department and acute medicine activity (from 8.71% to 35.26%, n=16); planned medical activity (from 18.69% to 34.13%, n=3); planned surgical activity (from 20.19% to 48.27%, n=23).

Experts highlighted mainly four types of mitigation mechanisms (see online supplemental file)—prevention, displacement, quality improvement and de-adoption.

The boxplots in figure 2 show how the mean aggregate forecasts varied by group of hospital activity. The most optimistic forecasts were for the mitigation of planned surgical and medical activity. The most pessimistic forecasts were for hospital activity amenable to psychiatric liaison and community psychiatry and to public health interventions.

Figure 2

Boxplots of mean aggregate forecasts across groups of hospital activity.

The boxplots in figure 3 show how the mean aggregate forecasts varied by type of mitigation. The most optimistic forecasts were for outpatient delivery mode and inpatient length of stay reductions (albeit with wide variation). The most pessimistic forecasts were for inpatient admission avoidance (albeit with wide variation) and outpatient attendance avoidance.

Figure 3

Boxplots of mean aggregate forecast across types of mitigation activities. (A&E is Accident & Emergency)

Table 2 summarises the mean aggregate forecasts by activity group and type of mitigation.

Online supplemental file shows more detailed results for each parameter including a synthesis of rationales for forecasts.

Table 2

Minimum and maximum mean aggregate forecasts by hospital activity and type of mitigation

Discussion

We undertook a rapid online exercise to elicit long-term probabilistic forecasts from experts on the extent to which various types of hospital activity might be mitigated in the future. The exercise has provided the NHP with an initial or preliminary set of aggregate probabilistic forecasts which make explicit the distribution of uncertainty in respect of 77 types of hospital activity which, crucially, provide an outside view for England (not a specific hospital).

The overall (n=77) aggregate means ranged from a low of 5.71% to a high of 48.27%, and the forecast intervals had an average width of 50.08%. The aggregate means varied by type of hospital activity and type of mitigation. By hospital activity group, the most optimistic forecasts were for the mitigation of planned surgical and medical activity, and the most pessimistic were for hospital activity amenable to psychiatric liaison and community psychiatry and to public health interventions. By type of mitigation, the most optimistic forecasts were for outpatient delivery mode and inpatient length of stay reductions (although with wide variation), and the most pessimistic were for inpatient admission avoidance (although with wide variation) and outpatient attendance avoidance.

Experts appeared to treat the P10 and P90 values as scenarios where they highlighted four types of mitigation mechanism—prevention, displacement, quality improvement and de-adoption. In most cases, the scenario at P10 included continuing low and unfocused investment in primary and community care whereas the P90 generally assumed the opposite. These insights should be recognised by planners and suggest that estimates of hospital activity mitigation should be accompanied by explicit and credible plans for how investment is planned to change in primary and community settings.

The primary motivation for undertaking this exercise was to avoid the use of point estimates, the flaw of averages and cognitive biases associated especially with inside views in large projects.4–6 Moreover, the inside view, which refers to a perspective that focuses on the specific details and characteristics of a particular project, often yields an overly optimistic outlook due to a range of biases that include optimism bias, base-rate neglect and strategic misrepresentation.6 In contrast, we sought an outside view by asking experts to make forecasts for England (not for a specific hospital). This simple reframing helps to counteract the overoptimism that can characterise the inside view by providing objective and realistic forecasts of uncertainty. This is crucial. As Flyvbjerg states, ‘The comparative advantage of the outside view is most pronounced for non-routine projects, understood as projects that managers and decision-makers in a certain locale or organisation have never attempted before—like building new plants or infrastructure, or catering to new types of demand. It is in the planning of such new efforts that the biases toward optimism and strategic misrepresentation are likely to be largest’.14

This process for elicitation represents a formal data collection exercise which is systematic, transparent and subject to scrutiny and continual improvement. This is in marked contrast with the not uncommon ‘black box’ approach to assumption setting which has attracted criticism.2 10 These national aggregate forecasts provide an outside view against which local forecasts may be evaluated. For instance, where local inside views appear to differ qualitatively from the aggregate forecasts, this may prompt a review of the credibility and plausibility of assumptions set by the local hospital team. Moreover, the use of the NHP model along with outside forecasts supports a more standardised approach to the NHP.

This demanding elicitation exercise was undertaken under considerable time constraints and has several limitations. We focused on the 77 mitigation factors that were in the current version of the NHP model. While this is already a considerable number, there may be additional hospital activities or types of mitigation (eg, maternity care) which should be included in future exercises.

Our recruitment criteria were pragmatic and broad. They focused mainly on the appetite of participants to undertake the task by consent and choosing what to forecast, rather than any specific markers of expertise (eg, age, experience, publications, memberships and peer recommendation). The evidence, however, shows that such markers are poor indicators of someone’s ability to provide good forecasts. Indeed, the best criterion for selecting experts for forecasting exercises is whether the participant can understand the questions being asked. Moreover, the inability to identify the best expert means that groups of multiple experts almost always perform as well as, or better than, the best-regarded expert(s).10 Some SMEs opted to withdraw from the exercise because of lack of time and/or inability to engage with the exercise, leading to a drop in participants at each stage. Although we were expecting a high drop-out of SMEs, primarily because of workload pressures, the final number of participants is not inconsistent with the recommendation of 10–20 participants and with evidence that only minor improvements in performance are gained by having more than 6–12 participants.10 For 21 of the 77 parameters, the number of SMEs was fewer than six (range 3–5), and future studies could target recruitment of SMEs towards these parameters.

We sought to maximise the appeal of the task by opting for an online approach because this was likely to minimise the time commitment from SMEs and so maximise the number of SMEs who could participate. We supported our participants with data science tools which used a combination of interactive graphics, numbers and text to enhance understanding of this demanding exercise. This multimodal approach appears to be important in supporting more effective participation of women.15 Nevertheless, the elicitation literature does indicate that the quality of responses is enhanced by engaging with SMEs in interviews or workshops compared with discussions facilitated via email.10 The remote online approach meant that the research team was not in a position to interactively quality assure the contribution of SMEs in the time available. For instance, although SMEs were encouraged to provide a clear rationale for their P10 and P90 forecasts, these were not always forthcoming. A few SMEs found the lower 0% bound problematic because they wanted to show an increase in activity (which is accommodated elsewhere in the model), whereas 0% meant that mitigation was wholly unsuccessful. Some SMEs referred to ageing population despite being asked to discount this (because it is accommodated elsewhere in the model). For some SMEs, such nuances are probably best addressed via an interactive dialogue. A dialogue between SMEs may also lead to less variability in round 2 forecasts. Despite these limitations, most responses from SMEs appeared to follow the elicitation protocol with fidelity (about 10% of the responses were excluded from this analysis). Further research is needed to understand precisely how these aggregate outside views will be used in NHP.

In project planning, integrating both inside and outside views is crucial. While inside views provide detailed project insights, outside views offer valuable external perspectives that help counteract overconfidence and overly optimistic projections. We recommend that outside view elicitation exercises become a standard component of large-scale NHS project planning. This approach would necessarily involve ongoing methodological refinement to address the limitations of this exercise. Nevertheless, the principles applied in this exercise could also be used to strengthen inside views provided by local hospital teams, thereby complementing national efforts and enhancing our collective forecasting knowledge.

Ultimately, while inside views are essential for understanding project specifics, outside views offer critical reality checks, providing a more robust foundation for planning. The aggregate judgements obtained here provide an outside perspective on the uncertainties associated with hospital activity mitigation, offering an opportunity to explore how these views interact with inside views and inform the NHP planning process.

Conclusion

A national elicitation exercise has produced aggregate forecasts from an outside perspective that make explicit the variation and uncertainty related to future mitigation activities. This outside view helps to overcome the limitations of point estimates, the flaw-of-averages and the overoptimism often associated with inside views. These aggregate forecasts can now be integrated into the planning process for new hospitals within the English NHS, providing a more robust foundation for planning.

Data availability statement

Data are available on reasonable request. The data and analyses that support the findings of this study are available on request from the corresponding author subject to approval from the funder.

Ethics statements

Patient consent for publication

Ethics approval

This study involves human participants and ethical approval was granted by the Chair of the Humanities, Social and Health Sciences Research Ethics Panel at the University of Bradford (EC27944) in August 2023. Participants gave informed consent to participate in the study before taking part.

Acknowledgments

The authors would like to express their gratitude and thanks to everyone who supported this project from the New Hospital Programme, especially the staff within the Transformation Directorate of the New Hospital Programme who supported the recruitment of subject matter experts from their existing stakeholder networks and databases. Special thanks are due to the subject matter experts who provided probabilistic forecasts for this exercise (see below): James Butcher, Operations Lead Future System Programme; Mahesh Kotli, Specialty doctor in Oral and Maxillofacial surgery; Rehan-Uddin Khan, Regional Lead Gynaecologist; Jeanette Taylor, Matron; Robin David Proctor, Consultant Radiologist; Diane Goodwin, Operations Director – NHP; Matthew Needham, Consultant Intensive Care Medicine & Anaesthesia; Paula Miller, Chief Nursing Projects Officer, New Hospital Programme, James Paget NHS Trust; Nigel Wesley Smyth, Consultant Physician; Mary-Anne Christine Morris, Consultant Paediatrician, Clinical Director NHSE EoE CYP Transformation programme; Helena Margaret Earl, Professor Emeritus of Clinical Cancer Medicine; Stephen Winder, Consultant Ophthalmologist; Robert Selley, Strategy Delivery Director; Elaine Quick, GIRFT Radiology Advisor NHSE PACS Implementation Lead Northern Care Alliance; Donald Richardson, Chief Clinical Information Officer; Nicholas Kennedy, Consultant anaesthetist and Intensivist; Steve Canty, Consultant Trauma & Orthopaedic Surgeon, Divisional Medical Director for Surgery; Yvonne Susan Thackray, Imaging Academy Manager; Emma Jackson, Intensive Care Medicine Consultant; Paul Stevens, Consultant Nephrologist & Medical Examiner, Chair South East Clinical Senate; Richard Graham, Director of Research and Innovation and Consultant Radiologist; Rachel Hoey, Consultant and Divisional Director Emergency Medicine; Josie Harral, Head of Redevelopment Programme Analytics; Jenny Steel, GP & Medical Director Integrated Community Services; Hazel Tonge, Clinical Lead - Building for the Future Team; Robert Hakin, Associate Director Corporate Planning; Michael Barker, Senior Clinical and Strategy Advisor: Transformation Directorate of the New Hospital Programme; Sophie Hargreaves, Director of Strategy; Iona McAllister, Hospital Operations Manager.
