Methods to assess evidence consistency in dose‐response model based network meta‐analysis

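To illustrate, suppose we were interested in node-splitting for $(x_1, A)$ vs $(x_2, B)$. A 2-arm study, study 1, which compares $(x_1, A)$ vs $(x_2, B)$ estimates:

$$\delta_{1,2} \sim N\left(d^{Dir}_{(x_1,A),(x_2,B)},\ \sigma^2\right),$$

whereas a 3-arm study 2 comparing $(0, P)$ vs $(x_3, A)$ vs $(x_4, A)$ contributes to the estimation of the dose-response curve for agent A, estimating:

$$\begin{pmatrix}\delta_{2,2}\\ \delta_{2,3}\end{pmatrix} \sim N_2\left(\begin{pmatrix} f(x_3, \beta_A)\\ f(x_4, \beta_A)\end{pmatrix},\ \begin{pmatrix}\sigma^2 & \sigma^2/2\\ \sigma^2/2 & \sigma^2\end{pmatrix}\right)$$

and a 3-arm study 3 comparing $(0, P)$ vs $(x_5, B)$ vs $(x_6, B)$ contributes to the estimation of the dose-response curve for agent B, estimating:

$$\begin{pmatrix}\delta_{3,2}\\ \delta_{3,3}\end{pmatrix} \sim N_2\left(\begin{pmatrix} f(x_5, \beta_B)\\ f(x_6, \beta_B)\end{pmatrix},\ \begin{pmatrix}\sigma^2 & \sigma^2/2\\ \sigma^2/2 & \sigma^2\end{pmatrix}\right).$$

These can be used together to obtain an indirect estimate:

$$d^{Ind}_{(x_1,A),(x_2,B)} = f(x_2, \beta_B) - f(x_1, \beta_A).$$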
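As a rough numerical illustration of the indirect estimate, the sketch below evaluates two dose-response curves and takes their difference. It assumes the usual Emax parameterisation, f(x) = Emax * x / (ED50 + x), which is not spelled out in this section; the function names and all parameter values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def emax_curve(dose, emax, ed50):
    """Assumed Emax dose-response: effect vs placebo at a given dose."""
    return emax * dose / (ed50 + dose)


def indirect_estimate(f_a, dose_a, f_b, dose_b):
    """Indirect contrast for (dose_b, B) vs (dose_a, A): f_B(dose_b) - f_A(dose_a)."""
    return f_b(dose_b) - f_a(dose_a)


# Hypothetical dose-response parameters for agents A and B
def f_A(dose):
    return emax_curve(dose, emax=4.0, ed50=10.0)


def f_B(dose):
    return emax_curve(dose, emax=3.0, ed50=5.0)


# Indirect estimate for (x2 = 5, B) vs (x1 = 10, A)
d_ind = indirect_estimate(f_A, 10.0, f_B, 5.0)
print(d_ind)
```

In a fitted model the curve parameters would be posterior samples rather than fixed numbers, so the same subtraction would be applied to each MCMC iteration.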

Node-splitting requires both direct and indirect evidence to exist for a particular treatment-level contrast. The R packages gemtc [23] and MBNMAdose [12] include functions for finding comparisons in the “treatment-level” network (ie, Figure 1B for the psoriasis example) where both direct and indirect evidence exist. However, in dose-response MBNMA there may be additional treatment-level contrasts where indirect estimates can be obtained via the dose-response relationship. This is because the dose-response model can allow indirect comparisons to be made via interpolation/extrapolation (Figure 2).

Due to the nature of phase-II studies exploring dose-response, we expect to be able to obtain indirect estimates for many of the treatment-level contrasts where we have direct evidence. However, the extent to which indirect estimates will be available via the dose-response relationship will depend on the availability of remaining dose-response information once direct evidence has been removed, and on the complexity of the dose-response relationship fitted. Where indirect estimates are not available, the model will simply predict a very uncertain estimate or may fail to converge. The inconsistency.loops() function in MBNMAdose can be used to identify contrasts in which both direct and indirect estimates are available, and to indicate whether the indirect evidence arises from pathways of head-to-head evidence (as for node-splitting in a treatment-level network) or only from the dose-response relationship. It also provides the number of doses of each agent in a comparison on which the indirect dose-response relationship must be estimated, which can help identify the complexity of the function that could be used to estimate it.

Node-splitting is complicated by the existence of multiarm trials (trials with three or more arms). A K-arm trial provides (K-1) relative effects, but the choice of the “reference” arm (arm 1) determines whether the trial contributes to the direct estimate or not, and this choice is somewhat arbitrary. For example, suppose that we have a 4-arm study i comparing Placebo with agent A at doses $x_1$ and $x_2$ and agent B at dose $x_3$, and we were interested in node-splitting for $(x_2, A)$ vs $(x_3, B)$. If we take Placebo as the reference arm and define three relative effects, one for each active arm compared to Placebo, then the study does not contribute to the direct estimate $d^{Dir}_{(x_2,A),(x_3,B)}$ and instead informs the indirect estimate $d^{Ind}_{(x_2,A),(x_3,B)}$. However, if we take $(x_2, A)$ as the reference arm, then the relative effect for $(x_3, B)$ relative to $(x_2, A)$ contributes to $d^{Dir}_{(x_2,A),(x_3,B)}$, whereas the relative effects for Placebo and $(x_1, A)$ relative to $(x_2, A)$ could contribute to the dose-response model for the indirect estimate.

Van Valkenhoef et al. set out a series of rules for how multiarm trials can be used in node-splitting models for NMA [24]. Following these rules and using Placebo as the reference arm, the relative effects for $(x_1, A)$ and $(x_3, B)$ relative to $(x_2, A)$ in the example above would be discarded since a multiarm trial cannot be inconsistent with itself. However, in dose-response MBNMA there may be inconsistencies within a multiarm trial due to departures from the assumed dose-response relationship.

We therefore propose that the following approach be taken for inclusion of multiarm trials in node-splitting for dose-response MBNMA. If the multiarm trial contains the contrast $(x_c, a_c)$ vs $(x_k, a_k)$ that is being split, then reorder the arms with $(x_c, a_c)$ as the reference arm 1 and $(x_k, a_k)$ as arm 2, so that the multiarm trial always contributes to $d^{Dir}_{(x_c,a_c),(x_k,a_k)}$ if possible. Model all the relative effects from the multiarm trial as usual, but replace the mean of the relative effect $\delta_{i,2}$ with $d^{Dir}_{(x_c,a_c),(x_k,a_k)}$:

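$$\begin{pmatrix}\delta_{i,2}\\ \delta_{i,3}\\ \vdots\\ \delta_{i,K}\end{pmatrix} \sim N_{(K-1)}\left(\begin{pmatrix} d^{Dir}_{(x_c,a_c),(x_k,a_k)}\\ f(x_{i,3}, \beta_{a_{i,3}}) - f(x_{i,1}, \beta_{a_{i,1}})\\ \vdots\\ f(x_{i,K}, \beta_{a_{i,K}}) - f(x_{i,1}, \beta_{a_{i,1}})\end{pmatrix},\ \begin{pmatrix}\sigma^2 & \sigma^2/2 & \cdots & \sigma^2/2\\ \sigma^2/2 & \sigma^2 & \cdots & \sigma^2/2\\ \vdots & \vdots & \ddots & \vdots\\ \sigma^2/2 & \sigma^2/2 & \cdots & \sigma^2\end{pmatrix}\right). \tag{8}$$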
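The structure of Equation (8) can be sketched in a few lines: a mean vector whose first element is the direct parameter and whose remaining elements come from the dose-response function, and a covariance matrix with the between-study variance on the diagonal and half of it elsewhere. This is only an illustration in plain Python, not the JAGS code used by MBNMAdose; the helper names and numerical values are hypothetical.

```python
def multiarm_covariance(n_arms, sigma):
    """(K-1) x (K-1) covariance: sigma^2 on the diagonal, sigma^2 / 2 elsewhere."""
    size = n_arms - 1
    var = sigma ** 2
    return [[var if r == c else var / 2 for c in range(size)] for r in range(size)]


def multiarm_means(d_dir, f_values):
    """Mean vector for a trial reordered so the split contrast is arm 2 vs arm 1.

    f_values[j] is the dose-response prediction for arm j + 1; the entry for
    arm 2 is ignored because its mean is replaced by the direct parameter.
    """
    reference = f_values[0]
    return [d_dir] + [f - reference for f in f_values[2:]]


# Hypothetical 4-arm trial reordered as: (x2, A), (x3, B), Placebo, (x1, A)
cov = multiarm_covariance(n_arms=4, sigma=0.5)
means = multiarm_means(d_dir=0.3, f_values=[1.0, 1.4, 0.0, 0.6])
print(means)
print(cov)
```

The placebo arm has a dose-response prediction of zero, so its mean relative to the new reference arm is simply the negative of the reference arm's prediction.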

The covariance structure reflects the fact that the multiarm trial cannot be inconsistent with itself, the trial contributes to the direct estimate for the contrast that is split, and the remainder of the trial arms contribute to the estimation of the dose-response MBNMA model which is used to form the indirect estimate.

3.4 Implementation

The models are implemented using the MBNMAdose package version 0.3.1 [12] in R version 4.02, which uses Bayesian Markov chain Monte Carlo (MCMC) simulation in JAGS [25]. R code for all analyses can be found on GitHub (https://github.com/hugaped/Dose-response-MBNMA-consistency). We illustrate a global assessment of consistency and methods for node-splitting at the treatment-level in the psoriasis dataset. We focus our analysis on two different dose-response functions, to demonstrate the impacts of a well-fitting (Emax) and a poorly fitting (exponential) dose-response model. We also include results for a simple linear dose-response model (very poorly fitting), as readers will be most familiar with this. For global testing of inconsistency using UME, we also fit an agent-level NMA that lumps different doses of agents together assuming a common effect. Common and random treatment effect models were fitted and compared.

Vague normal prior distributions ($N(0, 1000)$) were given to direct treatment parameters $d^{Dir}_{(x_c,a_c),(x_k,a_k)}$, nuisance parameters $\mu_i$, and dose-response parameter $\lambda_{a_{i,k}}$ for exponential MBNMA models. For Emax MBNMA models, a correlation was modeled between dose-response parameters by assigning them a multivariate normal prior:

$$\begin{pmatrix} Emax_{a_{i,k}}\\ \log\left(ED50_{a_{i,k}}\right)\end{pmatrix} \sim MVN(\mathbf{0}, \Sigma). \tag{9}$$

For $ED50_{a_{i,k}}$ parameters it was necessary to ensure that they only took positive values, so prior distributions were specified on the log-scale. $\Sigma$ is the covariance matrix, and its inverse (the precision matrix) was assigned a minimally informative Wishart distribution as a prior, with 2 degrees of freedom and a scale matrix $\Omega = \begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix}$ ($\Sigma^{-1} \sim \mathrm{Wishart}(\Omega, 2)$). The between-study SD, $\sigma$, was given a uniform prior distribution ($U(0, 5)$). Sensitivity of the results to the choice of prior distributions is reported in Supplementary Materials.

Models were run for 50 000 burn-in iterations and monitored for 50 000 iterations on 4 chains. Trace plots, autocorrelation and Brooks–Gelman–Rubin plots were used to assess convergence.26 Unless otherwise stated, results are presented as posterior medians and 95% credible intervals (95% CrI).

All models were compared using DIC, with a difference of three points or greater being considered as meaningful. DIC was calculated using the Kullback–Leibler divergence for estimation of pD.27
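The DIC comparison can be made concrete with a small sketch that applies the three-point rule to the common and random effects DIC values reported in Table 1. The tie-breaking rule used here (keep the simpler common effects model unless the random effects model lowers DIC by three or more points) is an assumption for illustration; the text only states that differences of three or more are considered meaningful.

```python
DIC_THRESHOLD = 3.0  # difference regarded as meaningful in the text

# DIC values for common and random effects versions of each model (Table 1)
dic_values = {
    "NMA treatment-level": {"common": 470.4, "random": 473.2},
    "NMA agent-level": {"common": 672.2, "random": 486.0},
    "MBNMA linear": {"common": 1702.3, "random": 497.9},
    "MBNMA exponential": {"common": 669.3, "random": 484.4},
    "MBNMA Emax": {"common": 469.7, "random": 471.7},
}


def select_effects_model(dic_common, dic_random, threshold=DIC_THRESHOLD):
    """Assumed rule: prefer common effects unless random effects is better by >= threshold."""
    if dic_common - dic_random >= threshold:
        return "random"
    return "common"


selected = {
    model: select_effects_model(values["common"], values["random"])
    for model, values in dic_values.items()
}

for model, choice in selected.items():
    print(f"{model}: {choice} effects")
```

Under this assumed rule the choices agree with the "Selected" column of Table 1.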

4 RESULTS

Model fit statistics for all models are reported in Table 1. Heterogeneity was found to be very low when fitting a treatment-level NMA, leading to the selection of a common effects model based on DIC. MCMC chains struggled to converge for the between-study SD in the random effects Emax MBNMA; however, model fit statistics did converge and indicated that the common effects model gave a lower DIC and that model fit was adequate (posterior mean residual deviance 75.6 compared with 70 data points). Lumping doses at the agent-level or using an exponential or linear dose-response function resulted in high heterogeneity and the selection of random treatment effect models in all three instances.

TABLE 1. Model fit statistics for all models performed on the psoriasis network

| Model | Treatment effect model | Posterior mean residual deviance (a) | pD (b) | DIC (c) | Between-study SD (95% CrI) | Selected (based on DIC) |
|---|---|---|---|---|---|---|
| NMA treatment-level | Common | 79.4 | 37.7 | 470.4 | - | Yes |
| NMA treatment-level | Random | 70.4 | 43.9 | 473.2 | 0.08 (0.003, 0.24) | No |
| NMA agent-level | Common | 288.8 | 30 | 672.2 | - | No |
| NMA agent-level | Random | 68.2 | 59.2 | 486 | 0.49 (0.37, 0.66) | Yes |
| MBNMA linear | Common | 1318.3 | 30 | 1702.3 | - | No |
| MBNMA linear | Random | 70.7 | 68.7 | 497.9 | 1.41 (1.12, 1.82) | Yes |
| MBNMA exponential | Common | 285.4 | 30.8 | 669.3 | - | No |
| MBNMA exponential | Random | 68.4 | 57.5 | 484.4 | 0.49 (0.37, 0.65) | Yes |
| MBNMA Emax | Common | 75.6 | 35.4 | 469.7 | - | Yes |
| MBNMA Emax (d) | Random | 71.5 | 41.5 | 471.7 | - | No |
| UME treatment-level | Common | 80.2 | 43.2 | 476.7 | - | - |
| UME treatment-level | Random | 70.9 | 47.2 | 476.7 | 0.10 (0.002, 0.26) | - |

(a) Compared to 70 data points.
(b) pD: The effective number of parameters calculated using the Kullback–Leibler divergence [27].
(c) DIC: Deviance information criterion = pD + residual deviance (Dbar).
(d) Model did not fully converge for all parameters.

Among selected models, DIC was higher in those with higher heterogeneity due to the additional number of effective parameters. Both the MBNMA Emax and treatment-level NMA models had similar DIC which was lower than for other models fitted. The slightly lower residual deviance in the treatment-level NMA model was offset by the fewer parameters required in the Emax MBNMA model. This suggested that the Emax dose-response relationship gave an adequate fit to the data. Predicted probabilities of response for common effects treatment-level NMA, random effects agent-level NMA, random effects exponential MBNMA, random effects linear MBNMA, and common effects Emax MBNMA are shown at different doses for each agent in Figure 5.

FIGURE 5. Predicted responses for different agents in the network based on results from a common effects treatment-level NMA (thick vertical lines and points), a random effects agent-level NMA (solid horizontal line with red 95% CrI), a common effects Emax MBNMA (dotted curve with blue 95% CrI), a random effects Exponential MBNMA (short dashed curve with green 95% CrI) and a random effects Linear MBNMA (long dashed curve with purple 95% CrI). Predictions were calculated from relative effects applied to a probability of placebo response of 0.05. Predicted responses are shown as posterior medians and 95% CrIs.
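The step from a relative effect to a predicted probability of response can be sketched as follows. The sketch assumes that relative effects are on the log-odds scale (a logit link), which is not stated in this section; the relative effect of 3.0 is a hypothetical value, and only the placebo response of 0.05 comes from the caption.

```python
import math


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1 - p))


def expit(x):
    """Inverse of the logit function."""
    return 1 / (1 + math.exp(-x))


def predicted_response(relative_effect, placebo_prob=0.05):
    """Apply a log-odds relative effect (assumed scale) to the placebo response."""
    return expit(logit(placebo_prob) + relative_effect)


p_placebo = predicted_response(0.0)  # zero relative effect recovers the placebo response
p_active = predicted_response(3.0)   # hypothetical relative effect vs placebo
print(p_placebo, p_active)
```

Applying this transformation to every posterior sample of a relative effect, and then taking the median and the 2.5% and 97.5% quantiles, would give the kind of summaries described in the caption.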

Points below the line of equality in the deviance-deviance plots for the common effects treatment-level NMA model against each of the other fitted common effects models indicate lack of fit for the dose-response model assumed. For the Emax MBNMA the deviance-deviance plot showed generally good fit, though there was some lack of fit for studies 15 and 16 that compared different doses of Ustekinumab (Figure 6A). In contrast, the plots showed poor fit for the exponential MBNMA and the agent-level NMA models, suggesting that these do not explain the dose-response relationship. The random effects versions of these models nevertheless fitted well, because the lack of fit is captured as heterogeneity in these models (Figure S2). Deviance-deviance plots are shown separately for common and random effects linear MBNMA models (Figure S3). These showed a very similar trend to the exponential MBNMA models.

FIGURE 6. Deviance-deviance plots showing the contribution to the posterior mean deviance for each data point from different common treatment effect models compared to treatment-level NMA (A) and UME (B) models. Comparisons in (B) are to UME models. For better visibility points which contribute up to 15 to the posterior mean deviance are plotted, meaning that 10 data points that contribute >15 to the posterior mean deviance have been excluded from MBNMA exponential and NMA agent-level plots. Red stars represent data points described in the text corresponding to arm 1 in studies 15 and 16 that compare different doses of Ustekinumab. The red diagonal line represents the line of equality, at which deviance is equal in both consistency and UME models.

4.1 Global assessment of inconsistency

For global assessment of inconsistency, the posterior mean residual deviances were similar for all selected models and the treatment-level UME model that relaxed the consistency assumption (Table 1). However, whilst there was no evidence of inconsistency for the treatment-level NMA and the Emax MBNMA models based on DIC, for the exponential MBNMA and agent-level NMA the DIC were considerably higher than for the UME model, suggestive of inconsistency in these models driven by the high levels of heterogeneity.

Deviance-deviance plots are shown for each model compared to the treatment-level UME (Figure 6B). Whilst the treatment-level NMA model showed no evidence of substantially poorer fit for any data points, most data points in all other models showed considerably higher contributions to the posterior mean deviance compared to the treatment-level UME than when compared to the treatment-level NMA, which may be suggestive of inconsistency (Figure 6A), but may also be due to incorrect specification of the dose-response relationship.

There were two data points in particular in the Emax MBNMA which appeared to have a substantially higher contribution to the posterior mean deviance than in the treatment-level UME. These points correspond to studies comparing placebo with different doses of Ustekinumab (study numbers 15 and 16), which were also identified in Figure 6A as having a poorer fit for the dose-response relationship than other data points.

For the exponential MBNMA and agent-level NMA there were a large number of data points for which deviance contributions were higher when common treatment effects models were compared to a common effects UME model (Figure 6B). However, these differences were no longer discernible when random treatment effects models were compared (Figure S1), since they were masked by high heterogeneity which resulted in lower posterior mean deviances. This could be identified by examining the reduction in between-study SD and DIC for the random effects UME model compared to the random effects exponential MBNMA and treatment-level NMA models (Table 1). For common and random effects Linear MBNMA models, the trend was similar to that observed for common and random effects exponential MBNMA models, but with more extreme differences in deviance contributions from UME models arising from poorer fit of the dose-response function (Figure S3).

Given that there is poor dose-response fit for these models, it is difficult to further confirm if there is also inconsistency in these models. Node-splitting can be used to provide a local test to further explore the potential inconsistency identified.

4.2 Node-splitting

In the psoriasis dataset there was direct evidence for 34 different comparisons. However, parameterising comparisons in each study as relative effects compared to the study reference arm (such that a 3-arm study is parameterised by two relative effects and a study reference treatment effect) resulted in 20 different comparisons for which direct evidence was available. Both direct and indirect evidence were available for 4 of these 20 comparisons, meaning that node-splitting was possible for these in both MBNMA and NMA (Table S3 and Figure S4). For these comparisons, indirect estimates made use of evidence from both the consistency relationship via a pathway of direct comparisons, and from the dose-response relationship.

For all four comparisons both Emax MBNMA and treatment-level NMA models showed no evidence of inconsistency between direct and indirect evidence (Figure 7). For three of the comparisons (comparisons 1, 2, and 4) direct and indirect estimates from the Emax MBNMA were more similar than in the treatment-level NMA. However, inconsistency was identified in the exponential and linear MBNMAs for the comparison of Ixekizumab 40 mg/wk vs Ixekizumab 20 mg/wk (comparison 1) (Figures 7 and S5). For the other three comparisons (comparisons 2-4) there was no evidence of inconsistency in the exponential or linear MBNMAs.

FIGURE 7. Estimates of direct, indirect and overall evidence for each comparison from each model for which node-splitting was possible. Bayesian p-values are given to the right of the plot, which show the agreement between direct and indirect estimates. Interventions are coded according to the first two letters of the agent and dose (mg/week).
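One common way to compute a Bayesian p-value for a node-split is from the proportion of MCMC iterations in which the direct estimate exceeds the indirect estimate. The sketch below uses that two-sided definition as an assumption, since the exact calculation is not given in this section, and it uses simulated normal draws in place of real posterior samples.

```python
import random


def bayesian_p_value(direct_samples, indirect_samples):
    """Two-sided p-value from the share of iterations with direct > indirect (assumed definition)."""
    differences = [d - i for d, i in zip(direct_samples, indirect_samples)]
    prop_positive = sum(diff > 0 for diff in differences) / len(differences)
    return 2 * min(prop_positive, 1 - prop_positive)


# Simulated stand-ins for posterior samples (hypothetical values)
rng = random.Random(1)
direct = [rng.gauss(1.0, 0.3) for _ in range(4000)]
indirect_close = [rng.gauss(1.1, 0.3) for _ in range(4000)]
indirect_far = [rng.gauss(3.0, 0.3) for _ in range(4000)]

p_consistent = bayesian_p_value(direct, indirect_close)
p_inconsistent = bayesian_p_value(direct, indirect_far)
print(p_consistent, p_inconsistent)
```

A small p-value indicates that the direct and indirect posterior distributions barely overlap, which is the pattern read as evidence of inconsistency.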

A further nine comparisons (comparisons 5-13) made use of indirect evidence that was only available via the dose-response relationship, providing additional potential for node-splitting in MBNMA that was not possible in the treatment-level NMA (Table S3 and Figure S4). For estimating indirect evidence via the dose-response relationship, the number of available doses is critical, as this will limit the complexity of the dose-response function that can be fitted. The number of doses for each agent in the comparison is shown in Table S3. To fit an Emax dose-response function for all agents, we would need at least three doses of each agent available in the indirect evidence. Given that evidence for the direct effect is separated from evidence used to estimate the indirect effect, this reduces the information available to estimate each of the effects. This means that fitting an Emax MBNMA to the indirect evidence is only possible for comparisons 1 to 4 (for which indirect evidence is also available via a pathway of direct comparisons, as discussed above) and additionally for Ustekinumab 5.62 mg/wk vs Ixekizumab 30 mg/wk (comparison 5). For this comparison, the Emax MBNMA did not show any evidence of inconsistency (Figure 7).

However, if we fit an exponential or linear dose-response function, only two doses for each agent in the indirect evidence are needed to estimate dose-response functions for all agents, which means that we can estimate indirect evidence for all nine indirect dose-response comparisons. Out of the nine comparisons that could be analyzed using a single parameter dose-response relationship, inconsistency was identified in two comparisons (comparisons 6-7) that could be analyzed using an exponential MBNMA and one comparison (comparison 10) that could be analyzed using a linear MBNMA (Figures 7 and S5). Notably, the comparisons in which inconsistency was identified for the different MBNMAs were not the same. For other comparisons there was no evidence of inconsistency.
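The dose-count requirement described above can be written as a simple check: a comparison can be split via the dose-response relationship only if every agent involved retains enough doses in the indirect evidence for the chosen function. The minimum counts follow the text (three for Emax, two for exponential or linear); the helper name and the dose counts in the example are hypothetical and are not taken from Table S3.

```python
# Minimum number of doses per agent needed in the indirect evidence (from the text)
MIN_DOSES = {"emax": 3, "exponential": 2, "linear": 2}


def can_split_via_dose_response(doses_per_agent, dose_response):
    """True if every agent has enough doses left to estimate the chosen function."""
    needed = MIN_DOSES[dose_response]
    return all(n_doses >= needed for n_doses in doses_per_agent.values())


# Hypothetical comparison: doses remaining per agent once direct evidence is removed
comparison = {"agent_A": 2, "agent_B": 3}

for fun in MIN_DOSES:
    print(fun, can_split_via_dose_response(comparison, fun))
```

In practice the dose counts would come from the output of inconsistency.loops(), which the text notes reports the number of doses of each agent available for the indirect estimate.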

Overall, this suggested no inconsistency in treatment-level NMA or Emax MBNMA, but that fitting an exponential or linear MBNMA could lead to inconsistency in existing node-splits in which indirect evidence was estimated by both the consistency assumption and dose-response relationship, and that this could also lead to inconsistency in additional comparisons for which indirect evidence was only informed by the dose-response relationship.

5 DISCUSSION

Ignoring the effects of different doses is a common cause of heterogeneity and inconsistency in NMA. Dose-response MBNMA can reduce heterogeneity and inconsistency by modeling a dose-response relationship. However, it is still important to critique model fit and check consistency assumptions where possible. We have presented several methods to assess model consistency both globally and for specific contrasts.

The methods were illustrated with a dataset of trials comparing different agents for the treatment of moderate-to-severe psoriasis, which had features that are typical in dose-response MBNMA, comprising (i) phase 2b dose-ranging studies, in which multiple doses of a single agent are compared with Placebo in multiarm trials, perhaps with a single (licensed) dose of a competitor also included in the study; and (ii) phase 3 studies that compare two active agents head to head, each at a single dose. Although in this dataset there were reasonable opportunities for assessing consistency, networks composed more exclusively of phase 2b dose-ranging studies may have more limited scope for this. The very features of such studies (multiarm trials, comparisons with placebo) are important mechanisms that guard against inconsistency, and so should not be viewed as a problem, but rather as part of the solution to avoid inconsistency in the first place. Only 5 out of 23 studies in the network were 2-arm studies, and the other 18 were multi(≥3)-arm studies comparing multiple doses of one or more agents.

We did not find any evidence of inconsistency in the psoriasis dataset when models were used that fitted the data appropriately (treatment-level NMA, Emax MBNMA), but the Emax MBNMA had the advantage that at several doses relative effects were estimated more precisely. However, when an exponential or linear dose-response relationship was used this led to a high level of heterogeneity and the identification of inconsistency within multiple comparisons in the network. Tests for consistency are inextricably linked to model fit and therefore depend on the dose-response relationship assumed. Thorough assessment of model fit for a range of alternative dose-response relationships is therefore recommended, and we advise selection of the best fitting dose-response function prior to comparing common vs random treatment effect models, as the impacts of an inappropriate dose-response function can be masked by fitting random effect models with resulting high heterogeneity.

In the presence of high heterogeneity (such as in exponential MBNMA, linear MBNMA, and agent-level NMA models) a global assessment by comparison with treatment-level NMA and UME models may fail to clearly identify evidence of inconsistency, since inconsistency can instead manifest as high heterogeneity.3, 28 Node-splitting can be a useful tool for highlighting the difference between these forms of heterogeneity by specifically testing for inconsistency on the set of comparisons for which direct and indirect evidence contributions are available.

In MBNMA node-splitting models, indirect estimates can sometimes be obtained via the dose-response relationship even when a pathway of direct evidence is not available (comparisons 5-13) and indirect estimates are not possible for standard NMA. The dose-response relationship can be used in the same way to estimate effects in otherwise disconnected networks.5 This does not specifically require the inclusion of placebo data, and the methods can be used in the absence of this, although placebo arms typically provide a lot of information to the dose-response relationship, particularly for parameters relating to the efficacy at lower doses (eg, ED50a). In networks with limited evidence at lower doses the indirect estimates via the dose-response relationship may be biased in some circumstances, which may impact the assessment of inconsistency using node-splitting.5

Simpler dose-response functions provide more potential to obtain indirect estimates via the dose-response relationship and therefore increase the availability of comparisons on which node-splitting is possible, yet if the true underlying dose-response relationship requires a more complex function, fitting a simpler function may introduce bias. This likely explains the inconsistency identified by node-splitting in several comparisons for exponential and linear MBNMA models. Ensuring that comparisons are only split if the indirect evidence contains sufficient information with which to estimate the same dose-response function as used in the MBNMA consistency analysis limits this bias. In the psoriasis example for our selected Emax MBNMA we would therefore in practice only choose to split comparison 5 (Ustekinumab 5.62 mg/wk vs Ixekizumab 30 mg/wk) in addition to the comparisons for which indirect evidence is available via consistency relationships (comparisons 1-4), given the doses available in the dataset for both agents in this comparison.

We have also noted that the effects of inconsistency and poor dose-response fit on deviance contributions for individual data points can be aggregated, leading to substantially higher contributions to the posterior mean deviance in an MBNMA model compared to the treatment-level UME. This does not appear to affect the total fit of the model, suggesting that it is balanced by improved fit of other data points. However, it is important to be aware of the causes of this when inspecting deviance-deviance plots and to identify to what degree the difference is caused by inconsistency vs fit of the dose-response curves. Given that there was no evidence of inconsistency in node-splitting for the Emax MBNMA, the issues identified in deviance-deviance plots for studies 15 and 16 that included Ustekinumab were most likely due to poor dose-response fit rather than inconsistency, and it may be that the dose-response relationship for this agent is different from that for other biologics. However, given that the dose-response function for Ustekinumab implied by the treatment-level NMA results would be nonmonotonic (Figure 5: the posterior median for the relative effect of Ustekinumab 5.62 mg/wk is lower than at both 3.75 and 7.5 mg/wk), there may be some argument to suggest that the results of the Emax MBNMA at these doses are more biologically plausible.

Whilst we describe methods for identifying inconsistency in dose-response MBNMA, we have not yet suggested steps to take if it is found. Given the link between inconsistency and dose-response fit that we have shown here, our first suggestion would be to ensure that the fit of the dose-response function is reasonable. If this is not the source of inconsistency then other c
