Erosions on T1-Weighted Magnetic Resonance Imaging Versus Radiography of Sacroiliac Joints in Recent-Onset Axial Spondyloarthritis: 2-Year Data (EMBARK Trial and DESIR Cohort)

Abstract

Objective. (1) To compare the capacity to detect sacroiliac joint (SIJ) erosions and baseline-to-week 104 change in erosions between magnetic resonance imaging (MRI) and radiographs in recent-onset axial spondyloarthritis (axSpA); and (2) to compare treatment-discriminatory capacities of MRI and radiographic scores for erosion detection in patients receiving etanercept in the Effect of Etanercept on Symptoms and Objective Inflammation in Nonradiographic axSpA (EMBARK) trial vs controls in the DESIR (Devenir des Spondylarthropathies Indifférenciées Récentes) cohort.

Methods. Anonymized SIJ MRI and radiographs were assessed at patient and joint surface levels. Three readers evaluated MRI; 3 different readers evaluated radiographs. Final scores for comparison of radiographs and MRI for detection of erosions were assigned based on agreement of ≥ 2 of 3 readers’ assessments.

Results. At baseline, discordance in erosion detection between imaging methods was more frequent for MRI erosions in the absence of radiographic erosions (48/224 [21.4%] patients) than for radiographic erosions in the absence of MRI erosions (14/224 [6.3%] patients; P < 0.001). After 104 weeks, a decrease in erosions was observed on MRI but not radiographs in 49/221 (22.2%) patients, and on radiographs but not MRI in 6/221 (2.7%) patients (P < 0.001). In the treatment-discriminant capacity analysis, the largest standardized differences between etanercept and control cohorts at week 104 were changes in Spondyloarthritis Research Consortium of Canada MRI erosion discrete score, changes in erosion average score, and meeting the modified New York criteria on radiographs, with unadjusted/adjusted Hedges G effect sizes of 0.40/0.50, 0.40/0.56, and 0.40/0.43, respectively.

Conclusion. In recent-onset axSpA, SIJ erosions and erosion change were observed more frequently on MRI than radiography. The significance of interval improvement of MRI erosions warrants further research. [ClinicalTrials.gov: NCT01258738, NCT01648907]

In patients with axial spondyloarthritis (axSpA), structural damage at the sacroiliac joint (SIJ) is evaluated using radiographs, computed tomography (CT), or magnetic resonance imaging (MRI).1 Assessment of the SIJ using plain radiography lacks reliability, particularly for assessment of change, and it also lacks sensitivity to change.2,3 In patients with recent-onset axSpA, abnormalities of the SIJ may not (yet) be evident on radiography.4 A 5-year evaluation of a French prospective cohort study (DESIR [Devenir des Spondylarthropathies Indifférenciées Récentes]) of patients with symptoms of recent-onset axSpA demonstrated worsening and improvement of radiographic sacroiliitis; net progression was observed on radiography in only 13% of patients.5

MRI can detect several types of structural lesions, including erosions.6 Some published studies have shown that sensitivity to detect structural lesions in the SIJ is greater with T1-weighted MRI than with radiography, particularly for erosions.2,7 However, in patients with recent-onset axSpA, the advantage of MRI assessment of SIJ structural lesions over radiography is unclear.2,8

We compared baseline and week 104 T1-weighted MRI and radiographs of the SIJ from patients with recent-onset axSpA receiving the tumor necrosis factor inhibitor etanercept (ETN) in a clinical trial (Effect of Etanercept on Symptoms and Objective Inflammation in Nonradiographic axSpA [EMBARK]), and from similar patients not receiving biologics in the DESIR study. The objectives were (1) to compare the capacity to detect SIJ erosions and baseline-to-week 104 change in erosions between MRI and radiographs; and (2) to compare the treatment-discriminatory capacities (treatment effect sizes) of MRI and radiographic scores for detection of erosions in patients receiving ETN in EMBARK vs controls in DESIR.

We hypothesized that, compared with radiography, MRI would be a more sensitive method of detecting change from baseline in erosions, and be better able to discriminate the changes in erosion seen with treatment vs no treatment.

METHODS

Patient population. Details of the EMBARK clinical trial8-10 and the DESIR observational cohort study11-13 have been published (see Supplementary Material, available with the online version of this article). Patients from EMBARK who had ≥ 1 dose of the study drug with available baseline radiographic and MRI data are included in this post hoc analysis. Patients from DESIR were included if they had not received any biologic therapy during the first 2 years of follow-up, met the Assessment of Spondyloarthritis international Society (ASAS) criteria for axSpA, and had baseline radiographic and MRI data. We adjusted for the following baseline values: symptom duration, HLA-B27 status, and continuous baseline of imaging (MRI or radiographic) endpoint.

Ethics. Studies were conducted according to the International Conference on Harmonisation guidelines for Good Clinical Practice and the ethical principles of the Declaration of Helsinki. Institutional review board (IRB) approval and informed consent from participants were obtained prior to the studies’ start. Details regarding the IRB approvals are previously published.13 [EMBARK ClinicalTrials.gov identifier: NCT01258738; DESIR ClinicalTrials.gov identifier: NCT01648907]

MRI and radiographic assessments. MRI scans from both cohorts were anonymized, combined, and read per patient, with readers unaware of imaging chronology and original cohort. The same was done for regular radiographs of the anteroposterior pelvis. Three readers read baseline and week 104 MRI (WPM, MdH, RGL), and 3 readers read baseline and week 104 radiographs (AM, MdH, PC); assessments were conducted independently.

MRIs were evaluated using the Spondyloarthritis Research Consortium of Canada (SPARCC) SIJ structural score.6 This assesses erosions in SIJ quadrants (right and left iliac bone, right and left sacrum) of 5 consecutive slices that depict the cartilaginous portion of the SIJ starting from the transitional slice. The score also includes backfill, fat lesions, and ankylosis, but not sclerosis, which is considered nonspecific for axSpA. The erosion score ranges from 0 to 8 for each of 5 consecutive slices of the SIJ, for a total score of 0-40.6 Since backfill and fat lesions are not visible by radiography, information on these lesions was not included for the MRI or radiography analyses. Radiographic erosion was scored by experienced readers (AM, MdH, PC) as present/absent (1/0) at each of the 4 joint surfaces (right and left iliac, right and left sacrum) for a total score of 0-4. This methodology was used because the modified New York (mNY) radiographic grading for sacroiliitis, normally used by rheumatologists, assesses a composite of all radiographic features of sacroiliitis (not just erosion).

Comparison of detection of MRI erosion vs radiographic erosion or sacroiliitis. For MRI and radiographs, at baseline and at week 104, erosion was categorized as present (score > 0) or absent (score = 0) at each timepoint following agreement by ≥ 2 of 3 readers at the patient level and at joint surface level. Presence/absence of radiographic sacroiliitis was similarly determined at each timepoint, at patient level only. Change in erosion from baseline to week 104 was categorized as either a decrease in erosion (change in score < 0 per reader with ≥ 2 of 3 readers’ agreement) for both MRI and radiographic erosion (including cases in which none of the readers agreed; score = 0), or an increase in erosion (change in score > 0 following agreement by ≥ 2 of 3 readers). Change in erosion score was calculated at patient level and each of the 4 joint surfaces. For MRI assessment, each patient’s left ilium score was calculated as the sum of left upper ilium and left lower ilium. The same was done for each right ilium (sum of right upper ilium + right lower ilium), and for left sacrum and right sacrum. For each patient, change in erosion was calculated for left ilium, right ilium, left sacrum, and right sacrum. Change in radiographic sacroiliitis was calculated at patient level only.
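The ≥ 2 of 3 reader rule for presence/absence can be illustrated with a minimal Python sketch. The function name `erosion_present` and the use of `None` for a missing reader score are assumptions made for illustration; they are not taken from the study's analysis code.

```python
from typing import Optional, Sequence


def erosion_present(reader_scores: Sequence[Optional[float]]) -> Optional[bool]:
    """Classify erosion as present/absent by agreement of >= 2 of 3 readers.

    Each reader's score counts as "present" when it is > 0 and "absent"
    when it is 0. A missing reader score is passed as None (hypothetical
    convention). Returns None when fewer than 2 readers agree.
    """
    votes = [score > 0 for score in reader_scores if score is not None]
    n_present = sum(votes)
    n_absent = len(votes) - n_present
    if n_present >= 2:
        return True
    if n_absent >= 2:
        return False
    return None


# Two readers score erosion > 0, one scores 0: classified as present.
print(erosion_present([3, 0, 1]))
# Only one reader scores erosion > 0: classified as absent.
print(erosion_present([0, 0, 5]))
```

The same function applies at patient level (total score) or at joint surface level (for MRI, the sum of the upper and lower quadrant scores for that surface).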

Comparison of MRI and radiographic endpoints for treatment discrimination. Hedges G treatment effect size for the assessment of treatment-discriminatory capacity was calculated on the following MRI and radiographic endpoints (Supplementary Table S1, available with the online version of this article). MRI endpoints are as follows:

•    Change in SPARCC MRI erosion discrete score. Patient assigned a value of −1 (change < 0), 0 (change = 0), or 1 (change > 0), based on ≥ 2 of 3 readers agreeing with the change category/value. If no readers agreed, change = 0 value was assigned.

•    Average change in SPARCC MRI erosion score for all 3 readers per patient (range, −40 to +40).
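As an illustration of the two MRI endpoints listed above, the following Python sketch derives the discrete change score and the average change score from per-reader baseline and week 104 scores. The function names and input layout are hypothetical, and complete data from all 3 readers are assumed.

```python
from statistics import mean
from typing import Sequence


def _sign(value: float) -> int:
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def discrete_change(baseline: Sequence[float], week104: Sequence[float]) -> int:
    """Discrete change score: -1, 0, or 1 if >= 2 of 3 readers agree on that
    change category; 0 if no two readers agree."""
    signs = [_sign(w - b) for b, w in zip(baseline, week104)]
    for category in (-1, 0, 1):
        if signs.count(category) >= 2:
            return category
    return 0  # none of the readers agree: assigned change = 0


def average_change(baseline: Sequence[float], week104: Sequence[float]) -> float:
    """Average of the readers' change scores (possible range -40 to +40)."""
    return mean(w - b for b, w in zip(baseline, week104))


# Readers' SPARCC erosion scores at baseline and week 104 (made-up values).
print(discrete_change([4, 6, 2], [2, 3, 2]))   # two readers see a decrease
print(average_change([4, 6, 2], [2, 3, 2]))
```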

Radiographic endpoints:

Radiographic changes were analyzed in accordance with the conventional mNY scoring system, with the possibility for analysis as a dichotomous variable (ie, fulfillment of the mNY radiographic criteria [at least grade 2 bilateral or grade 3 unilateral]) or as a continuous variable (ie, a 0-4 grade for each [left and right] SIJ, resulting in a total 0-8 score for a specific timepoint and a −8 to +8 score for changes observed between 2 timepoints). Changes between 2 timepoints were assessed by evaluating patients with a change in mNY score ≥ 1 grade in at least 1 SIJ, and patients with negative baseline mNY radiographic criterion who turned positive after 2 years. Since radiographic sacroiliitis as assessed using the mNY criteria is a composite of all structural features observed on radiographs, we considered it important to also assess individual radiographic lesions to determine which ones change over the 2-year time frame of the study, and in which direction (increase or decrease). Radiographic structural parameters were evaluated as follows:

•    Erosion: radiographic erosion (present/absent [1/0] at each of the 4 joint surfaces [right and left iliac, right and left sacrum]) for an overall total score of 0-4.

•    Ankylosis: Change in radiographic ankylosis, average score (range for change: −2 to 2).

•    Sclerosis: Change in radiographic sclerosis, average score (range for change: −4 to 4).

•    Joint space: Change in radiographic joint space narrowing/widening, average score (range for change: −2 to 2).

•    Erosion: Change in radiographic erosion, average score (range for change: −4 to 4).

•    Sacroiliitis: Change in radiographic sacroiliitis, average score (range for change: −8 to 8).

•    Erosion: Change in radiographic erosion, discrete score (range for change: −4 to 4).
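The dichotomous and continuous mNY variables described above can be sketched in Python as follows. This is a minimal illustration assuming one 0-4 grade per SIJ from a single reading; the function names are hypothetical.

```python
def meets_mny_criteria(left_grade: int, right_grade: int) -> bool:
    """Dichotomous mNY variable: at least grade 2 bilaterally,
    or at least grade 3 unilaterally."""
    bilateral = left_grade >= 2 and right_grade >= 2
    unilateral = left_grade >= 3 or right_grade >= 3
    return bilateral or unilateral


def mny_total(left_grade: int, right_grade: int) -> int:
    """Continuous mNY variable: sum of the left and right grades (0-8)."""
    return left_grade + right_grade


# A patient graded 2 (left) and 1 (right) at baseline, 2 and 2 at week 104,
# would turn from mNY-negative to mNY-positive with a total score change of +1.
print(meets_mny_criteria(2, 1), meets_mny_criteria(2, 2))
print(mny_total(2, 2) - mny_total(2, 1))
```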

Statistical analysis. Baseline characteristics between cohorts were compared using the Wilcoxon rank-sum test for continuous characteristics and the Mantel-Haenszel chi-square test for categorical characteristics.

Concordance between MRI and radiography was evaluated using the kappa coefficient of agreement (κ). Agreement was interpreted as κ < 0.20, poor; 0.21-0.40, fair; 0.41-0.60, moderate; 0.61-0.80, good; and 0.81-1.00, very good.14 The McNemar test assessed whether the proportion of pairs with erosions detected on MRI (erosion present on both MRI and radiographs + present on MRI but not on radiographs) was the same as the proportion of pairs with erosions detected on radiographs (erosion present on both MRI and radiographs + present on radiographs but not on MRI).
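To make the concordance statistics concrete, the sketch below computes the kappa coefficient and a McNemar P value from a 2 × 2 table, using the baseline patient-level counts reported in the Results (50 both present, 48 MRI only, 14 radiograph only, 112 both absent). The function names are hypothetical, and the McNemar variant shown (chi-square with 1 degree of freedom, no continuity correction) is an assumption; the article does not state which variant was used.

```python
import math


def cohen_kappa_2x2(both: int, mri_only: int, rad_only: int, neither: int) -> float:
    """Kappa coefficient of agreement for a 2 x 2 MRI-vs-radiograph table."""
    n = both + mri_only + rad_only + neither
    observed = (both + neither) / n
    mri_pos = both + mri_only
    rad_pos = both + rad_only
    # Agreement expected by chance from the marginal totals.
    expected = (mri_pos * rad_pos + (n - mri_pos) * (n - rad_pos)) / n**2
    return (observed - expected) / (1 - expected)


def mcnemar_p(mri_only: int, rad_only: int) -> float:
    """McNemar test on the discordant pairs (assumed variant: chi-square,
    1 degree of freedom, no continuity correction)."""
    chi2 = (mri_only - rad_only) ** 2 / (mri_only + rad_only)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


kappa = cohen_kappa_2x2(both=50, mri_only=48, rad_only=14, neither=112)
p_value = mcnemar_p(mri_only=48, rad_only=14)
print(round(kappa, 2))    # 0.42, matching the reported baseline value
print(p_value < 0.001)    # True
```

Only the discordant cells enter the McNemar statistic, which is why a table can show fair or moderate agreement and still a strongly significant asymmetry between the two modalities.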

Two treatment-discriminant capacity analyses using treatment effect sizes determined if larger differences between treatments for change from baseline to week 104 could be detected for MRI endpoints than for radiographic endpoints, including mean change in continuous, discrete, and binary endpoints. One analysis did not adjust for baseline covariates, whereas the other adjusted for the following baseline values: symptom duration, HLA-B27 status, and continuous baseline of endpoint. The unadjusted model was a 1-way ANOVA of MRI or radiographic endpoints with treatment as a factor. The adjusted model was an ANCOVA that contained covariates along with the treatment factor. For each MRI and radiographic endpoint, unadjusted and adjusted within-treatment and between-treatment effect sizes were calculated. Within-treatment effect sizes assessed the sensitivity to change in erosions at week 104; between-treatment effect sizes (Hedges G effect sizes) evaluated treatment-discriminant capacity. Effect sizes were interpreted as follows: ≥ 0.2, small; ≥ 0.5, medium; and ≥ 0.8, large.15
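The unadjusted between-treatment effect size can be sketched in Python as the absolute difference in mean change divided by the pooled SD, which is how the Figure 5 footnote describes it. The function names, the optional small-sample correction flag, and the example data are illustrative assumptions; the covariate-adjusted (ANCOVA) version is not shown.

```python
import math
from statistics import mean, variance
from typing import Sequence


def hedges_g(changes_a: Sequence[float], changes_b: Sequence[float],
             small_sample_correction: bool = False) -> float:
    """Absolute difference in mean change divided by the pooled SD.

    The optional correction factor 1 - 3 / (4(n1 + n2) - 9) is the common
    small-sample adjustment; whether the study applied it is not stated.
    """
    n1, n2 = len(changes_a), len(changes_b)
    pooled_var = ((n1 - 1) * variance(changes_a)
                  + (n2 - 1) * variance(changes_b)) / (n1 + n2 - 2)
    g = abs(mean(changes_a) - mean(changes_b)) / math.sqrt(pooled_var)
    if small_sample_correction:
        g *= 1 - 3 / (4 * (n1 + n2) - 9)
    return g


def interpret_effect_size(g: float) -> str:
    """Thresholds from the text: >= 0.2 small, >= 0.5 medium, >= 0.8 large."""
    if g >= 0.8:
        return "large"
    if g >= 0.5:
        return "medium"
    if g >= 0.2:
        return "small"
    return "below small"  # label not given in the text (hypothetical)


# Made-up discrete change scores for two small groups.
treated = [-1, -1, 0, 0, -1, 0]
control = [0, 0, 0, 1, 0, -1]
g = hedges_g(treated, control)
print(round(g, 3), interpret_effect_size(g))
```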

Interreader reliability was assessed by evaluating the degree of reader agreement for (1) MRI and radiographic erosion status (presence/absence) by each of the 4 joint surfaces at baseline and week 104; and (2) week 104 change (< 0, = 0, > 0) from baseline for radiographic mNY criteria (Y/N) and for week 104 change in mNY score ≥ 1 grade in ≥ 1 SIJ. The first analysis consisted of all 3 readers agreeing or only 2 of 3 agreeing on the presence or absence of MRI (or radiographic) erosions in each of the 4 joint surfaces at baseline and week 104. For the second, week 104 change from baseline for radiographic mNY criteria (yes/no) classified patients based on change in mNY (yes/no) score from baseline to week 104 (patients assigned to a change category [< 0, = 0, > 0] if ≥ 2 of 3 readers agreed, or, if not, to “none of the 3 readers agree” category). Change categories (< 0, = 0, > 0) were similarly assigned to each patient for change in mNY of ≥ 1 grade in ≥ 1 SIJ, with shift from 0 to 1 or 1 to 0 considered as no change.

Reliability of reader agreement was measured using Fleiss κ statistic, which analyzes whether all 3 readers agree. Therefore, the Fleiss κ statistic is a more stringent criterion than the criterion of ≥ 2 of 3 readers agreeing that was used for all other categorical analyses. Strength of agreement was interpreted as follows: 0-0.2, slight; ≥ 0.21, fair; ≥ 0.41, moderate; and ≥ 0.61, substantial.16
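For readers unfamiliar with the statistic, a minimal Python sketch of the standard Fleiss kappa calculation for 3 readers follows. The input layout (one row per patient, giving the number of readers choosing each category) and the example data are assumptions for illustration only.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss kappa for a fixed number of readers per subject.

    counts[i][j] is the number of readers assigning subject i to category j;
    every row must sum to the same number of readers.
    """
    n_subjects = len(counts)
    n_readers = sum(counts[0])
    n_categories = len(counts[0])
    # Overall proportion of ratings falling in each category.
    p_cat = [sum(row[j] for row in counts) / (n_subjects * n_readers)
             for j in range(n_categories)]
    # Per-subject proportion of agreeing reader pairs.
    p_subj = [(sum(c * c for c in row) - n_readers)
              / (n_readers * (n_readers - 1)) for row in counts]
    p_bar = sum(p_subj) / n_subjects
    p_exp = sum(p * p for p in p_cat)
    return (p_bar - p_exp) / (1 - p_exp)


# Columns: [readers scoring erosion present, readers scoring erosion absent].
ratings = [[3, 0], [2, 1], [1, 2], [0, 3]]
print(round(fleiss_kappa(ratings), 3))
```

In this made-up example, every patient would be classifiable under the ≥ 2 of 3 rule, yet the kappa value falls in the "fair" band because only half of the patients have unanimous readings.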

RESULTS

Patient characteristics. EMBARK included 225 randomized patients. Both MRI and radiographs were available for 156 (69.3%), 154 (68.4%), and 153 (68%) patients at baseline, week 104, and both timepoints, respectively. DESIR enrolled 708 patients, of whom 259 (36.6%) had a 2-year MRI planned per protocol. Of these, 68 (26.3%) met the ASAS criteria for axSpA, did not receive biologic treatment during the first 2 years of follow-up, and had both baseline and week 104 MRIs and radiographs.

In both EMBARK and DESIR, the mean age was 32 years; ~60% were men (Table). Mean symptom duration was significantly longer in EMBARK than DESIR (2.4 vs 1.7 yrs; P < 0.001), and mean baseline function (Bath Ankylosing Spondylitis Functional Index [BASFI]) was significantly worse in EMBARK (4.0 vs 2.2; P < 0.001). Baseline disease activity markers of Bath Ankylosing Spondylitis Disease Activity Index (BASDAI), Ankylosing Spondylitis Disease Activity Score based on C-reactive protein (ASDAS-CRP), and SPARCC MRI SIJ inflammation were significantly higher in EMBARK, but no significant difference in SPARCC MRI erosion scores was observed. DESIR included a higher proportion of patients who were HLA-B27–positive and smokers. Baseline radiographic assessments indicated significantly more radiographic findings in DESIR.

Table.

Baseline demographics and disease characteristics (patients with baseline and/or week 104 MRI and radiographic assessments).

MRI–radiograph concordance at patient level: baseline and week 104. At baseline, erosions were present on both MRI and radiographs in 50/224 (22.3%) patients and absent on both in 112/224 (50%); κ agreement was moderate (0.42, 95% CI 0.30-0.53; P < 0.001; Figure 1A). For discordant cases, MRI erosions in the absence of radiographic erosions (48/224 [21.4%] patients) were more common than radiographic erosions in the absence of MRI erosions (14/224 [6.3%]). Values at week 104 were similar to baseline (Figure 1B).

Figure 1.

Proportion of patients with/without (A) erosions on MRI and erosions on radiographs at baseline, (B) erosions on MRI and erosions on radiographs at week 104, (C) erosions on MRI and sacroiliitis on radiographs at baseline, and (D) erosions on MRI and sacroiliitis on radiographs at week 104. Baseline, n = 224 and week 104, n = 222. Sacroiliitis refers to inflammation of sacroiliac joint. Includes patients with baseline and/or week 104 MRI and radiographic assessments. P value based on McNemar test assessing whether the proportion of pairs with erosions detected on MRI (present on both MRI and radiographs + present on MRI but not radiographs) is the same as the proportion of pairs with erosions detected on radiographs (present on both MRI and radiographs + present on radiographs but not MRI). Abs: absent; MRI: magnetic resonance imaging; Pres: present; Rad: radiograph.

Erosions on MRI and sacroiliitis on radiographs were both present at baseline in 60/224 (26.8%) patients and were both absent in 97/224 (43.3%); κ agreement was fair (0.39; Figure 1C). MRI erosions were present without sacroiliitis on radiographs in 38/224 (17%) patients and sacroiliitis was present without MRI erosions in 29/224 (12.9%) patients (κ 0.39, 95% CI 0.26-0.51; P = 0.27). Values at week 104 were similar (Figure 1D).

Change from baseline to week 104. Between baseline and week 104, decrease in erosions was present on both MRI and radiographs in 4/221 (1.8%) patients and absent on both in 162/221 (73.3%; Figure 2A). For discordant cases, decrease in MRI erosions in the absence of decrease in radiographic erosions (49/221 [22.2%] patients) was more common than decrease in radiographic erosions in the absence of decrease in MRI erosions (6/221 [2.7%]). Decreases in both erosions on MRI and sacroiliitis grade on radiographs were similar to those shown in Figure 2A (Figure 2B).

Figure 2.

Proportion of patients with/without (A) erosion decrease on MRI and erosion decrease on radiographs, (B) erosion decrease on MRI and sacroiliitis grade decrease on radiographs, (C) erosion increase on MRI and erosion increase on radiographs, and (D) erosion increase on MRI and sacroiliitis grade increase on radiographs, from baseline to week 104 (n = 221). Includes patients with baseline and/or week 104 MRI and radiographic assessments. The small κ statistics do not correspond with the high proportion of agreement, which may be due to the small sample size in the “Pres MRI, Pres Rad” category. Abs: absent; MRI: magnetic resonance imaging; Pres: present; Rad: radiograph.

For discordant cases, increase in MRI erosions in the absence of increase in radiographic erosions (15/221 [6.8%] patients) was similar to increase in radiographic erosions in the absence of increase in MRI erosions (17/221 [7.7%]). Values for increases in erosions on MRI and in sacroiliitis grade on radiographs were similar to those shown in Figure 2C (Figure 2D).

MRI–radiograph concordance at joint level: baseline and week 104. At baseline, erosions were present on MRI and radiographs of the left ilium for 28/224 (12.5%) patients and of the right ilium for 25/224 (11.2%), and were absent on MRI and radiographs of the left ilium in 132/224 (58.9%) patients and of the right ilium in 146/224 (65.2%) patients; κ agreement was fair (0.29, 95% CI 0.17-0.42 and 0.35, 95% CI 0.22-0.48, respectively; P < 0.001 for both; Figure 3A). For discordant cases, MRI erosions in the absence of radiographic erosions of the left and right ilium (50/224 [22.3%] and 42/224 [18.8%] patients, respectively) were more common than radiographic erosions in the absence of MRI erosions of the left and right ilium (14/224 [6.3%] and 11/224 [4.9%] patients, respectively). Week 104 results were similar to baseline results (Figure 3B).

Figure 3.

Proportion of patients with/without (A) erosions on MRI and erosions on radiographs of the left or right ilium at baseline and (B) at week 104, and (C) erosions on MRI and erosions on radiographs of the left or right sacrum at baseline and (D) at week 104. Baseline, n = 224 and week 104, n = 222. Includes patients with baseline and/or week 104 MRI and radiographic assessments. P value based on McNemar test assessing whether the proportion of pairs with erosions detected on MRI (present on both MRI and radiographs + present on MRI but not radiographs) is the same as the proportion of pairs with erosions detected on radiographs (present on both MRI and radiographs + present on radiographs but not MRI). Comparisons of MRI and radiographic change in erosions were performed at the individual joint level. Abs: absent; MRI: magnetic resonance imaging; Pres: present; Rad: radiograph.

For the left and right sacrum, the number of discordant cases was relatively small and differences not significant (Figures 3C,D).

Change from baseline to week 104. In most cases (83.3-97.3%), neither increases nor decreases in erosions were detected on MRI or radiographs at the level of the individual joint surface (Figures 4A-D). For discordant cases, decrease in MRI erosions in the absence of decrease in radiographic erosions of the left and right ilium (31/221 [14%] and 30/221 [13.6%] patients, respectively) was more common than decrease in radiographic erosions in the absence of decrease in MRI erosions (5/221 [2.3%] and 3/221 [1.4%] patients, respectively; P < 0.001 for both comparisons; Figure 4A). For the right sacrum, discordance showed more frequent decrease in MRI erosions in the absence of decrease in radiographic erosions (8/221 [3.6%] patients) than decrease in radiographic erosions in the absence of decrease in MRI erosions (1/221 [0.5%] patients; P = 0.02); the values in the left sacrum were similar but did not meet statistical significance (Figure 4B). Discordance in erosion increase was not significant in the ilium or sacrum, and the number of cases was low (Figures 4C,D).

Figure 4.

Proportion of patients with (A) erosion decrease on MRI and radiographs of the left and right ilium and (B) left and right sacrum, and (C) erosion increase on MRI and radiographs of the left and right ilium and (D) left and right sacrum, from baseline to week 104 (n = 221). Includes patients with baseline and/or week 104 MRI and radiographic assessments. P value based on McNemar test assessing whether the proportion of pairs with erosions detected on MRI (present on both MRI and radiographs + present on MRI but not radiographs) is the same as the proportion of pairs with erosions detected on radiographs (present on both MRI and radiographs + present on radiographs but not MRI). Abs: absent; MRI: magnetic resonance imaging; Pres: present; Rad: radiograph.

Treatment-discriminant capacity analysis. The discriminant analyses included patients in EMBARK and DESIR with both baseline and week 104 MRI and radiographic endpoints. In the unadjusted analysis, the largest standardized differences between ETN and control cohorts at week 104 were in change in SPARCC MRI erosion discrete score (increase or decrease or no change in erosion per agreement of ≥ 2/3 readers summed for all 4 joint surfaces [range, −4 to +4]), change in SPARCC MRI erosion average score for all 3 readers (range, −40 to +40), and meeting the mNY criteria, with Hedges G effect sizes of 0.40, 0.40, and 0.40, respectively (Figure 5). There was a greater decrease from baseline in degree of MRI erosion (based on mean change) in EMBARK (−0.80) compared with DESIR (−0.07; results not shown), with the large difference between EMBARK and DESIR means contributing to the large Hedges G effect size. Within EMBARK, effect sizes were larger (demonstrating greater sensitivity to change) for MRI vs radiographic endpoints (Supplementary Table S2, available with the online version of this article).

Figure 5.

Unadjusted mean change from baseline to week 104 and corresponding effect size estimates in MRI and radiographic endpoints (a) in EMBARK and DESIR. (a) Includes patients who have nonmissing reader scores and have both a baseline and week 104 MRI erosion as well as baseline and week 104 radiograph mNY score. Results from 1-way ANOVA of endpoint with study as a factor. (b) Hedges G effect size was calculated as follows: absolute (mean change EMBARK – mean change DESIR) / √RMSE, where √RMSE is the pooled weighted SD of both studies. Effect sizes were interpreted as follows: ≥ 0.2, small; ≥ 0.5, medium; and ≥ 0.8, large.15 (c) Patients were assigned a value of −1 (change < 0), 0 (change = 0), or 1 (change > 0) based on ≥ 2/3 readers agreeing with the change category/value. If none of the 3 readers agreed, then the patient was assigned to change = 0 value. (d) Change from baseline obtained from the average of the 3 readers’ change scores (continuous data). (e) Change from baseline obtained from the average of either 2 readers’ (or 1 reader and 1 adjudicator) change scores. (f) Mean of the 3 readers’ change scores for ankylosis, sclerosis, radiographic joint narrowing, radiographic joint widening, erosion, and sacroiliitis (continuous data) calculated for each timepoint and then change from baseline obtained. (g) Erosion calculated as ≥ 2/3 readers agreeing on presence (= 1) or absence (= 0) in each of the 4 joint surfaces (left ilium, right ilium, left sacrum, right sacrum). All 4 joint surfaces were then summed by timepoint and change from baseline then calculated. (h) Change: −1 = (change < 0) = improvement; 0 = (change = 0) = no change; 1 = (change > 0) = worsening. Patients were assigned a value of −1, 0, or 1 based on ≥ 2/3 readers agreeing with the change category/value. If none of the 3 readers agreed, then the patient was assigned to change = 0 value. For change in mNY score ≥ 1 grade in ≥ 1 SIJ, shift from 0 to 1 or 1 to 0 from baseline to week 104 was considered as no change.
Change in radiographic sacroiliitis (ie, mNY), average score (range for change: −8 to 8). For all endpoints that analyzed presence/absence or change categories < 0, = 0, > 0 based on agreement of 2/3 readers, patients were included in the analysis if ≥ 2 readers agreed with each other, regardless of whether third reader had a missing score. For change from baseline endpoints, if all 3 readers had nonmissing reader score but none of the 3 readers agreed, the patient was assigned a value of 0. DESIR: Devenir des Spondylarthropathies Indifférenciées Récentes; EMBARK: Effect of Etanercept on Symptoms and Objective Inflammation in Nonradiographic axSpA; axSpA: axial spondyloarthritis; mNY: modified New York; MRI: magnetic resonance imaging; MSE: mean square error; RMSE: residual mean square error; SIJ: sacroiliac joint; SPARCC: Spondyloarthritis Research Consortium of Canada.

Results of the adjusted analysis were comparable in strength and direction, albeit slightly larger than the unadjusted analysis (Supplementary Figure S1, available with the online version of this article).

Interreader reliability. Interreader reliability regarding the presence or absence of MRI and radiographic erosion was mostly fair to moderate at both baseline and week 104 (Supplementary Tables S3 and S4, available with the online version of this article). Interreader reliability for change in erosion was lower than for absolute presence/absence of erosion for both MRI and radiographs (Supplementary Tables S5 and S6).

DISCUSSION

This analysis of the comparison between MRI and radiographs of the SIJ in patients with recent-onset axSpA demonstrates that a decrease in erosions is noted more frequently on MRI than on radiographs and occurs more frequently than an increase in MRI erosions. Additionally, a decrease in erosions on MRI is noted more frequently than a decrease in sacroiliitis on radiographs. These erosion results are similar to those of a study of patients with chronic low back pain suggestive of SpA, which reported that MRI had greater sensitivity for detecting SIJ erosions than radiography7; approximately half of the 110 patients were eventually diagnosed with axSpA (either radiographic or nonradiographic).7 The current study focused on patients with axSpA who fulfilled the ASAS criteria and had recent-onset disease but did not meet mNY criteria for radiographic sacroiliitis.

There are several possible explanations for these results. MRI detects smaller erosions than radiography does in patients with rheumatoid arthritis (RA),17 so we could reasonably expect MRI to detect more erosions and be more sensitive to erosion change compared with radiography in patients with axSpA. The 2 modalities differ and tell us different things. The size of erosions could have an effect: large erosions (detectable with both modalities) may be associated with extensive cartilage destruction and a limited ability for articular surface repair, whereas very small erosions (detectable on MRI only) may be associated with a partially intact articular surface capable of repair. Small erosions in RA can undergo repair, and there may be a size threshold that limits this evolutionary pathway.18-21 Further, the scales for radiographic and MRI erosion scores differ, and the responsiveness of the radiographic scale is likely more limited. However, the increased sensitivity of MRI over radiography was also evident at the level of the individual joint surface.

The treatment-discriminant capacity (unadjusted and adjusted analyses) found effect sizes generally similar for the 2 modalities based on Hedges G values, although for erosions, Hedges G was larger for MRI. The treatment-discriminant capacity analyses demonstrated a greater decrease from baseline in the degree of MRI erosion in EMBARK vs DESIR, and some larger treatment effect sizes and greater sensitivity to change in EMBARK for MRI vs radiographic endpoints; these data have not been reported previously. This result is consistent with the decrease between baseline and week 12 in the ETN but not the placebo group of EMBARK.22

Among all variables included in the adjusted analysis, only baseline SPARCC MRI inflammation is a predictor of MRI erosion,13 and such inflammation was much higher in EMBARK than DESIR at baseline. This, and other differences in disease or patient characteristics, may reflect different study entry criteria. In the unadjusted analysis, radiographic sacroiliitis demonstrated comparable discrimination to MRI erosion (effect size: 0.40 each), but this primarily reflected worsening of radiographic features in DESIR, with little change in EMBARK. In contrast, MRI erosion improved in EMBARK and changed little in DESIR. These observations could at least partially reflect the higher radiographic sacroiliitis score at baseline for cases in DESIR, as higher baseline damage might predict greater future damage. However, radiographic sacroiliitis comprises a composite of destructive (erosion) and reparative (sclerosis) features, and interpretation is complicated by a requirement to evaluate individual features that may individually demonstrate little change and/or changes in opposite directions over time. By contrast, MRI erosion is a unidimensional construct reported to be more sensitive and specific for erosion compared with radiography using CT as gold standard.7 Reliable detection of change and discrimination of active drug from placebo is demonstrated in the 12- to 16-week time frame of 3 randomized placebo-controlled trials of biologic and synthetic targeted therapies in both nonradiographic and radiographic axSpA.22-24 Since acceptable targets for scoring proficiency of MRI structural lesions are consistently attained using real-time iterative calibration technology, it is reasonable to expect that discrimination capacity for MRI erosion may be improved.25

This study had some limitations. It was not a prospective randomized, controlled trial; rather, we combined MRIs and radiographs from 2 studies where populations had some differences in their baseline characteristics, which could potentially bias results. Effect size calculations were based on underlying normal distributions and may be affected by the limited range of scores (−1, 0, 1) for discrete change score endpoints. Interreader reliability was limited, and therefore strong conclusions cannot be drawn from the results. The radiographic scores were limited according to inclusion criteria; patients with an mNY score that was too high were excluded from both EMBARK and DESIR. These results may therefore not be generalizable to more advanced axSpA. Additionally, the number of patients with MRI or radiographic erosions in the sacrum was relatively low, making it difficult to draw conclusions, and there is potential for measurement error. This corresponds to published data in early axSpA reporting significantly greater involvement in the ilium than in the sacrum.26

In summary, erosions in the SIJ and erosion change were observed differently on MRI vs radiographs in patients with recent-onset axSpA. MRI was more sensitive than radiography in detecting erosions and change over time, as well as in treatment-discriminant capacity (for both unadjusted and adjusted results). The improvements in articular erosion seen on MRI may be an indicator of treatment response, which warrants further research.

ACKNOWLEDGMENT

The authors thank all patients who participated in this study, as well as the investigators and medical staff at all of the participating centers. Medical writing support was provided by Jennica Lewis, PharmD, CMPP, Iain McDonald, PhD, and David Sunter, PhD, of Engage Scientific Solutions, and was funded by Pfizer.

Footnotes

The EMBARK study was funded by Pfizer. The DESIR (Devenir des Spondylarthropathies Indifférenciées Récentes) cohort is supported by unrestricted grants from the French Society of Rheumatology and Pfizer.

WPM has received grant/research support from AbbVie, Novartis, Pfizer, and UCB; consulted for/received honoraria from AbbVie, BMS, Boehringer, Celgene, Galapagos, Janssen, Lilly, Merck, Novartis, Pfizer, and UCB; and is the chief medical officer of CARE Arthritis Ltd. PC has consulted for AbbVie, BMS, Celgene, Janssen, Merck, Novartis, Pfizer, Roche, UCB, and Lilly. MdH received consultancy fees for UCB and is owner of MdH Research. RGL has consulted for AbbVie, Parexel, Pfizer, and UCB. RL has received grant/research support from AbbVie, Amgen, Centocor, Novartis, Pfizer, Roche, and UCB; consulted for AbbVie, Ablynx, Amgen, AstraZeneca, BMS, Celgene, Janssen, Galapagos, GSK, Merck, Novartis, Novo Nordisk, Pfizer, Roche, Schering-Plough, TiGenix, and UCB; and is director of Rheumatology Consultancy BV. AM has received grant/research support from AbbVie, Pfizer, and UCB, and consulted for AbbVie, Janssen, Merck, Novartis, Pfizer, Sanofi, and UCB. DvdH has consulted for AbbVie, Bayer, BMS, Galapagos, Gilead, GSK, Janssen, Lilly, Novartis, Pfizer, Takeda, and UCB; and is director of Imaging Rheumatology BV. JFB, HJ, and RP were employees of Pfizer at the time the analysis was conducted. AS was an employee of Syneos Health and was contracted by Pfizer
