Table 3 summarizes the segmentation performance of the MPUnet, KIQ, and 2D U-Net methods on all three study cohorts (see Tables 1 and 2). When trained on the same number of samples, the MPUnet achieved significantly higher mean macro Dice scores (mean across compartments and patients) on the OAI dataset than KIQ ( vs. , ), the 2D U-Net ( vs. , ), and the single-view MPUnet ( vs. , ). The MPUnet also performed significantly better on the CCBR dataset than KIQ ( vs. , ), the 2D U-Net ( vs. , ), and the single-view MPUnet ( vs. , ). On the PROOF dataset, the MPUnet performed significantly better than the 2D U-Net ( vs. , ) and the single-view MPUnet ( vs. , ), and did not differ significantly from KIQ ( vs. , ).
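The macro Dice score used above can be illustrated with a minimal sketch. It assumes each segmentation is a flat sequence of integer compartment labels, and it averages per-compartment Dice within each patient before averaging across patients. The function names, the handling of compartments absent from both masks, and the toy labels are illustrative assumptions, not the evaluation code used in the study.

```python
from statistics import mean


def dice(pred, truth, label):
    """Dice overlap for one compartment label between two flat label sequences."""
    pred_mask = [p == label for p in pred]
    true_mask = [t == label for t in truth]
    intersection = sum(p and t for p, t in zip(pred_mask, true_mask))
    total = sum(pred_mask) + sum(true_mask)
    # Assumption: a compartment absent from both masks counts as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def macro_dice(pred, truth, labels):
    """Unweighted mean of the per-compartment Dice scores for one patient."""
    return mean(dice(pred, truth, k) for k in labels)


def cohort_macro_dice(cases, labels):
    """Mean macro Dice over a cohort given (prediction, ground truth) pairs."""
    return mean(macro_dice(p, t, labels) for p, t in cases)


# Hypothetical toy example: label 0 is background, labels 1 and 2 are compartments.
truth = [0, 1, 1, 2, 2, 2]
pred = [0, 1, 2, 2, 2, 2]
score = macro_dice(pred, truth, labels=[1, 2])
print(round(score, 3))
```

Because every compartment carries equal weight, a small compartment such as a meniscus influences the macro score as much as a large one, which is why per-compartment results are reported separately as well.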
Table 3 also details the performance of all methods on each individual compartment across the three datasets and reports the minimum Dice score observed for each compartment across all subjects in the cohort. Across a total of 14 segmentation compartments (tibia bone excluded, as it is easily segmented by all methods), the MPUnet performed significantly better than the KIQ model on 11 compartments (TMC, TLC, FMC, FLC, PC, MM, and LM on OAI; TMC and FMC on CCBR; FLC and PC on PROOF; for all) and showed no significant difference on the remaining 3 (TMC, , TLC, , and FMC, , all on PROOF). The MPUnet performed significantly better than the Panfilov 2D U-Net on 10 compartments (FMC, PC, MM, and LM on OAI; FMC on CCBR; TMC, TLC, FMC, FLC, and PC on PROOF; for all) and showed no significant difference on the remaining 4 (TMC, , TLC, , FLC, on OAI; TMC, on CCBR). The MPUnet performed significantly better than its single-view counterpart on 12 compartments (TMC, FMC, FLC, PC, MM, and LM on OAI; TMC and FMC on CCBR; TMC, TLC, FMC, and FLC on PROOF; for all) and showed no significant difference on the remaining 2 (TLC, on OAI; PC, on PROOF). None of the other models performed significantly better than the MPUnet on any compartment.
Table 4 details the performance of each model on the CCBR, OAI, and PROOF datasets grouped by KL grade assessments of each scan. Figure 3 shows box-plot Dice score distributions for each compartment of the CCBR dataset as segmented by the MPUnet, KIQ, and 2D U-Net models similarly grouped by KL grades. Box-plot figures for the OAI and PROOF datasets are shown in Figs. S2 and S3 in the Supplemental Material.
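The KL-grouped summaries described above can be sketched as follows. The example groups per-subject Dice scores by KL grade and reports the group size, mean, standard deviation, and minimum. The function name and toy records are hypothetical, and the use of the population standard deviation (which is 0 for a single-subject group) is an assumption rather than a confirmed detail of the study's analysis.

```python
from collections import defaultdict
from statistics import mean, pstdev


def summarize_by_kl(records):
    """Summarize per-subject Dice scores by KL grade.

    records: iterable of (kl_grade, dice_score) pairs, one pair per subject.
    Returns {kl_grade: {"n", "mean", "std", "min"}} ordered by grade.
    """
    groups = defaultdict(list)
    for kl_grade, score in records:
        groups[kl_grade].append(score)
    return {
        kl_grade: {
            "n": len(scores),
            "mean": mean(scores),
            # Assumption: population standard deviation, defined even for n = 1.
            "std": pstdev(scores),
            "min": min(scores),
        }
        for kl_grade, scores in sorted(groups.items())
    }


# Hypothetical per-subject scores, not values from the study.
records = [(0, 0.86), (0, 0.82), (1, 0.80), (3, 0.70), (3, 0.78), (3, 0.74)]
summary = summarize_by_kl(records)
for kl_grade, stats in summary.items():
    print(kl_grade, stats["n"], round(stats["mean"], 2), round(stats["std"], 2), stats["min"])
```

Grades with no subjects simply do not appear in the result, which corresponds to the empty cells in a per-grade table.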
TABLE 4. Single-Cohort Experiments, KL Groups: Segmentation Performance Across Subjects for the MPUnet, Single-View MPUnet, 2D U-Net, and KIQ Methods on the OAI, CCBR, and PROOF Cohorts by KL Subgroup

| Dataset | Method | Eval. Type | Eval. Images | KL 0 | KL 1 | KL 2 | KL 3 | KL 4 |
|---|---|---|---|---|---|---|---|---|
| CCBR | KIQ | Fixed split | 50/24/13/22/0 | 0.84 ± 0.03 | 0.82 ± 0.03 | 0.78 ± 0.04 | 0.75 ± 0.08 | - |
| | | | | 0.73 | 0.72 | 0.68 | 0.57 | |
| | | | | P < 0.05 | P < 0.05 | P < 0.05 | P < 0.05 | |
| | 2D U-Net | Fixed split | 50/24/13/22/0 | 0.84 ± 0.03 | 0.83 ± 0.03 | 0.80 ± 0.04 | 0.76 ± 0.06 | - |
| | | | | 0.77 | 0.75 | 0.72 | 0.64 | |
| | | | | P < 0.05 | P = 0.03 | P = 0.74 | P < 0.05 | |
| | MP (V = 1) | Fixed split | 50/24/13/22/0 | 0.84 ± 0.02 | 0.83 ± 0.03 | 0.78 ± 0.04 | 0.73 ± 0.06 | - |
| | | | | 0.79 | 0.77 | 0.68 | 0.59 | |
| | | | | P < 0.05 | P < 0.05 | P < 0.05 | P < 0.05 | |
| | MP (V = 6) | Fixed split | 50/24/13/22/0 | 0.85 ± 0.03 | 0.84 ± 0.03 | 0.81 ± 0.02 | 0.78 ± 0.06 | - |
| | | | | 0.80 | 0.77 | 0.77 | 0.69 | |
| OAI | KIQ | Fixed split | 0/2/10/30/2 | - | 0.88 ± 0.03 | 0.84 ± 0.04 | 0.83 ± 0.04 | 0.83 ± 0.02 |
| | | | | | 0.86 | 0.76 | 0.72 | 0.82 |
| | | | | | P = NA | P = 0.23 | P < 0.05 | P = NA |
| | 2D U-Net | Fixed split | 0/2/10/30/2 | - | 0.87 ± 0.03 | 0.85 ± 0.04 | 0.85 ± 0.03 | 0.86 ± 0.02 |
| | | | | | 0.85 | 0.78 | 0.77 | 0.85 |
| | | | | | P = NA | P = 0.16 | P < 0.05 | P = NA |
| | MP (V = 1) | Fixed split | 0/2/10/30/2 | - | 0.87 ± 0.02 | 0.84 ± 0.03 | 0.85 ± 0.03 | 0.86 ± 0.01 |
| | | | | | 0.85 | 0.77 | 0.76 | 0.85 |
| | | | | | P = NA | P < 0.05 | P < 0.05 | P = NA |
| | MP (V = 6) | Fixed split | 0/2/10/30/2 | - | 0.88 ± 0.03 | 0.86 ± 0.03 | 0.86 ± 0.04 | 0.87 ± 0.01 |
| | | | | | 0.86 | 0.78 | 0.75 | 0.87 |
| PROOF | KIQ | 25-CV | 12/11/1/1/0 | 0.76 ± 0.06 | 0.77 ± 0.09 | 0.81 ± 0.00 | 0.80 ± 0.00 | - |
| | | | | 0.63 | 0.52 | 0.81 | 0.80 | |
| | | | | P = 0.08 | P = 0.41 | P = NA | P = NA | |
| | 2D U-Net^a | 25-CV | 12/11/1/1/0 | | | | | |