IJERPH, Vol. 19, Pages 16080: Using Tree-Based Machine Learning for Health Studies: Literature Review and Case Series

Figure 1. An illustrative classification tree diagram. Y indicates a case and N indicates a non-case.

Figure 2. Visualization of the BART variable selection algorithm. The vertical lines are the threshold levels determined from the “null” distributions of the variable inclusion proportions, computed from 100 permuted data sets. Variable inclusion proportions from the original (unpermuted) data that exceed this threshold are displayed as solid dots. Open dots correspond to variables that are not selected.

Figure 3. Distributions of the inverse probability of treatment weights estimated by BART, random forest, and XGBoost.

Figure 4. A comparison of the distributions of values for total hip bone mineral density and total spine bone mineral density among the imputed values and among the complete cases.

Table 1. Variables selected by each method, and the 5-fold cross-validated area under the receiver operating characteristic curve (AUC) for each model fitted with its selected variables.

| Methods | Selected Variables | AUC |
|---|---|---|
| BART | Charlson comorbidity score, gender, married, histology, year of diagnosis | 0.85 |
| XGBoost | Age, year of diagnosis | 0.72 |
| RF | Charlson comorbidity score, histology | 0.74 |

Table 2. Causal inferences about average treatment effects of three surgical approaches on postoperative respiratory complications based on the relative risk, using the SEER-Medicare lung cancer data. The 95% uncertainty intervals are displayed in parentheses. All 14 potential confounders were used. RAS: robotic-assisted surgery; VATS: video-assisted thoracic surgery; OT: open thoracotomy.

| Methods | RAS vs. OT | RAS vs. VATS | OT vs. VATS |
|---|---|---|---|
| BART | 0.94 (0.72, 1.16) | 1.09 (0.84, 1.34) | 1.12 (0.87, 1.37) |
| XGBoost | 0.91 (0.64, 1.13) | 1.04 (0.79, 1.28) | 1.08 (0.84, 1.33) |
| RF | 0.90 (0.63, 1.14) | 1.03 (0.78, 1.29) | 1.06 (0.82, 1.35) |
