Prediction of Treatment Effect of SLE-ITP Patients Based on Cost-Sensitive Neural Network and Variational Autoencoder

Immune thrombocytopenia (ITP) is an autoimmune disease characterized by both excessive destruction of platelets and insufficient platelet production by megakaryocytes, mediated by humoral and cellular immunity.1,2 ITP can be divided into 2 categories, primary and secondary,3,4 and secondary ITP usually has a clear etiology or underlying disease, such as drugs, infections, tumors, hypersplenism, or systemic lupus erythematosus (SLE).5–7 SLE combined with immune thrombocytopenia (SLE-ITP) is a specific manifestation of SLE. Studies have shown that SLE and ITP are closely related.8 The risk of ITP in SLE patients is several times higher than in the general population,9 approximately 20%–40% of patients with SLE also have ITP,10 and approximately 2%–5% of patients with ITP also have SLE.2,11

The mechanism of SLE-ITP is still not completely clear. Studies have shown that abnormally activated immune cells may be involved in the occurrence and development of ITP in SLE patients.12,13 Compared with ITP, patients with SLE-ITP have more significant immune dysfunction, and the course and treatment of the disease are more complex and highly heterogeneous, with clinical manifestations ranging from asymptomatic to fatal bleeding. Approximately 20%–25% of patients with SLE-ITP develop moderate to severe thrombocytopenia.14 In addition, SLE-ITP is directly related to SLE morbidity and mortality, and SLE patients with low platelet counts have an increased mortality rate.15 For SLE patients, it is therefore necessary to pay close attention to the occurrence of ITP and to carry out timely diagnosis and treatment to prevent the disease from worsening and developing complications.16

Currently available treatment response criteria for thrombocytopenia were developed for primary ITP.17,18 For the clinical judgment of SLE-ITP prognosis and efficacy, no test indicators have been identified as good predictors of effective ITP treatment, nor has a set of treatment criteria been developed to guide the treatment of patients with SLE-ITP. In recent years, artificial intelligence (AI) algorithms have been increasingly used in the diagnosis and treatment of rheumatic diseases. They can effectively handle high-dimensional data such as multiomics through techniques such as feature selection and dimensionality reduction to better capture underlying patterns within the data,19,20 and can also establish nonlinear models such as neural networks (NNs) to predict patient prognosis. For instance, a recent study used AI to distinguish systemic lupus erythematosus from primary Sjögren syndrome using gene expression and methylation data from 651 individuals, and analyzed the impact of the heterogeneity of these diseases on the performance of predictive models.21 However, the reliability of AI models depends heavily on the quality of the data, and the decision-making process of most models is difficult to interpret, which may conflict with interpretability requirements in biological research and clinical applications.22

In this study, we retrieved 324 samples of patients with lupus and thrombocytopenia from the electronic health record database, including routine blood tests, blood lymphocytes, biochemical, antibody, complement, and other laboratory test indicators. We used a Cox model based on the least absolute shrinkage and selection operator (LASSO) for univariate and multivariate analysis to mine factors related to the treatment effect, and designed a cost-sensitive NN to predict the treatment effect of patients. In addition, to address the problem of insufficient patient samples, we used variational autoencoders (VAEs) to generate additional samples for data enhancement. By predicting the curative effect in lupus patients with thrombocytopenia, the severity of thrombocytopenia can be judged in advance, and the best treatment strategy can then be planned to avoid ineffective treatment.

METHODS

Study Population

Data were retrieved from the SLE laboratory test table, which contained a total of 5586 records for 1824 patients from 2010 to 2020. Among them, 902 patients had 2 or more platelet count records. According to the diagnostic criteria for thrombocytopenia, 371 SLE patients had thrombocytopenia, accounting for 20.3% of the total number of SLE patients. On this basis, 324 patients with SLE-ITP were finally identified as research samples according to the guidelines for the treatment response of adult ITP, of which 155 were treatment effective (including complete response and effective) and 169 were treatment ineffective (Fig. 1). Of the 324 patients included in this study, the majority (62.5%) exhibited mild thrombocytopenia (Supplementary Fig. S1, https://links.lww.com/RHU/A662). All 324 SLE-ITP patients met the most recent (2019) diagnostic criteria of the American Society of Hematology and the International Working Group on ITP.23

FIGURE 1: Flowchart for Patient Selection.

Cox Proportional Risk Regression and Kaplan-Meier Estimation Based on LASSO

The LASSO algorithm is an optimization algorithm that compresses variables and reduces dimensionality by adding a penalty constraint to the least squares estimation such that some coefficients are estimated to be zero.24
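To illustrate why the penalty sets some coefficients exactly to zero, the following minimal sketch applies the soft-thresholding operator, which gives the LASSO solution in the special case of an orthonormal design matrix. The coefficient values and the penalty `lam` are made-up illustrative numbers, not values from this study.

```python
def soft_threshold(coef: float, lam: float) -> float:
    """Shrink a least-squares coefficient toward zero by the penalty lam."""
    if coef > lam:
        return coef - lam
    if coef < -lam:
        return coef + lam
    return 0.0  # coefficients no larger than the penalty are set to zero


# Hypothetical least-squares coefficients for four candidate features.
ols_coefs = {"age": 0.80, "C3": -0.35, "ALP": 0.05, "FIB": -0.02}
lam = 0.10  # hypothetical penalty; in practice chosen by cross-validation

lasso_coefs = {name: soft_threshold(b, lam) for name, b in ols_coefs.items()}
selected = [name for name, b in lasso_coefs.items() if b != 0.0]
print(lasso_coefs)
print(selected)
```

Features whose coefficient is shrunk to zero drop out of the model, which is how LASSO performs feature filtering.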

The Cox model,25 also known as the proportional hazards regression model, is a statistical method for multivariate analysis of survival data.26 The basic form of the model is as follows:

$$\ln\frac{h(t)}{h_0(t)} = \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p \tag{1}$$

In Equation 1, h(t) is the hazard at time t, h0(t) is the baseline hazard, β is the partial regression coefficient of an independent variable, and exp(β) is usually referred to as the hazard ratio (HR). In clinical practice, a variable with β > 0 and HR > 1 indicates that the risk increases as the value of the variable increases, and the variable is referred to as a bad prognostic factor; conversely, an independent variable with β < 0 and HR < 1 is referred to as a good prognostic factor.
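As a small worked illustration of the relationship between β and HR, the sketch below converts coefficients into hazard ratios and applies the sign rule described above. The coefficient values are hypothetical and are not estimates from this study.

```python
import math


def hazard_ratio(beta: float) -> float:
    """HR = exp(beta) for a one-unit increase in the covariate."""
    return math.exp(beta)


def prognostic_label(beta: float) -> str:
    """Apply the rule: HR > 1 is a bad factor, HR < 1 is a good factor."""
    hr = hazard_ratio(beta)
    if hr > 1.0:
        return "bad prognostic factor"
    if hr < 1.0:
        return "good prognostic factor"
    return "no effect"


# Hypothetical coefficients for two covariates.
for name, beta in {"covariate_a": 0.40, "covariate_b": -0.25}.items():
    print(name, round(hazard_ratio(beta), 3), prognostic_label(beta))
```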

The Kaplan-Meier survival estimator is a nonparametric estimation method that divides the survival time into several periods, estimates the survival probability within each period, and then concatenates the survival probabilities for each period to obtain the survival probability curve over the entire survival period.27
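The product-limit calculation described above can be sketched in a few lines. This is a generic illustration with made-up follow-up times and is not the R code used in the study; `events` uses 1 for an observed event and 0 for a censored observation.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up time of each patient
    events -- 1 if the event was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients who share the same time point.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            # Multiply by the conditional probability of passing this time.
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_removed
    return curve


# Hypothetical data: seven patients, three of them censored.
times = [2, 3, 3, 5, 8, 8, 12]
events = [1, 1, 0, 1, 1, 0, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

Censored patients contribute to the number at risk up to their censoring time but do not trigger a drop in the curve.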

In this study, we first used cross-validation to select the best regularization coefficient for LASSO and filtered the features accordingly. The selected features were then input into the Cox model, which took treatment time as the time variable, treatment effect (effective or ineffective) as the outcome variable, and 16 factors related to the patient's treatment effect, determined from LASSO regression and expert experience, as covariates. We also plotted Kaplan-Meier treatment effect probability curves to investigate the factors influencing treatment effect and treatment duration in SLE-ITP patients. The LASSO algorithm was implemented in Python 3.9, and the Cox model and Kaplan-Meier curves were run in R 4.2.2.

Cost-Sensitive Neural Networks

In this study, we used a back-propagation NN28 to predict treatment effect. It is a multilayer feedforward NN consisting of an input layer, a hidden layer, and an output layer, and the model building process consists of a forward propagation process and an error back-propagation process. The mathematical model can be represented by Equation 2:

$$\min E(\omega, b, v, c) = \min \frac{1}{2q}\sum_{p=1}^{q}\sum_{k=1}^{l}\left(y_k^{(p)} - \hat{y}_k^{(p)}\right)^2 \tag{2}$$

In Equation 2, ω and v are the weight matrices, b and c are the threshold vectors, yk(p) is the actual result of the kth neuron in the output layer for the pth sample, ŷk(p) is the predicted result of the kth neuron in the output layer for the pth sample, l is the total number of neurons in the output layer, and q is the total number of samples. E(ω, b, v, c) is the mean square error function; for a batch of samples it is defined as in Equation 3:

$$\mathrm{MSE} = \frac{1}{2 \times \mathrm{batchsize}}\sum_{k}\left(y_k - \mathrm{out}_{x_k}\right)^2 \tag{3}$$

In this study, cost-sensitive learning is introduced into the NN model29 to address the data imbalance problem, with a dynamic weighted mean square error loss function defined as follows:

$$\mathrm{dyn\_weight\_MSE}(y, \mathrm{out}_x) = \mathrm{Weight} \times \mathrm{MSE} \tag{4}$$

In Equation 4, y and outx are the true labels of the samples and the output values of the NN, respectively. Weight is determined dynamically by the class balance within each batch; it is calculated once per batch and defined as:

$$\mathrm{Weight} = (1 - \mathrm{true\_label}) \times \mathrm{zero\_weight} + \mathrm{true\_label} \times \mathrm{one\_weight} \tag{5}$$

In Equation 5, zero_weight = Np/N and one_weight = Nn/N, where N is the total number of patients in each batch, Np is the number of ineffective patients in each batch, and Nn is the number of effective patients in each batch.
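The loss in Equations 3 to 5 can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes a single scalar output per sample and applies the weight to each sample's squared error inside the sum, a detail the equations leave open. The labels and outputs are made-up values.

```python
def dynamic_weighted_mse(labels, outputs):
    """Weighted MSE where each class is weighted by the other class's share.

    labels  -- true labels in the batch (0 = effective, 1 = ineffective)
    outputs -- network outputs for the same samples
    """
    n = len(labels)
    n_ineffective = sum(labels)        # N_p in Equation 5
    n_effective = n - n_ineffective    # N_n in Equation 5
    zero_weight = n_ineffective / n    # weight for samples with label 0
    one_weight = n_effective / n       # weight for samples with label 1
    total = 0.0
    for y, out in zip(labels, outputs):
        weight = (1 - y) * zero_weight + y * one_weight
        total += weight * (y - out) ** 2
    return total / (2 * n)


# Hypothetical imbalanced batch: three ineffective, one effective.
labels = [1, 1, 1, 0]
outputs = [0.9, 0.8, 0.6, 0.3]
loss = dynamic_weighted_mse(labels, outputs)
print(round(loss, 4))
```

Because the weights are recomputed for every batch, the rarer class in that batch always receives the larger weight, so its errors are penalized more heavily.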

In this study, we built an NN model to predict patient treatment effects. The input layer had 16 neurons, one for each factor affecting the treatment effect; the hidden layer had 64 neurons; and the output layer had 2 neurons representing the treatment effect, with 0 for effective treatment and 1 for ineffective treatment. The learning rate was 0.0005 and the batch size was 16.

Variational Autoencoder

A variational autoencoder (VAE) uses NNs to fit both the inference model and the generative model, where the inference model is the encoding layer and the generative model is the decoding layer of the autoencoder (Supplementary Fig. S2, https://links.lww.com/RHU/A662). The goal of the VAE is to learn a parametric latent variable model by maximizing the marginal log-likelihood of the training data. The marginal log-likelihood of a single sample can be expressed as:

$$\log p(x_i) = D_{KL}\left(q_\theta(z \mid x_i) \,\|\, p(z \mid x_i)\right) + L_{VAE}(\theta; x_i) \tag{6}$$

In Equation 6, DKL denotes the Kullback-Leibler (KL) divergence between the approximate posterior distribution and the true posterior distribution, and LVAE denotes the evidence lower bound of the data point.

Unlike an ordinary autoencoder network, the hidden layer of the VAE network outputs 2 vectors, μ and σ, which parameterize the mean and variance (σ²) of the normal distribution followed by z, as follows:

$$q_\theta(z \mid x) = \mathcal{N}(z; \mu, \sigma^2 I) \tag{7}$$

A new hidden layer code z is then sampled from the posterior distribution qθ(z| x) and input into the generative network pφ(x| z) to generate a new x̂.

In this study, we designed a VAE model based on the above principles in Python 3.9, using a Bernoulli distribution to fit the original binary values and a Gaussian distribution to fit the continuous values, to generate relational data.
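The sampling step of Equation 7 and the KL term that appears when the evidence lower bound is written out can be sketched as follows. This is a generic illustration, not the model used in the study: it assumes a standard normal prior N(0, I) on z, which the text does not state, and the values of `mu` and `sigma` are made up rather than produced by a trained encoder.

```python
import math
import random


def sample_latent(mu, sigma, rng):
    """Reparameterization: z = mu + sigma * eps with eps ~ N(0, 1) per dimension."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


def kl_to_standard_normal(mu, sigma):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) )."""
    return 0.5 * sum(
        m * m + s * s - 1.0 - 2.0 * math.log(s) for m, s in zip(mu, sigma)
    )


rng = random.Random(0)   # fixed seed so the sketch is reproducible
mu = [0.5, -1.0]         # hypothetical encoder output: means
sigma = [1.0, 0.5]       # hypothetical encoder output: standard deviations

z = sample_latent(mu, sigma, rng)
kl = kl_to_standard_normal(mu, sigma)
print(z, round(kl, 4))
```

Writing the sample as a deterministic function of `mu`, `sigma`, and independent noise is what allows gradients to flow through the sampling step during training.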

Statistical Analysis

The study used R 4.2.2 and Python 3.9 for statistical analysis. Numerical data were expressed as mean and standard deviation, and categorical data were expressed as percentages. We used the Mann-Whitney U test to analyze nonnormally distributed data and the χ2 test to analyze categorical data. A p value of less than 0.05 was considered statistically significant.
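As a worked example of the χ2 test on a 2 × 2 table, the sketch below uses the sex-by-response counts from Table 1 (142 female and 13 male in the effective group, 148 female and 21 male in the ineffective group). The use of Yates' continuity correction is an assumption, since the text does not say whether a correction was applied; under that assumption the p value comes out close to the 0.3156 reported in Table 1.

```python
import math


def chi2_yates_2x2(a, b, c, d):
    """Chi-square test with Yates' correction for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * max(0.0, abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    statistic = numerator / denominator
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Rows: effective, ineffective; columns: female, male (counts from Table 1).
stat, p = chi2_yates_2x2(142, 13, 148, 21)
print(round(stat, 3), round(p, 4))
```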

RESULTS

Statistical Analysis of Clinical Characteristics

Demographic and clinical characteristics of the patients (Table 1) revealed that the average complement-3 (C3) level was 0.70 g/L, generally below the lower limit of normal, whereas the average immunoglobulin E (IgE) level was 245.58 U/mL, generally exceeding the upper limit of normal. In terms of gender distribution, 89.5% of the patients were female and the remaining 10.5% were male. The mean age of the patients was 41.07 years. We divided the 324 SLE-ITP patients with clear labels into effective and ineffective groups, and performed group statistics and significance tests on laboratory indicators such as lymphocytes, antibodies, blood coagulation, complements, and immunoglobulin. Patients in the ineffective group were significantly older on average than those in the effective group (43.69 ± 16.05 vs 38.21 ± 14.47, Table 1). In terms of biochemistry, H-cholesterol, apolipoprotein B (ApoB), and Na+ levels differed significantly between the 2 groups. C3 levels were significantly higher, and IgE levels lower, in the ineffective group than in the effective group. In terms of antibodies, the positive rate of antinucleosome antibodies was lower in the ineffective group than in the effective group.

TABLE 1 - Demographic and Clinical Characteristics of Patients

| Variable | Total (n = 324) | Effective (n = 155) | Not Effective (n = 169) | p value |
| --- | --- | --- | --- | --- |
| Age | 324 (41.07 ± 15.56) | 155 (38.21 ± 14.47) | 169 (43.69 ± 16.05) | 0.0023 |
| Female | 290 (290/324 = 89.5%) | 142 (142/155 = 91.6%) | 148 (148/169 = 87.6%) | 0.3156 |
| Male | 34 (34/324 = 10.5%) | 13 (13/155 = 8.4%) | 21 (21/169 = 12.4%) | 0.3156 |
| Blood lymphocytes | | | | |
| B-lymphocyte | 173 (0.095 ± 0.124) | 80 (0.09 ± 0.10) | 93 (0.10 ± 0.14) | 0.4952 |
| Biochemical | | | | |
| H-cholesterol | 311 (1.01 ± 0.40) | 148 (0.97 ± 0.43) | 163 (1.04 ± 0.37) | 0.0300 |
| ApoB | 311 (0.81 ± 0.29) | 148 (0.87 ± 0.31) | 163 (0.77 ± 0.25) | 0.0093 |
| Na | 318 (140.10 ± 4.91) | 152 (139.48 ± 3.84) | 166 (140.67 ± 5.66) | 0.0121 |
| ALT | 320 (38.60 ± 65.25) | 154 (41.91 ± 67.48) | 166 (35.53 ± 62.95) | 0.8913 |
| ALP | 320 (92.02 ± 91.82) | 154 (88.44 ± 91.43) | 166 (95.34 ± 92.05) | 0.1877 |
| Complement | | | | |
| C3 | 282 (0.70 ± 0.29) | 134 (0.67 ± 0.30) | 148 (0.73 ± 0.28) | 0.0416 |
| Immunoglobulin | | | | |
| IgA | 282 (2.57 ± 1.38) | 134 (2.66 ± 1.27) | 148 (2.49 ± 1.46) | 0.0934 |
| IgE | 282 (245.58 ± 382.61) | 134 (292.86 ± 422.32) | 148 (202.78 ± 337.04) | 0.0263 |
| Blood coagulation | | | | |
| FIB | 306 (2.88 ± 1.013) | 146 (2.89 ± 1.01) | 160 (2.88 ± 1.02) | 0.9345 |
| Antibodies | | | | |
| ACA-IgG | 11 (11/155 = 7.1%) | 5 (5/72 = 6.9%) | 6 (6/83 = 7.2%) | 1.0000 |
| ACA-IgM | 6 (6/155 = 3.9%) | 0 (0/72 = 0) | 6 (6/83 = 7.2%) | 0.0505 |
| AnuA | 74 (74/213 = 34.7%) | 45 (45/98 = 45.9%) | 29 (29/115 = 25.2%) | 0.0159 |
| AMA-M2 | 31 (31/224 = 13.8%) | 11 (11/103 = 10.7%) | 20 (20/121 = 16.5%) | 0.2080 |

Data are mean ± SD or n (%), where n is the total number of patients with valid data in each group. Statistical analysis was performed with the Mann-Whitney U test and the χ2 test.

ALP, alkaline phosphatase; FIB, fibrinogen; AnuA, antinucleosome antibody; AMA-M2, antimitochondrial antibody-M2.

Furthermore, we conducted a subgroup analysis to examine the significance of differences among patients classified into mild, moderate, and severe groups. The results indicated that there were no significant differences in the characteristics of the 2 groups in the severe patients. Among the moderate patients, only alkaline phosphatase showed a significant difference in characteristics. However, within the mild patient group, several characteristics, including age, ApoB, Na, and C3, exhibited significant differences (Supplementary Table S1, https://links.lww.com/RHU/A662). Overall, the factors influencing treatment outcomes vary depending on the severity of the disease. Moreover, correlation analysis of indicators showed that treatment ineffectiveness was positively correlated with age and C3 level, which indicated that the older the age and the higher the C3 level, the higher the possibility of SLE-ITP ineffectiveness (Supplementary Fig. S3, https://links.lww.com/RHU/A662).

Analysis of Treatment Effect Factors

Next, we performed a Cox regression analysis on the cohort of patients with SLE-ITP based on the treatment response criteria. Supplementary Figure S4, https://links.lww.com/RHU/A662, showed that the HRs of B-lymphocyte count, ApoB, H-cholesterol, immunoglobulin A (IgA), and fibrinogen were less than 1, indicating that these indicators were good factors for the curative effect of SLE-ITP. The HRs of gender, C3 level, anticardiolipin antibodies (ACA)–IgG, ACA-IgM, and AMA-M2 were greater than 1, indicating that these indicators were unfavorable factors for the curative effect of SLE-ITP.

To further investigate the effect of different levels of the above factors on the efficacy of SLE-ITP treatment, we stratified the indicators with significant differences and performed univariate analysis (Supplementary Table S2, https://links.lww.com/RHU/A662). According to the results of the stratified statistical test in Supplementary Figure S5, https://links.lww.com/RHU/A662, the indicators affecting SLE-ITP could be roughly divided into 4 categories: (1) B-lymphocyte count and alanine transaminase (ALT), which were bad factors whenever they were abnormal; (2) ApoB and IgA, for which lower values were associated with a worse treatment effect; (3) age, for which higher values were associated with a worse treatment effect; and (4) ACA antibodies, for which a positive result was associated with a worse treatment effect. The 2 Kaplan-Meier treatment effect curves for normal and abnormal B-lymphocyte counts (Supplementary Fig. S6A, https://links.lww.com/RHU/A662) showed that the curve for abnormal counts lay below the curve for normal counts, with a significant difference between the 2 curves, indicating that patients with abnormal B-lymphocyte counts usually have suboptimal treatment effects. In addition, the 2 Kaplan-Meier curves separated at the median of the reference range of ApoB levels (Supplementary Fig. S6B, https://links.lww.com/RHU/A662) showed that, when ApoB levels were below the median, the probability of effective treatment decreased significantly. Furthermore, we found a relationship between ACA and the therapeutic effect of SLE-ITP: all 6 ACA-IgM–positive patients in this study had an ineffective treatment response. The Kaplan-Meier treatment effect probability curves of antibodies with high SLE specificity (Supplementary Fig. S7A–D, https://links.lww.com/RHU/A662) showed that antinucleosome antibody, anti–ribosomal P protein antibody, and anti–dsDNA antibody differed significantly between the 2 groups (p < 0.05), whereas antimitochondrial antibody-M2 had no effect on the treatment effect of SLE-ITP.

Prediction of Treatment Effect

In this section, an NN model was built to predict the treatment effect of patients based on the above indicators. Numerical features were standardized; categorical features were label-encoded directly because each took only 2 values (Supplementary Table S3, https://links.lww.com/RHU/A662). After excluding samples with missing values in the above features, we randomly divided the original sample (119: 70 labeled 1 + 49 labeled 0) into a training set (95: 57 labeled 1 + 38 labeled 0) and a validation set (24: 13 labeled 1 + 11 labeled 0) in a ratio of 8:2.
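The preprocessing steps above can be sketched as follows. This is an illustration under stated assumptions rather than the study's code: z-score standardization with the population standard deviation is assumed, the split is a plain (unstratified) random split, and the seed and the example ages are hypothetical.

```python
import random
import statistics


def standardize(values):
    """Z-score standardization: subtract the mean, divide by the standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def random_split(n_samples, train_fraction, seed):
    """Shuffle sample indices and split them into training and validation parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


ages = [38.0, 43.0, 51.0, 29.0]            # hypothetical numerical feature
scaled_ages = standardize(ages)

train_idx, valid_idx = random_split(119, 0.8, seed=42)
print(len(train_idx), len(valid_idx))
```

With 119 samples and an 8:2 ratio, this yields 95 training and 24 validation samples, the same set sizes as reported above.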

Four evaluation indicators based on the confusion matrix were used to evaluate the classifiers: accuracy, sensitivity, specificity, and Youden index (YI). The results on the validation set (Supplementary Table S4, https://links.lww.com/RHU/A662) showed that the sensitivity of the NN model reached 0.92, whereas the specificity was only 0.73, and the overall accuracy was 0.83. This gap might be caused by the uneven ratio of ineffective and effective patients in the training data set. We therefore introduced cost-sensitivity into the NN and established a cost-sensitive NN model that increases the penalty for minority-class samples, which has an effect similar to increasing the number of minority-class samples, to address the sample imbalance problem (Fig. 2). The validation results of the cost-sensitive neural network (CSNN) model showed that the sensitivity decreased to 0.85, but the specificity improved to 0.82, so that the 2 reached a better balance. Although the overall accuracy did not change, the YI improved, indicating that CSNN can mitigate the sample imbalance problem to some extent.
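The four indicators can be computed from a confusion matrix as in the sketch below. The counts are hypothetical: they were chosen to be consistent with a validation set of 13 ineffective and 11 effective patients, but the confusion matrices themselves are not reported in the text. Treating ineffective treatment (label 1) as the positive class is also an assumption.

```python
def evaluate(tp, fn, tn, fp):
    """Return sensitivity, specificity, accuracy, and Youden index."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    youden = sensitivity + specificity - 1
    return sensitivity, specificity, accuracy, youden


# Hypothetical counts: 13 positives (11 found), 11 negatives (9 found).
sens, spec, acc, yi = evaluate(tp=11, fn=2, tn=9, fp=2)
print(round(sens, 2), round(spec, 2), round(acc, 2), round(yi, 2))
```

Because the Youden index adds sensitivity and specificity, it rewards a classifier that balances the two classes even when overall accuracy stays the same.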

FIGURE 2: Framework diagram of treatment effectiveness prediction model.

To achieve data enhancement, we proposed a small-sample classification method based on VAE and NN classifiers, namely, the VAENN model. The method flow was as follows (Fig. 2): (1) train the VAE model using the training set and optimize the model parameters; (2) generate synthetic samples using the generator of the VAE (1500 iterations, 800 generated samples); (3) train the NN classifier on the generated samples after down-sampling (effective: 296, ineffective: 296); (4) validate the VAENN performance using the validation set.
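Step (3), balancing the generated samples by down-sampling the larger class, can be sketched as follows. The class counts of the 800 generated samples (296 effective and 504 ineffective) are an assumption inferred from the 296:296 figure above, and the integer sample identifiers stand in for real feature vectors.

```python
import random


def balanced_downsample(samples, labels, seed=0):
    """Randomly keep the same number of samples from every class."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    # Every class is reduced to the size of the smallest class.
    n_keep = min(len(group) for group in by_class.values())
    balanced = []
    for label, group in by_class.items():
        for sample in rng.sample(group, n_keep):
            balanced.append((sample, label))
    rng.shuffle(balanced)
    return balanced


# Hypothetical generated data: 296 effective (0) and 504 ineffective (1).
samples = list(range(800))
labels = [0] * 296 + [1] * 504
balanced = balanced_downsample(samples, labels)
print(len(balanced))
```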

Supplementary Table S5, https://links.lww.com/RHU/A662, showed that the means of the generated data and the original data were very close, although some indicators showed some deviation. The distributions of several key indicators showed that the numerical distributions of IgE level and B-lymphocyte count were very close (Supplementary Fig. S8A–B, https://links.lww.com/RHU/A662). For the categorical indicators, especially the treatment effect of patients, the proportions in the generated data and the original data were very close (Supplementary Fig. S8C–D, https://links.lww.com/RHU/A662), which showed that VAE was applicable and effective for data simulation and enhancement. The prediction results showed that the specificity of the NN model enhanced by VAE data was as high as 0.91 and the overall accuracy was as high as 0.88, which showed that data enhancement by VAE could improve the robustness of the model and mitigate the data imbalance to a certain extent. To further test the performance of the VAENN classifier, we compared it with traditional classifiers. Supplementary Table S4, https://links.lww.com/RHU/A662, showed that, from the viewpoint of sensitivity, all classifiers were above 0.8 except the logistic regression (LR) model, which was below 0.8, and the NN and support vector machine (SVM) were as high as 0.92, higher than the other classifiers. From the viewpoint of specificity, the difference between the models was obvious, with random forest (RF) and gradient boosting (GB) having a specificity of 0.64 and VAENN performing best at 0.91. From the perspective of overall accuracy, the NN classifiers achieved better learning results than the traditional machine learning models, with VAENN performing best at 0.88. The VAENN model also performed best in terms of the Youden index, at 0.76.

To highlight the effectiveness of the improved models, we plotted ROC curves comparing multiple classifiers (Fig. 3). In terms of curve shape, the ROC curves of the LR and NN models were closer to the upper left corner, indicating a better classification effect; conversely, the ROC curves of the GB, k-nearest neighbor (KNN), and RF models were closer to the lower right corner, indicating a worse classification effect. In terms of area under the curve (AUC), all models except GB and KNN had an AUC over 0.8, which is considered to indicate good classification ability. Among them, CSNN and VAENN had the largest AUC values, indicating that these models had the best classification effect.
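The area under the ROC curve can be computed without plotting, as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below shows that calculation on made-up labels and scores; it is a generic illustration and not the code used to produce Fig. 3.

```python
def roc_auc(labels, scores):
    """AUC as the share of positive-negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores for three positives and three negatives.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why values above 0.8 are read as good classification ability.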

FIGURE 3: ROC of several classifiers. SVM, support vector machine; VAENN, neural network based on variational autoencoder.

To further validate the effect of our synthetic data on model performance, we trained the machine learning models on synthetic data and validated them on the validation set (Table 2). The results showed that, after data augmentation, the specificity of every model except KNN (unchanged) improved to varying degrees compared with the previous models, with RF and NN improving the most, by 0.18. In terms of accuracy and YI, the KNN model showed a slight decrease, probably because KNN classifies samples by the distance between data points and therefore requires high-quality synthetic data; the accuracy and YI of all other models were maintained or improved. Overall, the results showed that the synthetic data could effectively mitigate the problem of data imbalance.

TABLE 2 - Comparison of Evaluation of Classifiers Based on Generated Datasets

| Model | Sensitivity | Specificity | Accuracy | YI |
| --- | --- | --- | --- | --- |
| VAE + RF | 0.85 (+0) | 0.82 (+0.18) | 0.83 (+0.08) | 0.68 (0.19) |
| VAE + GB | 0.77 (−0.08) | 0.73 (+0.09) | 0.75 (+0) | 0.5 (+0.01) |
| VAE + LR | 0.77 (+0) | 0.91 (+0.09) | 0.83 (0.04) | 0.68 (+0.08) |
| VAE + KNN | 0.77 (−0.08) | 0.73 (+0) | 0.75 (−0.04) | 0.5 (−0.08) |
| VAE + SVM | 0.85 (−0.07) | 0.82 (+0.09) | 0.83 (0) | 0.67 (+0.02) |
| VAE + NN | 0.85 (−0.07) | 0.91 (+0.18) | 0.88 (+0.05) | 0.76 (+0.09) |

SVM, support vector machine; VAENN, neural network based on variational autoencoder.


DISCUSSION

Previous research has provided some guidance for physicians in assessing treatment effectiveness in SLE-ITP patients.30 Specifically, studies have indicated that deficiencies or abnormalities in ApoB, a protein involved in lipid metabolism, may contribute to coagulation disorders and bleeding tendencies.31 In this study, ApoB was a good efficacy factor in the Cox multifactor analysis, and stratified analysis revealed that low values of ApoB adversely affected the therapeutic effect of SLE-ITP, suggesting a potential link between ApoB and thrombocytopenia. In addition, among patients with ALT above normal levels, fewer were in the ineffective group than in the effective group, but the Cox univariate analysis showed that ALT above normal levels was a bad influence factor. This highlights the potential impact of elevated ALT levels on treatment effectiveness. In some clinical studies, it is often assumed that low C3 levels are indicative of a more severe disease state. To better understand the relationship between C3 levels and treatment response in SLE-ITP patients, we used restricted cubic spline analysis (Supplementary Fig. S9, https://links.lww.com/RHU/A662). The results demonstrated a linear association between C3 levels and HR, with an optimal cutoff of 0.719, at which the HR was 1, indicating no difference in treatment response. Overall, these findings emphasize that the identification of factors influencing the efficacy of SLE-ITP treatment can be affected by sample size limitations, potential interactions within multifactorial analyses such as Cox regression, and missing data and selection bias. Therefore, future research should use more robust methodologies to determine the most reliable factors influencing the effectiveness of SLE-ITP treatment.

Artificial intelligence has made great strides in recent years and has shown great promise in a variety of fields, including healthcare.32 However, combined with clinical observations accumulated over the years, it is important to recognize that there are many contradictions in AI-assisted healthcare33 (Supplementary Fig. S10, https://links.lww.com/RHU/A662). The first is the contradiction between automated decision-making and manual intervention; in some complex cases,34 such as rare diseases and individual differences, doctors' professional knowledge and experience are still indispensable in the decision-making process. The second is the contradiction between data sharing and data security; how to achieve effective sharing and utilization of medical data while protecting patient privacy is a point of conflict that needs to be balanced.35 Further, the lack of transparency in AI-based decision-making makes it difficult to explain the reasons behind a particular recommendation or diagnosis,36 and the degree of importance of features derived from different algorithms often varies (Supplementary Fig. S11, https://links.lww.com/RHU/A662), leading to a contradiction between the reliability of medical decision-making and clinical observation. In addition, there are contradictions between AI's technical feasibility and clinical practicality, as well as liability and legal issues.37 Overall, AI has the potential to enhance clinical practice and improve patient outcomes. However, integrating AI into healthcare requires careful consideration of these contradictions and a balanced approach that combines the benefits of AI with the expertise and judgment of healthcare professionals to lead to more effective and personalized care.38

Thrombocytopenia in SLE is heterogeneous. It may precede other manifestations of SLE by months or even years and, in that setting, appears to have a good prognosis. Thrombocytopenia in SLE can be mild, moderate, or severe, with different therapeutic responses. Because of the limited total amount of data, the number of patients in each category is not balanced, and the accuracy of AI models in differentiating them is not ideal. In the future, as the quality and quantity of data improve, we will conduct more detailed studies on the relationship between disease severity and treatment effects. This study also has the following limitations. First, bias in data selection may lead to some errors in the study results. Second, some important factors may have been missed during the analysis because of missing values for some indicators. Finally, the quality of the synthetic data depends on the original dataset, and as the complexity of the data increases, the synthetic data may not represent the real data and may lead the model to learn incorrect patterns.

In conclusion, this study analyzed the factors influencing the efficacy of SLE-ITP treatment by defining the “treatment time” of patients. The results showed that B-lymphocyte count, H-cholesterol level, and complement C3 level could be used as predictors of SLE-ITP efficacy, and that abnormal levels of ALT, IgA, and ApoB indicated poor treatment response. In addition, the NN model based on cost-sensitive learning and VAE predicted the treatment effect of patients with better accuracy than the traditional machine learning models.

KEY POINTS

B-lymphocyte count, H-cholesterol level, C3 level, anticardiolipin antibody, and other indicators could be used as predictors of SLE-ITP curative effect.

Abnormal levels of ALT and low levels of IgA and ApoB indicated adverse treatment response.

The NN treatment effect prediction model was based on cost-sensitivity and VAE, with an overall accuracy close to 0.9 and a specificity of more than 0.9.

ACKNOWLEDGMENTS

The authors thank all members of Jiangsu Lupus Collaborative Group who followed up the patients and helped with data collection. They acknowledge the Cinkate Corp for helping in building and managing their lupus database.

REFERENCES

1. Kistangari G, McCrae KR. Immune thrombocytopenia. Expert Rev Hematol. 2013;27:495–520.
2. Jiang Y, Cheng Y, Ma S, et al. Systemic lupus erythematosus-complicating immune thrombocytopenia: from pathogenesis to treatment. J Autoimmun. 2022;132:102887.
3. Sekhon SS, Roy V. Thrombocytopenia in adults: a practical approach to evaluation and management. South Med J. 2006;99:491–498.
4. Kado R, McCune WJ. Treatment of primary and secondary immune thrombocytopenia. Curr Opin Rheumatol. 2019;31:213–222.
5. George JN, Aster RH. Drug-induced thrombocytopenia: pathogenesis, evaluation, and management. Hematology Am Soc Hematol Educ Program. 2009;153–158.
6. George JN, Raskob GE, Shah SR, et al. Drug-induced thrombocytopenia: a systematic review of published case reports. Ann Intern Med. 1998;129:886–890.
7. Cines DB, Liebman H, Stasi R. Pathobiology of secondary immune thrombocytopenia. Semin Hematol. 2009;46(1 Suppl 2):S2–S14.
8. Mestanza-Peralta M, Ariza-Ariza R, Cardiel MH, et al. Thrombocytopenic purpura as initial manifestation of systemic lupus erythematosus. J Rheumatol. 1997;24:867–870.
9. Jung JH, Soh MS, Ahn YH, et al. Thrombocytopenia in systemic lupus erythematosus: clinical manifestations, treatment, and prognosis in 230 patients.
