Artificial neural network - an effective tool for predicting the lupus nephritis outcome

Data collection

This was a single-centre study including retrospective data of 58 patients with diagnosed systemic lupus erythematosus (SLE) and biopsy-proven LN. The SLE diagnosis was based on the EULAR/ACR classification criteria [7]. The following clinical parameters were included: age, gender, serum creatinine concentration, estimated glomerular filtration rate (eGFR) calculated with the MDRD equation, C3 and C4 concentrations, serum albumin, extent of proteinuria measured as the urine protein to creatinine ratio (UPCR), erythrocyte sedimentation rate (ESR), C-reactive protein (CRP) concentration, and erythrocyturia assessed as the number of red blood cells (RBC) per high-power field (HPF).

All parameters were collected at the time of kidney biopsy. Only patients with significant proteinuria (UPCR > 1.0 mg/mg) were included in the study group. After 6 months of follow-up, complete remission (CR) of LN was defined as UPCR < 0.5 mg/mg and stable renal function, according to the KDIGO guidelines [8]. All patients were treated according to the EURO-LUPUS regimen, using six intravenous pulses of cyclophosphamide (500 mg each) followed by oral mycophenolate mofetil, unless contraindicated [9].

Statistical scoring

The performance of the artificial neural network models was assessed with the following statistical indicators: area under the receiver operating characteristic curve (AUROC), accuracy, precision, recall, and F1-score. AUROC was used to assess the discriminant power of the artificial neural network.
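As an illustration, all five indicators can be computed with scikit-learn (a minimal sketch; the names y_true, y_pred, and y_prob are placeholders, not taken from the original code):

import numpy as np
from sklearn.metrics import (roc_auc_score, accuracy_score,
                             precision_score, recall_score, f1_score)

def score_model(y_true, y_pred, y_prob):
    # y_true: true remission labels (0/1), y_pred: predicted labels,
    # y_prob: predicted probability of complete remission
    return {
        "AUROC": roc_auc_score(y_true, y_prob),
        "Accuracy": accuracy_score(y_true, y_pred),
        "Precision": precision_score(y_true, y_pred),
        "Recall": recall_score(y_true, y_pred),
        "F1-score": f1_score(y_true, y_pred),
    }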

Artificial neural network

The entire project was created and run in the Python 3.6.8 environment. Incomplete rows containing blank cells were removed from the original database; this reduced the amount of available data but yielded a 100% complete dataset. In our previous work we analysed mostly random forest classifiers, due to their better performance compared with neural networks [10].

An artificial neural network is a complex structure consisting of several basic units, called artificial neurons. In its simplest form, these are perceptrons containing several inputs with assigned weights and one output. The functions responsible for building a multi-layer perceptron (MLP) came from the scikit-learn library. A perceptron is, to some extent, analogous to a biological neuron with many dendrites but only one axon. The core of the perceptron is an activation function superimposed on the sum of the products of the neuron's inputs and the corresponding weights. A bias term affects performance and results in better fitting to the data. Neurons are arranged in interconnected layers; in a multi-layer perceptron, these layers are organized into an input layer, hidden layers, and output neurons. Depending on the number of neurons and layers, different complexity may be obtained. Naturally, the more complex the network, the greater its capabilities, but also the more time needed to train it. The output of a single neuron can be written as:

$$\text{output}=f_{\text{act}}\left(\text{bias}+\sum_{i=1}^{n}\text{input}_{i}\cdot \text{weight}_{i}\right)$$
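A minimal NumPy sketch of this forward pass (illustrative only; the function and array names are our own, not the authors' code):

import numpy as np

def neuron_output(inputs, weights, bias, activation):
    # bias plus the weighted sum of inputs, passed through the activation function
    return activation(bias + np.dot(inputs, weights))

# example with an identity activation, which simply returns the weighted sum
print(neuron_output(np.array([0.2, 0.7]), np.array([0.5, -0.3]), 0.1, lambda z: z))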

The activation function is analogous to the excitability threshold of a biological neuron. In the MLP, this is the ReLU function, which returns zero for all non-positive values and returns the input value for positive values.

$$f\left(x\right)=\max\left(0,x\right)=\left\{\begin{array}{ll}x & \text{if } x>0\\ 0 & \text{otherwise}\end{array}\right.$$

The activation function for the output layer in the MLP is the logistic function, given by the following formula:

$$\sigma \left(x\right)=\frac{1}{1+e^{-x}}$$
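Both activations are straightforward to express in code (a sketch for illustration, not the scikit-learn internals):

import numpy as np

def relu(x):
    # hidden-layer activation: zero for non-positive inputs, identity otherwise
    return np.maximum(0, x)

def logistic(x):
    # output activation: squashes the weighted sum into a (0, 1) probability
    return 1.0 / (1.0 + np.exp(-x))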

The complexity of the MLP neural network is related to the number of samples in the training set, the number of input features, the number of predicted classes, and the numbers of neurons in the respective layers. In mathematical notation it is written as O(n·m·h1·h2·o·i), where "n" is the number of samples in the training set, "m" is the number of input features, and "o" is the number of predicted classes. The sizes of the hidden layers are h1 and h2, respectively, and "i" denotes the number of iterations leading to the best model.

The completed database was recursively split into subsets per column: each subset contained data for all patients, but only for selected columns. The selection of input parameters was based on a recursive search of the subset space, individual evaluation of each subset, selection of hyperparameters, and evaluation on the test set. Initially we considered applying heuristics to optimize the models, but with hidden-layer sizes ranging from 1 to 45 neurons we did not experience an appreciable loss of resources using a brute-force search. Naturally, we are aware that heuristics in model optimization are necessary for more advanced models and larger input data. The search for optimization solutions for modelling in medicine can be an interesting subject of research and bring enormous progress in the field of personalized medicine. The main hyperparameters of the neural network are the numbers of neurons in the individual hidden layers. Owing to the speed of the calculations and their parallelism, we used a for-loop nested in a for-loop and limited the maximum number of neurons in a single layer to 150. We are aware that the complexity was high, but in practice we were able to trace how the performance of the network changes depending on its structure; this, however, is not the subject of this work and is discussed in another of our works [10]. The performance measured by AUROC and accuracy was saved, and finally the best configuration was chosen, allowing the most accurate prediction of complete remission.
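A condensed sketch of such a nested brute-force search over hidden-layer sizes (the file name, column name, split, and search bounds are illustrative assumptions, not the authors' exact pipeline):

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score

# hypothetical input file; rows with blank cells are dropped for a complete dataset
df = pd.read_csv("ln_patients.csv").dropna()
X, y = df.drop(columns=["complete_remission"]), df["complete_remission"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

best = (0.0, None)
for h1 in range(1, 151):          # neurons in the first hidden layer
    for h2 in range(1, 151):      # neurons in the second hidden layer
        clf = MLPClassifier(hidden_layer_sizes=(h1, h2), max_iter=1000,
                            random_state=42).fit(X_train, y_train)
        auroc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
        if auroc > best[0]:
            best = (auroc, (h1, h2))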
