Personalized Spiking Neural Network Models of Clinical and Environmental Factors to Predict Stroke

To model the differences between low-risk and high-risk patterns in each person's environmental data, personalized models were created separately for 804 individuals from the data set. In our experiment, the PSNN model of each person \(x\) was trained on a 7-day time-window \(Te\) of environmental data from the \(k\) nearest neighboring individuals of this person (selected using the WWKNN method). It was then tested 7 times using environmental samples of different lengths from this person (the testing data length varied from a 7-day period to a 1-day period prior to stroke occurrence). Figure 3 shows that when the PSNN models were tested with 7-day environmental samples prior to the stroke, the high-risk and low-risk samples were correctly classified for 488 individuals. However, this number decreased when the PSNN models were tested using a shorter time-length (a 6-day to 1-day period) for prediction of stroke occurrence on the 7th day. The findings in Fig. 3 suggest that the models of these 488 individuals captured associations between 7-day environmental data changes and their risk of stroke, forming a subgroup of individuals \(G\). Our hypothesis is that every new individual who has clinical variables similar to those of the subgroup \(G\) can benefit from a PSNN that predicts their stroke risk using 7 days of environmental data. For the remaining 804 − 488 = 316 individuals, other suitable PSNN models should be explored, using a larger window \(Te\) of environmental data (e.g., 8, 9, 10, …, 20 days, as suggested in [47]). Here, for each time-window, a separate subgroup of individuals can be identified whose clinical variables are associated with the environmental variables during this time-window. We have studied which clinical variables define the subgroup \(G\) of 488 individuals, for whom 7 days of environmental variables can be used to predict their risk, in contrast to the remaining 316 individuals. This study is important for the future applicability of the proposed method in clinical practice.

Fig. 3

(a) The design of the testing data (environmental time-series, in our case from 7 days down to 1 day of data). (b) The PSNN models differentiated the “high-risk environment” vs “low-risk environment” for 488 individuals when tested with 7 days of environmental data prior to stroke occurrence. This indicates that there is an association between the 7-day environmental changes and the risk of stroke occurrence for a subgroup of 488 individuals in the whole population. The number of individuals with a correct prediction of the low-risk environmental period (risk of stroke) decreased when the length of the testing environmental time-series was shortened from 7 days to 1 day

As stated earlier, every PSNN model was tested 7 times using different lengths of the environmental period prior to the stroke. Among these 488 individuals, we therefore selected the subset whose high-risk environmental periods were detected correctly in at least 4 of the 7 testing rounds (e.g., 1, 2, 3, and 4 days before the stroke) as the group of patients strongly affected by current environmental changes. This subset represents those individuals for whom causal interactions in the longitudinal environmental time-series, together with their personal clinical data, contributed strongly to increasing their risk of stroke. As a result, 169 individuals were selected for further quantitative analysis of their PSNN models. The 804 individuals were thus categorized into two groups: (1) the affected group (AG) of 169 patients (accurate prediction in at least 4 testing rounds, e.g., 1, 2, 3, and 4 days before the stroke) and (2) the non-affected group (NAG) of 635 patients.
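The grouping rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `assign_group`, the list-of-booleans input, and the ordering of the rounds (index 0 is the 7-day test, index 6 is the 1-day test) are hypothetical and are not taken from the study's implementation.

```python
def assign_group(round_results, min_correct=4):
    """Assign a person to the affected (AG) or non-affected (NAG) group.

    round_results: seven booleans, one per testing round. By assumption,
    index 0 is the test with 7 days of data and index 6 the test with 1 day.
    """
    if len(round_results) != 7:
        raise ValueError("expected exactly 7 testing rounds")
    # Membership in the 488-person subgroup: correct with the full 7-day window.
    correct_with_full_window = bool(round_results[0])
    # Strongly affected: correct in at least `min_correct` of the 7 rounds.
    enough_rounds = sum(bool(r) for r in round_results) >= min_correct
    return "AG" if correct_with_full_window and enough_rounds else "NAG"


# Hypothetical outcomes for two people.
print(assign_group([True, True, True, True, False, False, False]))   # AG
print(assign_group([True, True, False, False, True, False, False]))  # NAG
```

Applied to all 804 individuals, a rule of this kind would yield the two group sizes reported in the text, provided the per-round outcomes are available.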

To identify the between-group differences, we analyzed the distribution of the patients (in percentage) in the affected and non-affected groups with respect to their family health history (Fig. 4a) and their personal health history (Fig. 4b). Figure 4c shows the differences in the mean values of some clinical health features in the AG vs the NAG.

Fig. 4

Clinical records of patients in the two groups: affected vs non-affected by environmental changes. (a) The number of individuals with a history of health issues in their family records shows that most of them had family members who had a stroke in the past; (b) the number of individuals with a history of health issues in their personal clinical records shows a higher level of cholesterol, diabetes, vascular/heart disease, comorbidity, serous full, and medication for the affected group; (c) the mean values of the last measured personal clinical health variables show greater values for age (over 65), weight, systolic blood pressure (over 155 mm of mercury, mmHg), and diastolic blood pressure (over 80 mmHg) for the affected group

Our findings suggest that the risk of stroke in the studied population was associated with certain environmental changes when the individuals belonged to a defined cluster of the following clinical risk factors: family health history (stroke in the family, diabetes in the family; depicted in Fig. 4a); personal health history (high cholesterol, vascular/heart disease; depicted in Fig. 4b); and greater values of age, weight, and blood pressure (depicted in Fig. 4c).

To investigate how the interactions between environmental variables during the chosen time-window of 7 days before stroke affected an individual's risk of stroke, we built personalized models for each of these 169 patients to capture the within-group differences of high-risk vs low-risk environmental periods. Here, for every individual \(x\) among these 169 patients, we selected a cluster of patients with similar clinical data using the WWKNN method. The size of the selected cluster differs for each of these 169 individuals, depending on the density of similar individuals within the neighborhood radius. Figure 5 plots the number \(k\) of similar samples selected for each of these 169 individuals to build the 169 PSNN models. Each PSNN model was trained with two sets of environmental time-series (from the high-risk and low-risk classes) that belong to the \(k\) nearest individuals to an individual \(x\). These environmental time-series were encoded into spikes that represent upward and downward changes in the values of the environmental features over 7-day periods in both high-risk and low-risk intervals.
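To make the idea of encoding upward and downward changes as spikes concrete, the following minimal sketch uses a simple threshold-based scheme: a positive spike when the value rises by more than a threshold, a negative spike when it falls by more than the threshold. The text does not name the encoding algorithm or its parameters, so the function names, the threshold, and the toy PM2.5 values below are all assumptions for illustration.

```python
def encode_spikes(series, threshold):
    """Encode consecutive changes as +1 (increase), -1 (decrease), or 0 (no spike)."""
    spikes = []
    for previous, current in zip(series, series[1:]):
        change = current - previous
        if change > threshold:
            spikes.append(1)    # upward change large enough: positive spike
        elif change < -threshold:
            spikes.append(-1)   # downward change large enough: negative spike
        else:
            spikes.append(0)    # change within the threshold: no spike
    return spikes


def count_spikes(spikes):
    """Return (number of positive spikes, number of negative spikes)."""
    positives = sum(1 for s in spikes if s == 1)
    negatives = sum(1 for s in spikes if s == -1)
    return positives, negatives


# Made-up PM2.5 readings, not data from the study.
pm25 = [10.0, 12.5, 12.6, 15.0, 14.0, 11.0, 11.2]
spikes = encode_spikes(pm25, threshold=0.5)
print(spikes)                # [1, 0, 1, -1, -1, 0]
print(count_spikes(spikes))  # (2, 2)
```

Counting positive and negative spikes per feature in this way gives the kind of summary that can be averaged across individuals.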

Fig. 5

For each of the personalized models of individuals \(x\) (shown on the y-axis), \(k\) neighboring samples were selected with respect to a neighborhood radius \(r\), which is an adaptive threshold (\(r = \mu + \sigma\)) and takes a different value for each personalized model. This led to the selection of an optimal value of \(k\) in each personalized model (\(k\) is shown on the x-axis); on average, \(k = 57.5\)

Figure 6a depicts the average number of positive and negative spikes derived from the 7-day environmental data in high-risk samples. It indicates that in the high-risk environment, the values of CO, NO2, O3, SO2, PM10, and PM2.5 increased more often than they decreased, thereby generating more positive spikes than negative ones. On the other hand, the values of temperature, wind-speed, wind-direction, and solar radiation, which are inter-related climatic conditions, decreased more often than they increased. These patterns show the associated environmental changes over the 7 days before stroke occurrence that influenced the risk of stroke for these 169 affected patients in Auckland in 2011–2012. Except for O3, the mentioned pollutants are mainly generated by burning fossil fuels. The presence of NO2 and SO2 together with water and oxygen results in the production of nitric, nitrous, and sulfuric acids. Particulate matter (PM), especially PM2.5, can penetrate the lungs because of its small size, which triggers respiratory diseases [48]. These particles can also enter the blood circulation system, which may lead to chronic diseases and cause vascular inflammation and hardening of the arteries that may result in ischemic stroke or heart attack [49,50,51]. Our findings in Fig. 6a are in line with the literature that suggested PM2.5 as a risk factor for stroke occurrence [49, 52]. Figure 6a also shows an association between an increase in ozone (O3) and the high-risk period of stroke occurrence. Ozone is an allotrope of oxygen that can be generated by short wavelengths of the ultraviolet spectrum, particularly UV-C (200–280 nm) and vacuum UV (100–200 nm) [53]. Ozone has been seen to alter the blood coagulation mechanism and cause irregular heart rate and systemic inflammatory responses [54, 55], and hence has been reported in the literature to be associated with stroke occurrences [56, 57].

Fig. 6

(a) The number of positive and negative spikes (mean values) related to the increases and decreases in environmental time-series for the high-risk period, averaged across all the 169 individuals. (b) The level of influence (causal relationship) that one variable has on the others over 7 days of high-risk (in orange color) and low-risk (in blue color)

The encoded spikes from 7-day environmental data were used as input data for training PSNN models. The environmental features were mapped into a 3D PSNN model that topologically preserves the temporal differences of the data features. This is performed by computing the correlation between the spike trains of all the 10 environmental features. The most correlated features are mapped to closer input neurons inside the PSNN.
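The correlation step that drives this mapping can be illustrated with a short sketch. It computes the Pearson correlation between every pair of spike trains and ranks the pairs, so that the highest-ranked pairs are the candidates for neighboring input neurons. The choice of Pearson correlation, the function names, and the toy spike trains are assumptions; the actual placement of neurons in the 3D space is not specified in the text and is not modeled here.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # constant train: treated as uncorrelated (a convention of this sketch)
    return cov / sqrt(var_a * var_b)


def rank_feature_pairs(spike_trains):
    """Return (correlation, feature_1, feature_2) tuples, most correlated first."""
    pairs = []
    for f1, f2 in combinations(sorted(spike_trains), 2):
        pairs.append((pearson(spike_trains[f1], spike_trains[f2]), f1, f2))
    return sorted(pairs, reverse=True)


# Toy spike trains for three of the ten features (made-up values).
toy_trains = {
    "PM10": [1, 0, 1, -1, 0, 1],
    "PM2.5": [1, 0, 1, -1, 0, 1],
    "temperature": [-1, 0, -1, 1, 0, -1],
}
ranking = rank_feature_pairs(toy_trains)
for corr, f1, f2 in ranking:
    print(f"{f1} / {f2}: {corr:+.2f}")
```

In this toy case the PM10 and PM2.5 trains are identical, so that pair ranks first and would be placed closest together.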

For each of the 169 individuals in the affected group, we developed two separate PSNN models to map and model the temporal environmental changes of the high-risk and low-risk periods and to study the differences. The PSNN models were spatially mapped into the 3D space of spiking neurons and trained on the environmental time-series. While learning from the 7-day data with the unsupervised STDP learning algorithm [20], the mapped PSNN models learned the temporal associations “hidden” between the environmental features. Figure 6b shows the level of causal interactions that each environmental feature has with the other features during the 7 days, averaged across all the 169 PSNN models in high risk (red) vs low risk (blue). This shows greater causal interaction in the high-risk period than in the low-risk period, reflecting the associated environmental risk factors.

When the PSNN models are learning from environmental data using the unsupervised STDP learning algorithm [20], the spatio-temporal relationships between the features are formed as weighted connections.
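As a rough illustration of how spike timing turns into connection weights, the sketch below implements a generic pair-based STDP update: a connection is strengthened when the presynaptic neuron fires shortly before the postsynaptic one and weakened in the opposite order. The exact rule and parameters used in the PSNN models are those of [20] and are not given in this section, so the exponential form, the function name, and the values of `a_plus`, `a_minus`, and `tau` are assumptions.

```python
from math import exp


def stdp_weight_change(t_pre, t_post, a_plus=0.01, a_minus=0.01, tau=10.0):
    """Weight change for one pre/post spike pair under a pair-based STDP rule.

    t_pre, t_post: spike times of the pre- and postsynaptic neurons.
    a_plus, a_minus, tau: illustrative constants, not values from the study.
    """
    dt = t_post - t_pre
    if dt > 0:
        return a_plus * exp(-dt / tau)    # pre fires before post: strengthen
    if dt < 0:
        return -a_minus * exp(dt / tau)   # post fires before pre: weaken
    return 0.0                            # simultaneous spikes: no change here


# Accumulate the change over a few hypothetical spike pairs.
weight = 0.0
for t_pre, t_post in [(1.0, 3.0), (10.0, 12.0), (25.0, 24.0)]:
    weight += stdp_weight_change(t_pre, t_post)
print(weight)
```

Under a rule of this kind, features whose spikes repeatedly precede those of another feature build up a stronger directed connection, which is what the weighted connections in the trained models summarize.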

Figure 7 illustrates the absolute value of positive and negative connection weights in the PSNN models of the 169 individuals, trained on high-risk (a) and low-risk (b) environmental data. A comparison of Fig. 7a and b shows that the absolute value of the connections is higher in the high-risk period than in the low-risk period. This suggests that frequent fluctuations in environmental features may act as external risk factors that increase the risk of stroke occurrence. For statistical analysis, we extracted the quantitative information of the connection weights from the 169 patients’ PSNN models of high-risk and low-risk environments and used ANOVA to measure the t-test \(p\)-values as reported in Table 1.
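A comparison of this kind could be set up as in the sketch below, which computes a paired t statistic on per-patient interaction values from the high-risk and low-risk models. The text does not state whether the test was paired, so the pairing is an assumption, as are the function name and the toy values. The sketch stops at the t statistic and degrees of freedom; converting them to a \(p\)-value requires the CDF of the t distribution from a statistics library.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(high, low):
    """Return (t statistic, degrees of freedom) for paired samples.

    high, low: per-patient values from the high-risk and low-risk models,
    in the same patient order (pairing is an assumption of this sketch).
    """
    if len(high) != len(low) or len(high) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [h - l for h, l in zip(high, low)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences are constant; t statistic is undefined")
    t = mean(diffs) / (sd / sqrt(len(diffs)))
    return t, len(diffs) - 1


# Made-up interaction levels for four patients, not data from the study.
high_risk = [2.0, 3.0, 5.0, 4.0]
low_risk = [1.0, 1.0, 2.0, 2.0]
t_value, dof = paired_t_statistic(high_risk, low_risk)
print(round(t_value, 3), dof)
```

Running this once per environmental variable would give one statistic per row of a table like Table 1.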

Fig. 7

The sum of the absolute values of positive and negative connection weights in each of the trained PSNN models (for 169 patients) in high risk (red) vs low risk (blue)

Table 1 A t-test \(p\)-value demonstrates the significant difference between the level of interactions for each environmental variable across 169 patients’ models in high risk vs low risk. Variables SO2 and PM10 have shown the lowest \(p\)-values followed by CO and PM2.5 variables, representing the most important variables for discriminating the two groups

Personalized Profiling of Individual Risk of Stroke Using Environmental Data

The study of interactions among environmental variables over time, related to personal data before stroke occurrence, is a challenging task, as several variables can influence one another, either directly or indirectly. Here, the proposed personalized modeling method and system provide an informative and explainable profile of an individual that explains the relationships between environmental variables that potentially increased the risk of stroke for a person or a group of persons. Using the proposed PSNN method and system, we can create a personalized profile for each person that results in an improved understanding of the personal factors that increased the risk of stroke. Figure 8a represents the PSNN models (trained on high-risk and low-risk environmental time-series) of a 21-year-old female patient who had a stroke on 18 Nov 2011 in Auckland, NZ. The PSNN models demonstrate that the spatio-temporal relationships between the environmental variables differ in high-risk vs low-risk environments for this patient, who had the following conditions: epilepsy, head injury, migraine, and a family history of heart attack, hypertension, and diabetes.

Fig. 8

(a) PSNN models trained on 7-day environmental data in high-risk and low-risk periods for one randomly selected patient (a 21-year-old female who had a stroke on 18 Nov 2011 in Auckland, NZ) with the following conditions: epilepsy, head injury, migraine, and a family history of heart attack, hypertension, and diabetes. (b) The feature interaction network (FIN) shows the level of interactions between environmental features during the 7 days. (c) The percentage of activated neurons in the PSNN models representing environmental variables indicates the importance of these variables for stroke prediction within the cluster of patients closest to the selected individual

The amount of spatio-temporal interaction between these environmental variables (shown in Fig. 8a) is measured by a feature interaction network (FIN) graph, illustrated in Fig. 8b. For this patient, the high-risk FIN graph shows large interactions between the variables NO2, wind-direction, and PM2.5; between PM10 and PM2.5; and between O3, solar, SO2, and temperature, which explains how the changes in some features influenced the changes in other features over the 7 days before the stroke. On the other hand, a different level of interaction was measured in the low-risk environmental period for this patient. These findings are personalized and can differ for another patient, suggesting that the proposed PSNN modeling is a promising approach to capturing individual characteristics, which can potentially lead to the customization of healthcare, decision-making, treatments, and practices as the models are tailored to individual information.

Figure 8c shows that the data from high-risk and low-risk environmental periods produced different activated areas (shown in %) around each environmental feature in the PSNN models. A larger activated area around an environmental feature indicates stronger influential changes in the value of this feature during the 7 days of high-risk (Fig. 8c-left) and low-risk (Fig. 8c-right) environments. These areas point to environmental markers that are important in increasing the risk of stroke occurrence for an individual.

Figure 9 presents the personalized profiles of another two randomly selected patients from two clusters of subjects with the following information: age > 70, a family history of stroke, high cholesterol, diabetes, vascular/heart disease. These patients had a stroke on 21 Apr 2011 and 30 Jan 2012, respectively, in Auckland, NZ. The models were separately trained with 7-day data of high-risk environmental periods from the \(k\) nearest neighboring individuals of these patients. The right-side graphs show the temporal/causal interactions between the environmental features as important measurements for identifying the environmental changes that influenced the risk of stroke.

Fig. 9

Personalized profiling of two patients who had a stroke on (a) 29 Apr 2011 and (b) 30 Jan 2012 in Auckland, NZ, belonging to two clusters of subjects with the following information: age > 70, family history of stroke, high cholesterol, diabetes, vascular/heart disease. (Left) PSNN connectivity trained with high-risk environmental data (encoded spikes from 7-day data). (Right) Feature interaction network shows the interactions between environmental features over 7 days, where the nodes represent the features, and the thickness of the lines shows the amount of information exchanged between them over time

Figure 9a shows strong interactions among PM10, PM2.5, and NO2, and also among temperature, solar, and wind-speed, during the 7 days in the high-risk period. Figure 9b shows strong interactions between PM10 and PM2.5, and also among temperature, solar, and O3, during the 7 days in the high-risk period.
