Lattice123 pattern for automated Alzheimer’s detection using EEG signal

The self-organized AD detection model has the following layers: (1) feature extraction, comprising EEG signal decomposition using MDWT (which enabled downstream multilevel feature generation, thereby mimicking deep learning) and Lattice123-based feature engineering (see section "Dataset"); (2) the INCA feature selector (Tuncer et al. 2020b) to remove redundant features, thereby reducing data dimensionality; (3) a standard shallow k-nearest neighbor (kNN) classifier (Peterson 2009) to calculate channel-wise results; (4) IHMV (Dogan et al. 2021) to generate additional channel-wise voted prediction vectors; (5) a greedy algorithm to calculate the best channel-wise results; and (6) IHMV plus the greedy algorithm to generate additional overall voted prediction vectors and to calculate the overall best results, respectively. Our model was implemented in the MATLAB (2021a) programming environment on a computer with 16 GB memory, an Intel i7 7700 processor, and the Windows 11 operating system. A graphical overview of the proposed Lattice123 pattern-based model is given in Fig. 1. The steps involved in each of these layers are detailed in the following subsections.

Fig. 1

Block diagram of the proposed model: a model overview and b Lattice123-based feature extraction. In this work, we have generated two paths (maximum and minimum) by deploying the probabilistic path generation function, applying three feature extraction functions, and generating 6 (= 3 × 2) feature vectors

The abbreviations used in this figure are as follows. AD: Alzheimer’s disease; F: concatenated extracted feature vector; f: extracted feature vector; HC: healthy control; L: low-pass filter wavelet bands; s: selected feature vector.

In this work, each EEG record contained 59 channels, each producing a spatially unique signal utilized as an input signal to the model. MDWT was applied to each signal, and four wavelet bands were generated, corresponding to four low-pass filter coefficients. The raw EEG signal and the four wavelet bands underwent Lattice123-based feature extraction to generate six feature vectors each. INCA was then applied to the generated six feature vectors to create six selected feature vectors for each signal, which were input to the kNN classifier to calculate six predicted vectors. IHMV was then applied to the predicted vectors to generate voted predicted vectors. The greedy algorithm was implemented to select the final predicted vector, representing the best channel-wise result. The 59 channel-wise final predicted vectors generated per EEG record were next input to the IHMV function to generate more voted vectors, from which the best overall binary classification result was selected using the greedy algorithm.

Lattice123 pattern

In graph-based feature engineering, features are generated using kernel function operations within the framework of either fixed patterns (Subasi et al. 2021; Tuncer et al. 2021a, 2021b) or adaptive patterns that are dynamically generated based on the signal input (Jiang et al. 2022; Tuncer et al. 2020a). In feature engineering, conventional feature extraction functions are employed as static patterns to generate features. However, these static patterns are limited in producing meaningful features from certain data blocks. Therefore, a dynamic feature extractor is needed to extract the hidden patterns from each block. In this research, we utilized the novel Lattice123 pattern (Fig. 2), which generates two directed graphs using a probabilistic walking path detection function.

Fig. 2

The lattice used for graph generation. There are one (v1), two (v2 and v3), and three (v4, v5, and v6) vertexes in the top three tiers, which explains the name Lattice123. In this research, we have used a nine-level Lattice123 pattern comprising 19 vertexes

The lattice used for graph generation is shown in Fig. 2. The patterns (graphs) are determined using this lattice, which comprises 19 numbered vertexes (v) and 28 directed edges (all angled downwards). First, the vertexes were populated sequentially with the values of the input signal block. Maximum and minimum walking paths starting at v1 and ending at v19 were then calculated to generate two directed graphs for downstream (walking path) feature extraction. Histogram-based features were extracted using the generated graphs; hence, the presented feature extraction model is named the Lattice123 pattern. An overview of the Lattice123 pattern is shown in Fig. 3.

Fig. 3

Overview of the Lattice123 pattern. In this work, we have used a one-dimensional signal and obtained six feature vectors, each of length 256

The presented Lattice123 pattern is a histogram-based feature extraction algorithm, and the steps of this algorithm are given below:

1.

Normalize the input signal to integer values between 1 and 100 by deploying min–max normalization.

$$N = \left\lceil \frac{S - S_{min}}{S_{max} - S_{min}} \times 99 \right\rceil + 1$$

(1)

where \(N\) represents the normalized signal; \(S\), the signal value; \(S_{min}\), the minimum value of the signal; and \(S_{max}\), the maximum value of the signal.

2.

Extract the histogram of the normalized signal.

$$H = \theta \left( N \right)$$

(2)

where \(H\) represents the histogram of the normalized signal; and \(\theta (.)\), the histogram extraction function.

3.

Calculate the probability of each value.

$$pr_i = \frac{H_i}{n}, \quad i \in \left\{ 1, 2, \ldots, 100 \right\}$$

(3)

where \(pr_i\) represents the probability of the ith value; and \(n\), the length of the signal.

4.

Divide the signal into overlapping blocks of length 19.

$$s\left( j \right) = S\left( i + j - 1 \right), \quad i \in \left\{ 1, 2, \ldots, n - 18 \right\}, \; j \in \left\{ 1, 2, \ldots, 19 \right\}$$

(4)

$$v\left( j \right) = N\left( i + j - 1 \right)$$

(5)

where \(s\) represents an overlapping block of the input signal, \(S\); and \(v\), the normalized overlapping block.

5.

Calculate the probability matrix using probability values and relationships.

$$M_{k,j} = pr_{v\left( j \right)}, \quad j, k \in \left\{ 1, 2, \ldots, 19 \right\}$$

(6)

where \(M\) represents the probability matrix; and \(pr_{v(j)}\), the probability of the jth value, where the parent value of the jth value is the kth value.

6.

Using minimization and maximization operations, create two walking paths (directed graphs) from vertex 1 to vertex 19 of the Lattice123 pattern.

$$w_{t+1}^{min} = argmin\left( M_{w_t^{min},:} \right), \quad t \in \left\{ 1, 2, \ldots, 8 \right\}$$

(8)

$$w_{t+1}^{max} = argmax\left( M_{w_t^{max},:} \right)$$

(9)

where \(w\) represents the walking path. In this work, we have generated two walking paths (\(w^{min}\) and \(w^{max}\)), each containing nine values, by using the probability matrix rows (\(M_{w_t,:}\)) of each data block.

7.

Extract feature vectors using the walking paths and three kernels: signum, upper ternary, and lower ternary.

$$\kappa^{1} \left( a, b \right) = \left\{ \begin{array}{ll} 0, & a - b < 0 \\ 1, & a - b \ge 0 \end{array} \right.$$

(11)

$$\kappa^{2} \left( a, b \right) = \left\{ \begin{array}{ll} 0, & a - b \le tr \\ 1, & a - b > tr \end{array} \right.$$

(12)

$$\kappa^{3} \left( a, b \right) = \left\{ \begin{array}{ll} 0, & a - b \ge -tr \\ 1, & a - b < -tr \end{array} \right.$$

(13)

where \(\kappa^{1}(.)\), \(\kappa^{2}(.)\), and \(\kappa^{3}(.)\) represent the signum, upper ternary, and lower ternary kernels, respectively; \(a, b\), the input values of the kernels (here, signal values); and \(tr\), the threshold value for the ternary functions, which, in this model, was calculated as half the standard deviation of the signal. Six bit groups were thus extracted using these three kernels and two walking paths.

$$bit^{c} \left( t \right) = \kappa^{k} \left( s\left( w^{l}\left( t \right) \right), s\left( w^{l}\left( t + 1 \right) \right) \right), \quad t \in \left\{ 1, 2, \ldots, 8 \right\}, \; k \in \left\{ 1, 2, 3 \right\}, \; l \in \left\{ min, max \right\}, \; c \in \left\{ 1, 2, \ldots, 6 \right\}$$

(14)

where \(bit\) represents the binary feature array; and \(c\), the category of the generated bit. Each \(bit\) array contained eight binary features.

8.

Generate feature signals (map signals) using binary-to-decimal transformation.

$$m^{c}\left( i \right) = \sum_{t = 1}^{8} bit^{c}\left( t \right) \times 2^{t - 1}$$

(15)

where \(m\) represents the map signal. Six map signals were generated.

9.

Extract histograms of the map signals.

$$f^{c} = \theta \left( m^{c} \right), \quad c \in \left\{ 1, 2, \ldots, 6 \right\}$$

(16)

Each generated histogram represents a feature vector of length 256 (= \(2^8\)). Six feature vectors were generated in total: the proposed Lattice123 pattern generates two graphs (walking paths) per data block, each of which is utilized as a pattern, and three kernels are used to extract binary features from each graph. A minimal sketch of these nine steps is given below.
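To make the nine steps concrete, the following Python sketch traces Eqs. (1) to (16). It is our own illustration, not the authors' MATLAB code. The exact 19-vertex, 28-edge lattice of Fig. 2 is not recoverable from the text alone, so the level layout below (fully connected consecutive levels) is an assumed placeholder; substitute the published lattice in practice.

```python
import numpy as np

# ASSUMED level layout: 9 levels, 1 + 2 + 3 + 3 + 3 + 3 + 2 + 1 + 1 = 19
# vertexes, consecutive levels fully connected. Replace with the Fig. 2 lattice.
LEVELS = [[0], [1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11],
          [12, 13, 14], [15, 16], [17], [18]]

# The three kernels of Eqs. (11)-(13): signum, upper ternary, lower ternary.
KERNELS = [lambda a, b, tr: int(a - b >= 0),
           lambda a, b, tr: int(a - b > tr),
           lambda a, b, tr: int(a - b < -tr)]

def normalize(signal):
    """Eq. (1): min-max normalize to integers in [1, 100]."""
    s = np.asarray(signal, dtype=float)
    return np.ceil((s - s.min()) / (s.max() - s.min()) * 99).astype(int) + 1

def lattice123(signal):
    """Return six 256-bin histogram feature vectors for a 1-D signal."""
    sig = np.asarray(signal, dtype=float)
    n = normalize(sig)
    hist = np.bincount(n, minlength=101)          # Eq. (2): histogram
    pr = hist / len(n)                            # Eq. (3): value probabilities
    tr = sig.std() / 2                            # ternary threshold
    n_blocks = len(sig) - 18
    maps = np.zeros((6, n_blocks), dtype=int)
    for i in range(n_blocks):
        s = sig[i:i + 19]                         # Eq. (4): block of 19 values
        v = n[i:i + 19]                           # Eq. (5): normalized block
        paths = []
        for pick in (np.argmin, np.argmax):       # Eqs. (8)-(9): min/max walks
            path, cur = [0], 0
            for lvl in range(len(LEVELS) - 1):
                cand = LEVELS[lvl + 1]            # children on the next level
                cur = cand[pick(pr[v[cand]])]
                path.append(cur)
            paths.append(path)                    # 9 vertexes per path
        c = 0
        for path in paths:                        # 2 walks x 3 kernels = 6 maps
            for kern in KERNELS:
                bits = [kern(s[path[t]], s[path[t + 1]], tr)
                        for t in range(8)]        # Eq. (14): 8 bits
                maps[c, i] = sum(b << t for t, b in enumerate(bits))  # Eq. (15)
                c += 1
    return [np.bincount(m, minlength=256) for m in maps]  # Eq. (16)
```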

Feature extraction

The MDWT-based decomposition of the raw input EEG signal yielded four wavelet bands. These banded signals, plus the raw EEG signal, were input to the Lattice123-based feature extraction model. The 11 steps that define the proposed Lattice123-based model are detailed below and in the following subsections.

Step 1: Read channel-wise signals from the EEG record of the study dataset.

Step 2: Apply MDWT using Daubechies 4 (db4) mother wavelet filter function to the raw EEG signal to decompose it into four wavelet subbands corresponding to four low-pass filter coefficients.

$$\left[ L_1 \; H_1 \right] = \vartheta \left( S \right)$$

(17)

$$\left[ L_h \; H_h \right] = \vartheta \left( L_{h-1} \right), \quad h \in \left\{ 2, 3, 4 \right\}$$

(18)

where \(L\) represents the low-band (approximation) coefficients; \(H\), the high-band (detail) coefficients; \(\vartheta (.)\), the discrete wavelet transform function; and \(h\), the wavelet level.
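A brief sketch of the decomposition in Eqs. (17) and (18), assuming the PyWavelets (pywt) library as a stand-in for the MATLAB wavelet functions used by the authors; only the low-pass (approximation) bands L1 to L4 are retained for feature extraction.

```python
import numpy as np
import pywt  # PyWavelets; an assumption, since the original model used MATLAB

def mdwt_lowpass_bands(signal, levels=4, wavelet="db4"):
    """Eqs. (17)-(18): iterate the DWT, keeping the low-pass band each time."""
    bands, current = [], np.asarray(signal, dtype=float)
    for _ in range(levels):
        current, _detail = pywt.dwt(current, wavelet)  # [L_h, H_h]
        bands.append(current)                          # keep the low-pass band
    return bands  # [L1, L2, L3, L4]
```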

Step 3: Extract features from the raw signal and the low-pass wavelet subbands by deploying the Lattice123 pattern.

$$\left[ f_1^1 \; f_1^2 \; f_1^3 \; f_1^4 \; f_1^5 \; f_1^6 \right] = \mathcal{L} \left( S \right)$$

(19)

$$\left[ f_{t+1}^1 \; f_{t+1}^2 \; f_{t+1}^3 \; f_{t+1}^4 \; f_{t+1}^5 \; f_{t+1}^6 \right] = \mathcal{L} \left( L_t \right), \quad t \in \left\{ 1, 2, 3, 4 \right\}$$

(20)

where \(\mathcal{L}(.)\) represents the Lattice123-based feature extraction function; \(S\), the EEG signal; and \(f\), an extracted feature vector of length 256. For instance, \(f_1^1\) is the first feature vector of the raw EEG signal.

Step 4: Merge the feature vectors according to type.

$$F_q \left( j + \left( p \times 256 \right) \right) = f_{p+1}^{q} \left( j \right), \quad p \in \left\{ 0, 1, \ldots, 4 \right\}, \; q \in \left\{ 1, 2, \ldots, 6 \right\}, \; j \in \left\{ 1, 2, \ldots, 256 \right\}$$

(21)

where \(F\) represents the concatenated feature vector of length 1280 (= 256 × 5). Six concatenated feature vectors were obtained from each channel-wise input signal.
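Step 4 amounts to a simple concatenation. A minimal sketch, assuming band_feats holds the six Lattice123 histograms for the raw signal and the four low-pass bands:

```python
import numpy as np

def merge_by_type(band_feats):
    """Eq. (21): concatenate same-type vectors across raw + L1..L4 bands.

    band_feats: list of 5 entries (raw, L1..L4), each a list of six
    256-element feature vectors produced by the Lattice123 pattern.
    """
    return [np.concatenate([band_feats[p][q] for p in range(5)])  # length 1280
            for q in range(6)]
```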

Feature selection

We employed an iterative feature selector known as INCA (Tuncer et al. 2020b), an enhanced version of neighborhood component analysis (NCA) that determines the optimal number of features. Over a series of iterations, additional features are systematically selected, and a loss value calculation function is applied to evaluate the informativeness of the selected feature vector in each iteration. The feature vector with the best-computed loss value is ultimately chosen as the final selected feature vector. The steps involved in feature selection are given below.

Step 5: Apply INCA to calculate the qualified indexes of all features in each concatenated feature vector.

$$id_q = \varphi \left( F_q, y \right)$$

(22)

where \(\varphi (.)\) represents the neighborhood component analysis feature selection function; \(y\), the real output; and \(id\), the qualified indexes array. The most accurate feature vector was selected using the following operations.

$$fs_q^{r} \left( k, j \right) = F_q \left( k, id_q \left( j \right) \right), \quad r \in \left\{ 1, 2, \ldots, fv - iv + 1 \right\}, \; j \in \left\{ 1, 2, \ldots, iv + r - 1 \right\}, \; q \in \left\{ 1, 2, \ldots, 6 \right\}$$

(23)

$$acc_q^{r} = \lambda \left( fs_q^{r}, y \right)$$

(24)

$$in_q = argmax \left( acc_q^{r} \right)$$

(25)

$$s_q \left( k, z \right) = F_q \left( k, id_q \left( z \right) \right), \quad z \in \left\{ 1, 2, \ldots, in_q + iv - 1 \right\}$$

(26)

where \(fs\) represents the selected feature vectors; \(acc\), the accuracy value; \(\lambda (.)\), the accuracy calculation function; \(in\), the index of the most accurate feature vector; \(iv\), the initial value of the loop; \(fv\), the final value of the loop; and \(s\), the final selected feature vector.

These equations describe the process of iterative feature selection using the INCA algorithm: feature vectors are iteratively selected and evaluated to identify the most accurate and informative features for further processing. The loop range is set from 100 to 512 (i.e., \(iv = 100\), \(fv = 512\)), and accuracy is obtained using the kNN classifier function.
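The sketch below mirrors this iterative search. Because scikit-learn's NCA implementation does not expose per-feature weights like MATLAB's fscnca, mutual information is used here as a plainly labeled stand-in ranker; the loop over candidate sizes and the accuracy maximization follow Eqs. (23) to (26).

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def inca(X, y, iv=100, fv=512):
    """Rank features, then search r = iv..fv for the most accurate top-r subset."""
    # Stand-in for Eq. (22): descending feature ranking (NCA weights in the paper).
    ranks = np.argsort(mutual_info_classif(X, y))[::-1]
    knn = KNeighborsClassifier(n_neighbors=1, metric="manhattan")
    sizes = range(iv, min(fv, X.shape[1]) + 1)
    accs = [cross_val_score(knn, X[:, ranks[:r]], y, cv=10).mean()
            for r in sizes]                       # Eqs. (23)-(24)
    best = iv + int(np.argmax(accs))              # Eqs. (25)-(26)
    return X[:, ranks[:best]], ranks[:best]
```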

Calculation of channel-wise predicted vectors

The six selected feature vectors were input to a standard distance-based kNN classifier (Peterson 2009) to calculate the corresponding predicted vectors. The parameter settings were: k, 1; distance, L1-norm (city block); voting, none; validation, tenfold cross-validation (CV).

Step 6: Classify the six selected feature vectors using the 1NN classifier (k = 1) with tenfold CV.

$$p_q = \delta \left( s_q, y \right), \quad q \in \left\{ 1, 2, \ldots, 6 \right\}$$

(27)

where \(p\) represents the predicted vector; and \(\delta (.)\), the kNN classifier function.
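A minimal scikit-learn sketch of this classification step, assuming y is the label vector; the parameters follow the settings listed above.

```python
from sklearn.model_selection import cross_val_predict
from sklearn.neighbors import KNeighborsClassifier

def knn_predicted_vector(selected, y):
    """p = delta(s, y): tenfold cross-validated 1NN predictions."""
    clf = KNeighborsClassifier(n_neighbors=1, metric="manhattan")  # k=1, L1-norm
    return cross_val_predict(clf, selected, y, cv=10)
```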

Calculation of channel-wise voted prediction vectors

IHMV (Dogan et al. 2021) can potentially generate better results in systems that produce multiple results, such as our model, which generated six predicted vectors per channel. IHMV calculates qualified indexes for the predicted vectors, sorted in descending order of accuracy. The predicted vectors are then iteratively (loop range 3 to 6) voted on by deploying the mode function, which generates additional voted vectors.

$$acc_q = \Theta \left( p_q, y \right), \quad q \in \left\{ 1, 2, \ldots, np \right\}$$

(28)

$$id = \xi \left( acc \right)$$

(29)

$$vp_{r - 2} = \omega \left( p_{id_1}, p_{id_2}, \ldots, p_{id_r} \right), \quad r \in \left\{ 3, 4, \ldots, np \right\}$$

(30)

where \(\Theta (.)\) represents the accuracy calculation function; \(\xi (.)\), the sorting function; \(id\), the sorted indexes; \(\omega (.)\), the mode function; \(np\), the number of predicted vectors; and \(vp\), a voted prediction vector, of which four were created from the six predicted vectors generated per channel.

Step 7: Apply IHMV to the six predicted vectors to create four voted prediction vectors.
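A hedged Python sketch of IHMV as described by Eqs. (28) to (30), assuming preds is a list of predicted label vectors and SciPy (1.9+) is available:

```python
import numpy as np
from scipy import stats

def ihmv(preds, y, r_min=3):
    """Eqs. (28)-(30): sort predicted vectors by accuracy, mode-vote the top r."""
    accs = [np.mean(p == y) for p in preds]          # Eq. (28)
    order = np.argsort(accs)[::-1]                   # Eq. (29): sorted indexes
    voted = []
    for r in range(r_min, len(preds) + 1):           # Eq. (30)
        stacked = np.stack([preds[i] for i in order[:r]])
        voted.append(stats.mode(stacked, axis=0, keepdims=False).mode)
    return voted  # np - r_min + 1 voted prediction vectors
```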

Calculation of best channel-wise result

From among the ten prediction vectors per channel (six calculated by the kNN classifier; four voted by IHMV), the greedy algorithm was applied to select, one channel at a time, the best channel-wise result for each of the 59 channels.

Step 8: Apply a greedy algorithm to select the best channel-wise result.

$$acc_g = \Theta \left( p_g, y \right), \quad g \in \left\{ 1, 2, \ldots, 6 \right\}$$

(31)

$$acc_g = \Theta \left( vp_{g - 6}, y \right), \quad g \in \left\{ 7, 8, \ldots, 10 \right\}$$

(32)

$$x = argmax \left( acc \right)$$

(33)

where \(x\) represents the index of the most accurate prediction vector; and \(cp\), the channel-wise prediction vector.

Step 9: Repeat steps 1 to 8 until the best channel-wise results are calculated for all channels.

$$cp_a = \left\{ \begin{array}{ll} p_x, & x \le 6 \\ vp_{x - 6}, & x > 6 \end{array} \right., \quad a \in \left\{ 1, 2, \ldots, nc \right\}$$

(34)

where \(nc\) represents the number of channels, i.e., 59.
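A minimal sketch of the greedy selection in Eqs. (31) to (34), assuming the six kNN predictions and four IHMV-voted predictions for one channel:

```python
import numpy as np

def greedy_best(preds, voted, y):
    """Eqs. (31)-(34): keep the single most accurate of the 6 + 4 candidates."""
    candidates = list(preds) + list(voted)
    accs = [np.mean(c == y) for c in candidates]  # Eqs. (31)-(32)
    x = int(np.argmax(accs))                      # Eq. (33)
    return candidates[x]                          # Eq. (34): cp for this channel
```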

Calculation of the overall best result layer

After calculating the results of all channels, the IHMV and greedy algorithm were again applied to these results to iteratively (loop range 3 to 59) generate the overall best result for the 59-channel EEG record.

Step 10: Apply IHMV to all 59 channel-wise results to generate an additional 57 (= 59 − 3 + 1) voted prediction vectors.

Step 11: Select the most accurate predicted vector among the 116 (= 59 + 57) predicted vectors by deploying the greedy algorithm.
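Assembling this last layer is then a short composition of the ihmv and greedy_best helpers sketched above (both hypothetical illustrations, not the authors' code):

```python
def overall_best(channel_preds, y):
    """Steps 10-11: IHMV over the 59 channel results, then greedy selection."""
    voted = ihmv(channel_preds, y)               # 57 voted vectors (r = 3..59)
    return greedy_best(channel_preds, voted, y)  # best of 59 + 57 = 116 vectors
```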
