Appropriate hand washing is crucial for preventing the transmission of bacteria, viruses and parasites [1, 2] and for reducing the rate of infections [3, 4]. Its importance has been emphasised in particular during the COVID-19 pandemic [5]. There are two important aspects of hand washing: the habit (when to wash hands) and the technique (how to wash hands). Ataee et al. listed ten occasions on which hand washing is necessary [6]. However, hand washing habits still differ between cultures. For example, an international survey of 63 countries in 2015 found that around 77% of people did not wash their hands after using the toilet, and the rate was high even in developed countries, such as 50% in the Netherlands and 43% in Italy [7]. Regarding hand washing quality, the World Health Organization (WHO) proposes a regulation of hand washing techniques with six major steps (steps 2–7), as shown in Figure 1, in which hand washing lasts for 40–60 s [8]. Thus, the proper hand washing techniques must be learned, along with forming a good hand washing habit. To monitor both, sensor technology and machine learning algorithms can be used to identify the different classes of the hand washing techniques proposed by the WHO and to detect hand washing among other activities of daily living (ADLs).
FIGURE 1. WHO recommended hand washing steps [8]
Multiple types of sensors have been investigated previously, such as internet-of-things (IoT) sensors [9, 10], audio sensors [11, 12], and cameras [13-15]. Nevertheless, these sensors have limitations because they need to be set up close to the activity area. People wash their hands in various places at home, such as the bathroom or kitchen. Hence, IoT sensors need to be installed in all of these places, which increases the expense and the complexity of the setup. Additionally, it is advisable to provide individuals with feedback, especially while helping them form the hand washing habit. However, these sensors cannot identify different users unless assisted by other sensors, for example, Beacon or motion sensors [16]. Cameras can recognise people, but they cause privacy concerns, especially when installed in the bathroom. In addition, they cannot detect hand washing steps (techniques).
As an alternative, wearable sensors, including accelerometers and gyroscopes, can be applied. Mondol et al. [17] developed the Harmony system to detect the hand washing quality of food workers. In this system, a smartwatch was attached to the participant's wrist to collect the movement signals, a Bluetooth transmitter was integrated into the dispenser to detect whether the participant used the soap, and a Beacon sensor was used to send alerts when the participant was in an area that required hand hygiene or did not wash hands properly. The accelerometer and gyroscope signals were processed to extract time-domain features, and a decision tree (DT) was used to detect the classes of hand washing. The classifier was trained in the user-independent mode. In this mode, the participants are split into two groups: the data of one group are used for training, and the data of the other group are used for testing. The DT classifier achieved an average accuracy of 85%. In a later study [18], the researchers explored the impact of the sensor position and tried to detect hand washing among other ADLs using a Feed Forward Neural Network (FFN). With F1-scores of 0.72 and 0.74 for the left and right wrist, respectively, there was no significant difference between the sides on which the wearable device was worn. However, because each participant wore the smartwatch on only one side, this result may have been affected by differences in how the participants performed the activity. The WISDM dataset [19], which does not include hand washing, was added to test whether other ADLs are misclassified as hand washing. The authors reduced the false positive rate by 77% compared to the baseline method [20]. However, they did not discuss which ADLs are more likely to be classified as hand washing.
Study [21] also used a single wristwatch that included accelerometer and gyroscope sensors. The data were collected in both a controlled and an uncontrolled environment. A hidden Markov model (HMM) classifier, applied after first determining the order of the hand washing classes, reached an accuracy of 85% in the user-independent mode. In contrast to studies [17, 18], the activities before (rubbing hands) and after (turning off the faucet) hand washing were included. These two classes were detected using an extra binary classifier, with an accuracy ranging from 35% to 54%. Rather than attaching sensors to the wrist, Wang et al. used four armbands that also included electromyography (EMG) sensors [22]. Two armbands were placed on the forearms and two on the upper arms. This study investigated two types of hand hygiene: alcohol-based hand rubbing and hand washing with soap and water. During data acquisition, each participant performed each type 15 times. The activities after hand washing were categorised into three classes: rinsing hands, drying hands with a towel, and turning off the faucet. eXtreme Gradient Boosting (XGBoost) combined with E.Divisive [23] was applied. Furthermore, Wang et al. discussed the impact of the different sensors and of the position at which they were attached. The highest accuracy in the user-independent mode, 91%, was achieved using the signals from all three sensors (accelerometer, gyroscope, and EMG) from both armbands attached to the forearms. The classifier was also trained in the user-dependent mode, in which one part of a participant's data is used for training and the remaining part of the same participant's data is used for testing. The number of sessions used for training the model was examined in the user-dependent mode, which indicated that the classification accuracy could reach 90% once the data of 16 sessions were used.
In study [16], an online monitoring system including a smartwatch was developed, with a Convolutional Neural Network (CNN) combined with a Long Short-Term Memory (LSTM) network as the classifier. The accuracy of each step was over 90% on the real-life dataset.
Previous studies have demonstrated the feasibility of using wearable sensors, especially accelerometers and gyroscopes, to monitor the hand washing techniques suggested by the WHO and to determine the duration of hand washing. However, they did not detect all the steps covered in the hand washing techniques, for example, getting soap. In addition, few studies [18] looked at detecting hand washing among other ADLs. Finally, the impact of the sensors and the wearing side was examined based on the average classification performance rather than for each hand washing step. Therefore, this study proposes a wireless system with wearable devices to monitor hand washing. The contributions of this study are:
- Classifying hand washing steps in the WHO regulation.
- Classifying hand washing activity among other ADLs.
- Differentiating between hand washing activities that do not follow (untrained) and that follow (trained) the WHO regulation.
- Investigating the impact of the combination of sensors on the classification of each hand washing step.
To accomplish this, two wearable sensors, each composed of a three-axis accelerometer and a three-axis gyroscope, were attached, one to each wrist. Features were extracted from both the time and frequency domains, and multiple machine learning algorithms were explored in the leave-one-subject-out (LOSO) mode.
2 METHODOLOGY
2.1 The wearable device
As shown in Figure 2, two Byteflies sensors [24] were used, one on each wrist. Each sensor includes a three-axis accelerometer and a three-axis gyroscope, both running at 100 Hz. The placement direction on each wrist was fixed. The sensors were automatically synchronised. Two radar sensors were also applied to detect the hand washing movement during the experiment, but their received signals are not discussed in this paper.
FIGURE 2. Setting up of the experiment
2.2 Dataset preparation
2.2.1 Hand washing data collection
This study involved ten individuals (seven males, three females, aged 20–23). Participants washed their hands with the WHO regulation poster placed in front of them and under the instruction of a researcher. Each participant performed hand washing three times, each time spending 60–80 s. One camera was placed to the side to record the sessions for ground-truth annotation. Eleven activities (classes a–k) were labelled, as listed in Table 1. The classes were categorised into three groups: before hand washing (steps 0, 1, and 2), hand washing (steps 3–11), after hand washing (steps 0 and 1). Class a covers three hand washing steps: opening the faucet, wetting/rinsing hands, and turning off the faucet. These activities were performed both before and after hand washing and were therefore categorised into one class. Step 9, drying hands, was not included, because in real applications people usually first turn off the faucet and then get the towel to dry their hands.
TABLE 1. Annotated hand washing classes

| Class | Related steps proposed by the WHO (as shown in Figure 1) | Detail |
|---|---|---|
| a | 0, 8, 10 | opening the faucet, wetting/rinsing hands and turning off the faucet |
| b | 1 | getting the soap |
| c | 2 | rubbing hands palm to palm |
| d | 3 | right palm over left dorsum with interlaced fingers |
| e | 3 | left palm over right dorsum with interlaced fingers |
| f | 4 | palm to palm with interlaced fingers |
| g | 5 | backs of fingers to opposing palms with interlaced fingers |
| h | 6 | rubbing left thumb |
| i | 6 | rubbing right thumb |
| j | 7 | clasped fingers of right hand in left palm |
| k | 7 | clasped fingers of left hand in right palm |

2.2.2 ADLs data collection
To test the system's ability to classify hand washing among other ADLs, we conducted another experiment that included eight ADLs: sitting, standing, walking, computer typing, going upstairs, going downstairs, brushing teeth, and untrained hand washing. These activities were included because:
- These activities are basic ADLs and are normally performed every day.
- Brushing teeth is also a bathroom activity, and it also includes high-frequency movements of the wrists.
Eight participants joined the experiment (three females and five males, aged 23–31). The right wrist was the dominant wrist of all participants. Two of the eight participants had also taken part in the collection of the hand washing dataset. As in the hand washing experiment, all participants wore two Byteflies sensors, one on each wrist. In this experiment, participants were asked to perform the activities in their own way. The performed activities and procedures were as follows:
- Sitting (3 min). Participants could talk, use smartphones, or make slight arm movements.
- Computer typing (3 min). Participants were asked to type the given text material.
- Standing (3 min). The requirement was the same as for sitting.
- Walking (180 m). Participants were asked to walk at their normal speed.
- Going upstairs (60 stairs ×2 times). Participants were asked to go at their normal speed.
- Going downstairs (60 stairs ×2 times). Participants were asked to go at their normal speed.
- Hand washing (3 times). The time was recorded from participants opening the faucet to rinsing hands and closing the faucet.
- Brushing teeth. The time was recorded from participants squirting the toothpaste on the toothbrush to rinsing the mouth. All the participants used manual toothbrushes.
One researcher stood by to annotate the activities. The experiments were approved by the KU LEUVEN Social and Societal Ethics Committee (SMEC), with the assigned serial number G-2020-2705. All participants signed the informed consent form before participating in the experiment.
2.3 Data analysis
2.3.1 Data pre-processing
The collected accelerometer and gyroscope signals were segmented based on the annotation of the video recording. Afterwards, the segmented signals were sliced with a fixed-length window. In previous studies, the applied window size ranged from 0.06 s (three data points) [21] to 1 s (50 data points) [17], with an overlap rate between 50% and 70%. In this study, the tested window size was varied between 0.08 s and 3.5 s, with an overlap rate of 0% or 50%. A Support Vector Machine (SVM) classifier, trained in the user-independent mode, was used for this comparison. The resulting optimal window size is reported in Section 3.
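The fixed-length slicing described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `slice_windows`, the rounding of the step size, and the synthetic 10 s segment are assumptions, and the 256-sample window is taken from the value reported in the Discussion.

```python
def slice_windows(signal, window_size, overlap=0.0):
    """Cut one annotated segment into fixed-length windows.

    signal      -- sequence of samples from one axis (hypothetical input)
    window_size -- window length in samples
    overlap     -- fraction of each window shared with the next, in [0, 1)
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    # Distance between the starts of two consecutive windows.
    step = max(1, int(round(window_size * (1.0 - overlap))))
    last_start = len(signal) - window_size
    # Incomplete trailing windows are dropped (an assumption of this sketch).
    return [signal[s:s + window_size] for s in range(0, last_start + 1, step)]


# Stand-in for 10 s of one axis sampled at 100 Hz.
segment = list(range(1000))
no_overlap = slice_windows(segment, window_size=256, overlap=0.0)
half_overlap = slice_windows(segment, window_size=256, overlap=0.5)
print(len(no_overlap), len(half_overlap))
```

In practice the same start indices would be applied to every axis of both sensors so that the windows stay aligned across channels.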
2.3.2 Feature extraction
The features were extracted from each axis and from the magnitude of the three axes. In the time domain, the extracted features were the mean, standard deviation, root mean square (RMS), minimum (min), maximum (max), range (= max − min), interquartile range, skewness, kurtosis, mean-crossing times (= the number of times the signal value passes through the mean value), and energy. In addition, the correlation between every two axes and the zero-crossing times (= the number of times the signal value passes through zero) were calculated for each axis, but not for the magnitude signal. In the frequency domain, the frequency power, the frequency at the highest amplitude, the 25% frequency (F25d), the 75% frequency (F75d), the normalised frequency (1 − F25d/F75d) and the spectral energy were extracted. As a result, 74 features were extracted from each sensor, giving 296 features from all sensors.
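As a rough sketch of the time-domain part of this step, the code below computes the eleven listed time-domain features for one window and checks one accounting that reproduces the stated totals of 74 and 296. The exact formulas are assumptions (population moments for the standard deviation, skewness and kurtosis, and the sum of squares for energy), as are the function names; the paper does not specify them.

```python
import math
import statistics


def time_domain_features(x):
    """Eleven time-domain features of one window (formulas are assumptions)."""
    n = len(x)
    mean = statistics.fmean(x)
    centered = [v - mean for v in x]
    m2 = sum(c ** 2 for c in centered) / n  # population variance
    m3 = sum(c ** 3 for c in centered) / n
    m4 = sum(c ** 4 for c in centered) / n
    q1, _, q3 = statistics.quantiles(x, n=4)
    return {
        "mean": mean,
        "std": math.sqrt(m2),
        "rms": math.sqrt(sum(v * v for v in x) / n),
        "min": min(x),
        "max": max(x),
        "range": max(x) - min(x),
        "iqr": q3 - q1,
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else 0.0,
        # Sign changes of the mean-removed signal.
        "mean_crossings": sum(
            1 for a, b in zip(centered, centered[1:]) if a * b < 0
        ),
        "energy": sum(v * v for v in x),
    }


def magnitude(ax, ay, az):
    """Per-sample Euclidean norm of the three axes."""
    return [math.sqrt(a * a + b * b + c * c) for a, b, c in zip(ax, ay, az)]


def features_per_sensor(n_time=11, n_freq=6):
    """One accounting that matches the reported count of 74."""
    signals = 4          # three axes plus the magnitude
    axis_pairs = 3       # correlations: x-y, x-z, y-z
    zero_crossings = 3   # one per axis, magnitude excluded
    return n_time * signals + axis_pairs + zero_crossings + n_freq * signals


demo = time_domain_features([1.0, -1.0, 1.0, -1.0])
per_sensor = features_per_sensor()
total = per_sensor * 4  # two accelerometers and two gyroscopes
print(demo["mean_crossings"], per_sensor, total)
```

Under this accounting, 11 × 4 + 3 + 3 + 6 × 4 = 74 features per sensor, and four sensors (an accelerometer and a gyroscope on each wrist) give 296.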
2.3.3 Classification model
Four types of machine learning (ML) models were investigated because they had been applied in previous works: SVM [10, 22], K-Nearest Neighbours (KNN) [16], XGBoost [22], and FFN [18]. The SVM model applied the Gaussian kernel and was trained in the one-versus-one mode. Hence, 55 binary classification models were trained (one for each pair of the 11 classes), and the class with the highest score was taken as the prediction. The hyper-parameters of the models were tuned using the grid-search method, which tests all combinations [25]. The searching range and the tuning result are shown in Table 2. While developing the FFN architecture, one hidden layer was added at a time, and the number of neurons in this layer was tuned. We stopped adding layers once the classification performance dropped on the cross-validation set. Moreover, the searching range of the number of neurons of a new hidden layer did not exceed that of the previous one. All the models were trained in the LOSO mode: in each fold, one participant's data were selected for testing, one participant's data for validation, and the remaining eight participants' data for training. In this study, the models were trained in ten folds. The F1-score was applied as the evaluation metric. The class weight balancing method [26, 27] was used to address the imbalanced dataset problem by setting the weight of each class to the inverse of its number of windows. The model with the highest performance was used in the following cases:
- Investigating the optimal combination of the sensors. Various combinations of sensors were tested, ranging from only the accelerometer on one side, through both the accelerometer and gyroscope on one side, to all sensors.
- Detecting the hand washing activity among other ADLs. The hand washing dataset was combined with the ADLs dataset. As mentioned above, the ADLs dataset did not include hand washing activity that followed the WHO regulation.
Hence, in each dataset, the data of half of the participants were used for training and the remaining data for testing. In other words, the data of five participants in the hand washing dataset and four participants in the ADLs dataset were selected as the training dataset. The data of the remaining participants were selected as the testing dataset.
- Differentiating trained and untrained hand washing activities. While combining the hand washing dataset and the ADLs dataset, the hand washing activities in these two datasets were categorised into two classes.

TABLE 2. Hyper-parameter tuning results of the classifiers in the user-independent mode

| Classifier | Hyper-parameter | Searching range | Tuned value |
|---|---|---|---|
| SVM | C - penalty parameter | 2^n, n∈(−5,15) [28] | 2^5 |
| SVM | γ | 2^n, n∈(−15,5) | 2^−7 |
| KNN | K | 1−15 | 3 |
| XGBoost | maximum depth of decision trees | 1−10 | 2 |
| XGBoost | minimum child weight | 1−5 | 4 |
| XGBoost | γ | 2^n, n∈(−5,5) | 2^5 |
| FFN | Number of hidden layers | 1−5 | 3 |
| FFN | Number of neurons | 2^n, n∈(1,8) | 128 - layer 1; 128 - layer 2; 64 - layer 3 |

4 DISCUSSION
4.1 Compared with previous studies
Table 5 summarises the accuracy results of every hand washing step for the proposed method and previous studies. As some previous studies [17, 21] only presented their results in figures, their results are not listed here. Study [22] and study [21] included hand washing steps 8–10: rinsing hands, drying hands, and turning off the faucet. However, we categorised these three steps into one class (class a). Hence, for comparison, the results of steps 8 and 9 of studies [21, 22] are listed together in Table 5. Study [21] and study [20] separated class f into two classes, left and right, so there are two values in that column. Overall, the proposed method is comparable to the other studies, especially for classifying classes d, e, f, and k. However, the results of classes h, i, and j for the proposed method are lower than those of other studies. The reason could be that the proposed method uses fewer sensors than study [22], or that the dataset is smaller.
TABLE 5. Comparison of the accuracy of the proposed method with previous studies
Activity columns, in order: a (rinsing hands; closing the faucet), b, c, d, e, f (left; right), g (left; right), h, i, j, k, drying hands
Proposed method | accelerometer+gyroscope | SVM | 84% 71% 93% 97% 93% 99% 95% 76% 74% 77% 94% -
[22] | accelerometer+gyroscope+EMG | XGBoost+E.Divisive | 100% 100% - 83% 83% 99% 87% 86% 84% 84% 94% 94% 100%
[21] | accelerometer+gyroscope | HMM | 80% 75% 97% 82% 85% 93% 77% 93% 86% 75% 87% 89% 87%
[20] | accelerometer+gyroscope | KNN | - - - 93% 91% 84% 93% 98% 94% 87% 88% 84% 84% -

In our study, the optimal window size is 256 data points with no overlap, which is outside the range used in other studies (3–50 data points with 50–75% overlap). This suggests that the optimal window size depends on the types and number of activity classes. For instance, in study [22], the average duration of turning off the faucet was around 0.5 s, so it was reasonable to select a 0.2 s window with 75% overlap, which can increase the detection precision in real-life applications. In our study, however, turning off the faucet is combined with turning on the faucet and rinsing hands into one class. As for the overlap rate, although a higher overlap rate enlarges the dataset, it can also bias the result because very similar windows are tested.
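To make the overlap trade-off concrete, the sketch below counts the windows obtained from a hypothetical 60 s recording at 100 Hz with a 256-sample window, and the number of samples that each window shares with its neighbour. The recording length and the helper name `window_stats` are assumptions chosen for illustration only.

```python
def window_stats(n_samples, window_size, overlap):
    """Return (number of full windows, samples shared by adjacent windows)."""
    step = max(1, round(window_size * (1.0 - overlap)))
    if n_samples < window_size:
        return 0, 0
    n_windows = (n_samples - window_size) // step + 1
    shared = window_size - step
    return n_windows, shared


RECORDING_SAMPLES = 60 * 100  # hypothetical 60 s recording at 100 Hz
stats = {
    overlap: window_stats(RECORDING_SAMPLES, 256, overlap)
    for overlap in (0.0, 0.5, 0.75)
}
for overlap, (n_windows, shared) in stats.items():
    print(f"overlap={overlap:.2f}: {n_windows} windows, "
          f"{shared} of 256 samples shared with the next window")
```

With these assumed numbers, moving from 0% to 75% overlap raises the window count from 23 to 90, but each window then shares 192 of its 256 samples with the next one, which illustrates why near-duplicate windows can inflate the apparent performance.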
Compared to previous studies, our study has the following advantages:
- We include getting the soap as one class, even though its accuracy is 71%. Since getting soap was detected by the Bluetooth sensors integrated into the dispenser in the previous study [17], our result from wearable sensors suggests that the IoT sensors in the dispenser could be removed, which would simplify the installation and reduce the cost. In addition, this study categorises opening the faucet, rinsing hands, and turning off the faucet into one class, because each of these actions is short and they are typically performed sequentially.
- We demonstrate the feasibility of recognising WHO recommended hand washing activity among other ADLs using two wearable devices. Hence, the next step is to detect occasions on which hand hygiene is necessary, which can help people form good hand washing habits by sending alarm signals when they forget to wash their hands, such as after going to the toilet.
- We analyse the difference between trained and untrained hand washing activity. The result of this analysis can be used to develop a training system that helps people learn how to wash their hands properly.
- We do not consider the order of the hand washing steps, and therefore used the SVM model. Moreover, the classification results of six classes are higher than those of study [21], which used an HMM model. This suggests that the order of hand washing steps is not essential to the classification.
4.2 Limitations and future work
Although SVM achieved an average F1-score of 0.8501 in classifying hand washing steps, the results for getting the soap (class b) and rubbing thumbs (classes h and i) are much lower than the average, at less than 0.7. Thus, future work will focus on increasing the classification performance of these steps.
A possible solution is to include other sensors, for instance, pressure or motion sensors for detecting getting the soap and EMG sensors for detecting the movement of the fingers.
In this study, we only considered one method of brushing teeth, namely using a manual toothbrush. Future work can also include samples of using an electric toothbrush.
In this study, sensors attached to both wrists achieved the highest results both for classifying hand washing techniques and for detecting hand washing activity among other ADLs. However, we mainly focused on participants whose right wrist was the dominant side. Future work will also take participants with a dominant left wrist into consideration.
Byteflies sensors were used in our study, but they may not be convenient for users because the sensors were attached to the body via patches, which were not reusable. In the future, this sensor can be replaced with other similar devices, such as a smartwatch that also includes both accelerometer and gyroscope sensors.
5 CONCLUSION
This study demonstrates the feasibility of a system using two Byteflies sensors and an SVM classifier to classify 11 hand washing steps. The result suggests that the optimal combination of devices is two devices, each integrating an accelerometer and a gyroscope, with one device on each wrist. The hand washing activity can also be distinguished from other ADLs and from untrained hand washing activity. The system will be improved to serve as a training system for individuals who need to form a good hand washing habit.
CONFLICT OF INTEREST
The authors declare no conflict of interest.
ETHICAL APPROVAL
The experiments were approved by the KU LEUVEN Social and Societal Ethics Committee (SMEC), with the assigned serial number G-2020-2705. All participants signed the informed consent form before participating in the experiment.