Risk analysis of noise-induced hearing loss of workers in the automobile manufacturing industries based on back-propagation neural network model: a cross-sectional study in Han Chinese population

Introduction

Noise-induced hearing loss (NIHL) is a progressive sensorineural hearing impairment caused by long-term exposure to noise in the workplace.1 Noise has been recognised as a non-specific source of stress that is widespread in the human work environment, and it causes acute and persistent harm to millions of people around the world every day.2 In 2019, WHO estimated that 466 million people worldwide were suffering from hearing loss, and that by 2050 the number will exceed 900 million.3 In the USA, after 2011, about 22 million workers were exposed to potential noise hazards each year and 4 million people worked in hazardous noise environments every day, making noise one of the most common occupational hazards in the country.4 In Europe, 30% of the workforce is exposed to serious noise hazards. The prevalence of NIHL continues to grow, especially in the military.5 The industries with the highest risk of NIHL are, in order, construction, manufacturing and mining, among others.6 Among workers exposed to noise, hearing loss is more common in men than in women, and the risk of hearing loss increases with age.7

Sensorineural hearing loss is related to many factors, including noise exposure level,8 noise exposure time,8 noise properties,9 interaction with other factors (high temperature, organic solvents, etc.),10 11 individual health-related behaviours (personal protective measures, smoking, drinking, etc.),12 individual sensitivity13 and individual health status (hypertension, hyperlipidaemia, etc.).14 A growing body of research uses traditional statistical methods to assess the risk of NIHL in workers from their years of noise exposure, intensity of noise exposure and other environmental hazards, but a more precise prediction method is needed for the complex relationship between NIHL and its many variables. Artificial neural networks (ANNs) are among the best-known systems that mimic brain activity. An ANN simulates, in mathematical form, the structure and behaviour of the human brain, based on an understanding of the structure and working mode of its neurons. ANNs have strong learning and memory capabilities and can integrate and analyse large amounts of data to solve practical problems. They have been successfully applied in many fields, notably epidemiology. Prediction with ANNs is characterised by parallel processing of information and by fault tolerance: some missing or erroneous information in the input data does not impair the function of the entire network. This is very useful in disease risk prediction and early warning studies. In addition, the most important feature of ANNs is the ability to learn and to determine complex relationships between different variables.15 Compared with traditional statistical methods, an ANN can analyse non-linear interactions between health risk factors more clearly.
It can also replace these statistical methods by learning from a group of similar subjects to predict specific medical outcomes.16 The back-propagation neural network (BPNN), a multilayer feed-forward network trained by the error back-propagation algorithm, is the most typical ANN model. It is frequently used to predict complex outcomes in hearing and other occupational exposures.17 18 In this study, we used a BPNN to provide an empirical model for predicting the occurrence of hearing loss in workers exposed to noise.

Methods

Study subjects

From March to August 2019, we conducted a cross-sectional study of workers exposed to noise in automobile manufacturing factories in Guangzhou, China. The study was approved by the Ethics Committee of the 12th People’s Hospital of Guangzhou, and all subjects provided informed consent. Using a random number table, 3600 subjects were selected from 23 599 workers at three automobile manufacturers in Guangzhou. After screening against the inclusion and exclusion criteria, 3266 subjects remained.

The inclusion criteria for the subjects were as follows:

(1) cumulative occupational noise exposure of >1 year (noise exposure time ≥8 hours/day or 40 hours/week, noise intensity ≥80 dB (A)); (2) male sex and Han ethnicity; and (3) age 18–45 years.

The exclusion criteria were as follows:

(1) exposure to explosives or head injury within 1 month prior to physical examination; (2) family history of hearing loss; (3) otitis or other otological diseases; (4) fever or common infections (influenza, diarrhoea, hepatitis, etc); (5) history of taking ototoxic drugs; and (6) bone conduction audiometry suggestive of conductive deafness.

Data were excluded for the following reasons: missing data (n=32), insufficient cumulative occupational noise exposure time (n=16), female sex (n=12), non-Han ethnicity (n=5), hearing loss characteristics inconsistent with NIHL (speech-frequency hearing loss greater than high-frequency hearing loss, or conductive deafness; n=22) and employment in non-noise-exposed departments (administration, finance and sales; n=247). In total, 3266 workers exposed to noise were included after screening. The required sample size was calculated from the prevalence of NIHL (28.82%) in a previous study19 and then increased by 50%. The sample size formula for a cross-sectional survey is:

$$n = \frac{Z_{1-\alpha/2}^{2}\,P\,(1-P)}{d^{2}}$$

where P is the expected prevalence, Z1-α/2 is the percentile of the standard normal distribution corresponding to an area of 1-α/2 (1.96 when α=0.05), and d is the allowable error (precision), which can be set to 10% of the expected prevalence. The minimum number of participants required for this study was 1482.
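The sample size arithmetic can be checked with a short sketch. It assumes the standard cross-sectional formula n = Z² P (1 − P) / d² with d equal to 10% of P; the function name and its parameters are illustrative and do not come from the study. Under these assumptions, the reported minimum of 1482 is reproduced when Z is rounded to 2, whereas Z = 1.96 gives a somewhat smaller figure; the text does not state which value was used.

```python
import math


def cross_sectional_sample_size(p, relative_error=0.10, z=1.96):
    """Return n = z^2 * p * (1 - p) / d^2, where d = relative_error * p.

    Assumes the standard formula for estimating a prevalence in a
    cross-sectional survey; all names here are illustrative.
    """
    d = relative_error * p
    return z ** 2 * p * (1 - p) / d ** 2


p = 0.2882  # expected NIHL prevalence taken from the previous study

n_z196 = cross_sectional_sample_size(p, z=1.96)
n_z2 = cross_sectional_sample_size(p, z=2.0)  # Z rounded to 2 (an assumption)

# Inflate each estimate by 50%, as described in the text.
print(math.ceil(n_z196), math.ceil(n_z196 * 1.5))
print(math.ceil(n_z2), math.ceil(n_z2 * 1.5))
```

Either way, the 3266 workers finally included exceed the required minimum by a wide margin.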

Patient and public involvement

No patients were involved.

Physical examination and epidemiological investigation

Each participant underwent a physical examination performed by occupational health examiners in accordance with a standard protocol. Height, weight, blood pressure, blood lipids and pure tone audiometry (PTA) were measured, and participants were asked about exposure to other occupational hazards. According to the Chinese Diagnostic Criteria of Occupational NIHL (GBZ49-2014), hearing thresholds of both ears were determined with the ascending method in 5 dB steps at frequencies of 500, 1000, 2000, 3000, 4000 and 6000 Hz. Thresholds were adjusted for age and gender according to GB/T7582-2004. The high-frequency hearing threshold by PTA was defined as the average at 3000, 4000 and 6000 Hz for each ear. In this study, workers exposed to noise whose binaural high-frequency (3000, 4000 and 6000 Hz) hearing threshold was greater than 25 dB (A) were defined as having NIHL, and workers with hearing thresholds ≤25 dB (A) at every frequency in both ears were defined as having normal hearing.
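The high-frequency rule can be illustrated with a minimal sketch. It assumes that the binaural threshold is the mean of the two per-ear averages (the text does not say how the ears are combined) and that the thresholds passed in have already been adjusted for age and gender. The function names and the audiogram values are hypothetical.

```python
HIGH_FREQUENCIES_HZ = (3000, 4000, 6000)


def high_frequency_average(ear_thresholds):
    """Mean hearing threshold over 3000, 4000 and 6000 Hz for one ear."""
    total = sum(ear_thresholds[f] for f in HIGH_FREQUENCIES_HZ)
    return total / len(HIGH_FREQUENCIES_HZ)


def has_high_frequency_loss(left, right, cutoff_db=25.0):
    """True when the binaural high-frequency average exceeds the cutoff.

    Assumption: 'binaural' means the mean of the two per-ear averages.
    """
    binaural = (high_frequency_average(left) + high_frequency_average(right)) / 2
    return binaural > cutoff_db


# Hypothetical adjusted thresholds (dB) keyed by frequency (Hz).
left_ear = {500: 10, 1000: 10, 2000: 15, 3000: 30, 4000: 40, 6000: 35}
right_ear = {500: 10, 1000: 15, 2000: 15, 3000: 25, 4000: 35, 6000: 30}

print(has_high_frequency_loss(left_ear, right_ear))
```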

A noise statistical analyzer (AWA5610P; Westernisation Instrument Technology Co., Beijing, China) was used to evaluate the intensity of noise in the working environment. Noise exposure was evaluated with A-weighted energy equivalent continuous sound pressure level (Lex.8 h) according to the National Criteria of Measurement of Noise in the Workplace (GBZ/T189.8-2007) (China, 2007). Cumulative noise exposure (CNE) was calculated as

$$\mathrm{CNE} = L_{\mathrm{ex},8\mathrm{h}} + 10\log_{10} T$$

where T is the duration of noise exposure in years.
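As a worked illustration, the sketch below assumes the commonly used energy-based form CNE = Lex,8h + 10 log10(T), with Lex,8h in dB (A) and T in years. The function name and the example values are hypothetical.

```python
import math


def cumulative_noise_exposure(lex_8h_db, years):
    """CNE in dB(A)-year, assuming CNE = Lex,8h + 10 * log10(T)."""
    if years <= 0:
        raise ValueError("years of exposure must be positive")
    return lex_8h_db + 10 * math.log10(years)


# A hypothetical worker exposed to 85 dB(A) for 10 years.
print(cumulative_noise_exposure(85.0, 10))
# After exactly 1 year the CNE equals the exposure level itself.
print(cumulative_noise_exposure(85.0, 1))
```

Because the time term is logarithmic, each tenfold increase in exposure duration adds 10 dB to the CNE under this form.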

The questionnaire covered general information, occupational history, personal history, medical history and individual symptoms. General information included the medical examination number, gender, ethnicity, date of birth, education level, marital status and personal monthly income. Occupational history included years of noise exposure and use of hearing protection equipment. Personal history included smoking frequency, years of smoking, drinking frequency, years of drinking, listening to music or watching videos with headphones, daily phone call time and noise exposure after work. Medical history was used primarily to rule out other factors that may affect hearing function, including head trauma, exposure to explosive operations, ear disease, long-term use of ototoxic drugs and infectious diseases. The survey was conducted by professionally trained investigators, who interviewed each subject face to face using the questionnaire.

Structure of neural network model

First, the subjects were randomly divided into training and test sets at a ratio of 85:15. The data of 2776 noise-exposed workers were randomly selected as the training set, on which the BPNN prediction model was built; the remaining workers formed the test set used to evaluate the performance of the model. Second, we categorised continuous variables to facilitate factor identification and used univariate binomial logistic regression to identify significant predictors of NIHL. Third, we used the factors with p values less than 0.1 as predictors of NIHL to build the BPNN model.
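The 85:15 split can be sketched as follows. This is an illustration only: the study reports using R, and the function name, the seed and the use of integer identifiers for workers are assumptions.

```python
import random


def split_train_test(ids, train_fraction=0.85, seed=2019):
    """Shuffle the identifiers and cut them into training and test sets."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# 3266 included workers, represented here by hypothetical integer IDs.
train_ids, test_ids = split_train_test(range(3266))
print(len(train_ids), len(test_ids))
```

With 3266 subjects, this rounding yields the 2776 training subjects stated in the text and leaves 490 for testing.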

A BPNN consists of three layers: the input layer for receiving information, the hidden layer for processing information and the output layer for calculating results. Our BPNN took the predictive factors as input variables and NIHL as the output variable. The numbers of neurons in the input and output layers corresponded to the numbers of important predictors and output variables, respectively. The number of neurons in the hidden layer was chosen by trial and error during training so as to improve accuracy. The model uses a unipolar sigmoid activation function:

$$f(x) = \frac{1}{1 + e^{-x}}$$

The sigmoid function is the non-linear activation function of the neurons, where e is the base of the natural logarithm. Its output values all lie between 0 and 1: as x tends to positive infinity, f(x) tends to 1, and as x tends to negative infinity, f(x) tends to 0.
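The limiting behaviour described above can be checked numerically. The sketch assumes the standard unipolar (logistic) sigmoid f(x) = 1 / (1 + e^(−x)); the branch on the sign of x is only a numerical safeguard against overflow and does not change the function. The derivative helper is included because the identity f'(x) = f(x)(1 − f(x)) is what back-propagation relies on.

```python
import math


def sigmoid(x):
    """Unipolar sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def sigmoid_derivative(x):
    """f'(x) = f(x) * (1 - f(x)), used when propagating errors backwards."""
    fx = sigmoid(x)
    return fx * (1.0 - fx)


for x in (-30, -2, 0, 2, 30):
    print(x, sigmoid(x), sigmoid_derivative(x))
```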

The purpose of the sigmoid function is to introduce non-linearity into the model. Without an activation function, a neural network reduces to a linear mapping no matter how many layers it has. The network weights were adjusted by the gradient descent method, with x as an input variable. All data were normalised to a range of 0–1. For binary variables, 0 means ‘no’ and 1 means ‘yes’.
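A minimal sketch of these ideas follows: min-max normalisation to 0–1, a single hidden layer of sigmoid units, and full-batch gradient descent on a squared-error loss. It is written in plain Python for illustration and is not the authors' R implementation. The class name, the toy data, the hidden layer size, the learning rate and the number of steps are all assumptions chosen so that the example runs quickly.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def min_max_normalise(column):
    """Rescale a list of numbers to the range 0-1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


class TinyBPNN:
    """One hidden layer, sigmoid units, full-batch gradient descent."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        out = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, out

    def loss(self, xs, ys):
        """Mean squared error divided by 2."""
        return sum((self.forward(x)[1] - y) ** 2
                   for x, y in zip(xs, ys)) / (2 * len(xs))

    def train_step(self, xs, ys, lr):
        n = len(xs)
        g_w1 = [[0.0] * len(row) for row in self.w1]
        g_b1 = [0.0] * len(self.b1)
        g_w2 = [0.0] * len(self.w2)
        g_b2 = 0.0
        for x, y in zip(xs, ys):
            hidden, out = self.forward(x)
            # Error signal at the output unit (chain rule through the sigmoid).
            delta_out = (out - y) * out * (1 - out)
            g_b2 += delta_out
            for j, h in enumerate(hidden):
                # Error signal propagated back to hidden unit j.
                delta_h = delta_out * self.w2[j] * h * (1 - h)
                g_w2[j] += delta_out * h
                g_b1[j] += delta_h
                for i, xi in enumerate(x):
                    g_w1[j][i] += delta_h * xi
        # Move every weight a small step against its averaged gradient.
        for j in range(len(self.w2)):
            self.w2[j] -= lr * g_w2[j] / n
            self.b1[j] -= lr * g_b1[j] / n
            for i in range(len(self.w1[j])):
                self.w1[j][i] -= lr * g_w1[j][i] / n
        self.b2 -= lr * g_b2 / n


# Hypothetical toy data: CNE values, a yes/no protection flag, NIHL labels.
cne = [82, 85, 88, 90, 93, 95, 98, 100]
protection = [1, 0, 1, 0, 1, 0, 1, 0]
labels = [0, 0, 0, 0, 0, 1, 1, 1]
inputs = list(zip(min_max_normalise(cne), protection))

net = TinyBPNN(n_in=2, n_hidden=3)
initial_loss = net.loss(inputs, labels)
for _ in range(2000):
    net.train_step(inputs, labels, lr=0.5)
final_loss = net.loss(inputs, labels)
print(initial_loss, final_loss)
```

The continuous CNE column is rescaled before training, while the binary flag already lies in the 0–1 range and is passed through unchanged.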

Logistic regression modelling

The logistic regression model is a generalised linear model that can be used to predict the occurrence of NIHL. The logistic regression model formula is:

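$$P(Y=1 \mid x_1,\dots,x_p) = \frac{\exp(\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p)}{1 + \exp(\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p)}$$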

where β0 is the constant term and β1, β2, …, βp are the regression coefficients of the logistic model. The binary response variable is denoted by Y. As with the BPNN model, the training set was used to construct the logistic regression model and the test set to evaluate its performance, with the predictive factors as initial input variables. Stepwise regression decides whether to eliminate each explanatory variable according to its influence on the response variable, here judged by the Akaike information criterion (AIC). Starting from the full model, the step function in R with the AIC eliminated 10 unimportant variables and retained 16.

The least absolute shrinkage and selection operator (LASSO) is a variable selection method proposed by Tibshirani.20 Compared with traditional regression methods, LASSO regression can handle a larger number of potential predictors and pick out the variables most strongly associated with disease. For this reason, LASSO has been used by many researchers to screen disease risk factors and build prediction models.21 In essence, the lasso-logistic model adds an L1 penalty to the loss function (the residual sum of squares in linear regression, the negative log-likelihood in logistic regression) so that some coefficients shrink to exactly zero. The value of the tuning parameter λ directly determines which variables the lasso-logistic regression selects. In this study, the penalty parameter λ (lambda) was selected by generalised cross-validation. Starting from the full-variable lasso-logistic model, 5 unimportant variables were eliminated and 21 were retained.
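How the penalty parameter λ drives variable selection can be seen in a small sketch of L1-penalised logistic regression. It uses proximal gradient descent (a gradient step on the negative log-likelihood followed by soft-thresholding), which is a swapped-in technique chosen for brevity and not the algorithm used in the study. The function names, the toy data and the step settings are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def soft_threshold(value, threshold):
    """Shrink value towards zero; values within the threshold become 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso_logistic(xs, ys, lam, lr=0.1, n_iter=2000):
    """Minimise mean negative log-likelihood + lam * sum(|beta_j|).

    The intercept beta0 is left unpenalised.
    """
    n, p = len(xs), len(xs[0])
    beta0, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        errors = [sigmoid(beta0 + sum(b * v for b, v in zip(beta, x))) - y
                  for x, y in zip(xs, ys)]
        beta0 -= lr * sum(errors) / n
        for j in range(p):
            grad_j = sum(e * x[j] for e, x in zip(errors, xs)) / n
            beta[j] = soft_threshold(beta[j] - lr * grad_j, lr * lam)
    return beta0, beta


# Hypothetical toy data with features already scaled to 0-1:
# the first feature tracks the outcome, the second does not.
xs = [(0.0, 0.5), (0.1, 0.4), (0.2, 0.6), (0.3, 0.5),
      (0.7, 0.5), (0.8, 0.6), (0.9, 0.4), (1.0, 0.5)]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

_, weak_penalty = fit_lasso_logistic(xs, ys, lam=0.01)
_, strong_penalty = fit_lasso_logistic(xs, ys, lam=1.0)
print(weak_penalty)
print(strong_penalty)
```

With a small λ the informative coefficient stays in the model, while a large enough λ removes every predictor, which is why λ is usually chosen by cross-validation.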

Statistical analysis

The data were analysed using R V.3.5.3 with the ‘glmnet’, ‘nnet’ and ‘ROCR’ packages. The significance level was set at p<0.05. Categorical variables were expressed as frequency (%) and analysed using the χ2 test of independence. Quantitative variables were expressed as mean±SD and analysed using independent-samples t tests. Predictors were screened using binomial logistic regression. We also used lasso regression with cross-validation to select the most valuable variables and construct the lasso-logistic model. The probability cut-off was the value that maximised the Youden index. Finally, we evaluated model performance by calculating sensitivity (true positive rate, TPR), specificity (true negative rate, TNR), accuracy (ACC), positive predictive value (PPV), negative predictive value (NPV) and the area under the receiver operating characteristic (ROC) curve (AUC). The χ2 test of independence was used to evaluate the performance difference between the BPNN model and the logistic regression models. The ‘ROCR’ package was used to analyse the ROC curves of the models.
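The cut-off selection and the AUC can be sketched in plain Python. The Youden index is J = sensitivity + specificity − 1, and the AUC is computed here as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one (ties count as one half). The function names and the toy scores are hypothetical, and this is not the ‘ROCR’ implementation.

```python
def confusion_counts(y_true, scores, cutoff):
    """Count TP, FP, TN, FN when scores >= cutoff are called positive."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        predicted = 1 if s >= cutoff else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def youden_cutoff(y_true, scores):
    """Return the observed score that maximises the Youden index."""
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        tp, fp, tn, fn = confusion_counts(y_true, scores, cutoff)
        j = tp / (tp + fn) + tn / (tn + fp) - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


def auc(y_true, scores):
    """Rank-based AUC over all positive/negative pairs."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities for eight test subjects.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.10, 0.20, 0.30, 0.40, 0.45, 0.60, 0.70, 0.90]

cutoff, j = youden_cutoff(y_true, scores)
print(cutoff, j, auc(y_true, scores))
```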

Results

Basic characteristics of subjects

A total of 3600 workers exposed to noise were surveyed, and 3568 valid questionnaires were collected (an effective rate of 99%). After screening against the inclusion and exclusion criteria, 3266 workers were included, and we compared the hearing loss detection rate among them. NIHL rates differed significantly by age, marital status, personal monthly income, years of noise exposure, use of hearing protection, smoking frequency and years, drinking frequency and years, listening to music or watching videos with headphones, daily call time, auditory system symptoms, exposure to organic solvents, exposure to high temperature, exposure to welding fumes, body mass index (BMI), total cholesterol and triglyceride abnormalities, and CNE (p<0.05) (see table 1).

Table 1

Comparison of hearing loss rates of workers with different individual characteristics

Related NIHL risk

The 26 variables listed in table 1 were analysed by univariate logistic regression, and 20 predictors were selected as risk factors for NIHL. The following 19 factors were associated with an increased risk of NIHL: age, marital status, personal monthly income, noise exposure time, frequency of smoking, smoking years, frequency of drinking, drinking years, listening to music or watching videos with headphones, call time per day, presence or absence of auditory system symptoms, exposure to organic solvents, high temperature exposure, welding fumes, high blood pressure, BMI, total cholesterol, triglycerides and CNE. The occurrence of NIHL was inversely related to the use of noise protection products (see online supplemental table S1 and figure S1).

BPNN model

The BPNN model was built on the significant predictors of NIHL. The input variables were the 20 important predictors listed in table 2, and the output variable was the binary NIHL status of the individual. We first compared the characteristics of the training and test data sets (table 2). The two data sets did not differ significantly for any of the 20 variables (p>0.05), which means that they were balanced in the distribution of clinical characteristics. The BPNN model includes an input layer, a hidden layer and an output layer. The input and output layers include 20 and 1 neurons, respectively, corresponding to the numbers of predictors and output variables. In training the BPNN model, we set the number of hidden-layer neurons to 21 (specified as a vector giving the number of neurons in each hidden layer). The maximum number of training steps (step max, at which training stops) was set to 10 000, and the learning rate used by traditional back-propagation was set to 0.01. A three-layer BPNN with 20, 21 and 1 neurons in the input, hidden and output layers was selected as the best predictive model (see online supplemental figure S2). In the trained BPNN model, CNE, auditory system symptoms, age, listening to music or watching videos with headphones, exposure to high temperature and noise exposure time were relatively important factors for NIHL (see online supplemental figure S3). When applied to the test set, the BPNN model had a sensitivity of 83.33%, a specificity of 85.92%, an accuracy of 85.51%, a PPV of 52.85%, an NPV of 96.46% and an AUC of 0.926 (table 3).
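The five test-set indices all follow from a single confusion matrix, which the sketch below makes explicit. The counts used here are not reported in this section: they are back-calculated under the assumption of a test set of 490 workers (3266 minus the 2776 used for training) and are shown only because they reproduce the reported percentages. The function name is illustrative.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic indices from the four confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical counts, back-calculated and not taken from the paper's tables.
metrics = diagnostic_metrics(tp=65, fp=58, tn=354, fn=13)
for name, value in metrics.items():
    print(f"{name}: {value * 100:.2f}%")
```

The modest PPV alongside a high NPV is what one would expect when positives make up a minority of the test set, as they do under these assumed counts.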

Table 2

Input variables and their descriptive statistics

Table 3

Comparison of the BPNN model and univariate-logistic regression model for predicting NIHL in test set

Comparison between BPNN model and logistic regression model

The evaluation indices of the BPNN model and the univariate-logistic regression model are compared in table 3. For the comparison with stepwise-logistic regression, the input variables of the BPNN model were the 16 important factors screened by stepwise-logistic regression (online supplemental table S2), and the output variable was the binary NIHL status of the individual. We again compared the features of the training and test data sets and found no significant difference between them for any of the 16 variables (p>0.05), so the two data sets were also balanced in the distribution of clinical features. The evaluation indices of the BPNN model and the stepwise-logistic regression model are compared in table 4. For the comparison with lasso-logistic regression, the input variables of the BPNN model were the 21 important factors screened by lasso-logistic regression (online supplemental table S3). Similarly, none of the 21 variables differed significantly between the two data sets (p>0.05). The evaluation indices of the BPNN model and the lasso-logistic regression model are compared in table 5. There were significant differences in sensitivity, specificity, accuracy, PPV, NPV and AUC (p<0.05). The ROC curves for the three models are shown in online supplemental figures S4–S6.

Table 4

Comparison of the BPNN model and stepwise-logistic regression model for predicting NIHL in test set

Table 5

Comparison of the BPNN model and lasso-logistic regression model for predicting NIHL in test set

Discussion

NIHL is ubiquitous all over the world. In 2010, China’s automobile manufacturing output became the largest in the world.22 Hearing loss due to exposure to noise from mechanical systems and work processes has been reported as one of the most common impairments in the automotive industry.23 In China, workers in the automotive industry usually work at least 8 hours a day and rest only 1 day per week. The high-intensity noise generated during die casting, stamping and welding can greatly increase the risk of hearing problems among workers. However, most workers do not have standardised noise protection products, and production processes and equipment do not effectively control and eliminate noise. Poor awareness of noise hazards among workers, inadequate policy implementation and inadequate management further increase workers’ risk of NIHL. Previous investigations reported rates of about 11.6%–38.9% among workers exposed to noise in the automobile manufacturing industry in China.24 25 These studies show that workers in the automobile manufacturing industry are at high risk of developing NIHL under hazardous noise levels. Long-term exposure to noise of sufficient intensity adversely affects not only the auditory system but also non-auditory systems. The risk of developing NIHL depends mainly on the combination of the duration of noise exposure and the sound level.26 We calculated CNE according to the current international noise exposure standard (ISO 1999:2013), which relies on energy indicators.27 It assumes that the effect of noise exposure on hearing is proportional to the duration of exposure multiplied by the energy intensity of the exposure.
In this study, we also investigated important factors related to the occurrence of NIHL, including age,28 marital status,29 personal monthly income,30 noise exposure time,31 wearing noise protection products,32 smoking,33 drinking,34 listening to music or watching videos with headphones,35 call time per day,36 presence or absence of auditory system symptoms,37 exposure to organic solvents,38 high temperature exposure,39 welding fume exposure,40 hypertension,41 BMI,42 total cholesterol43 and triglycerides.44 We incorporated more variables to improve the predictive power of the model. NIHL should be identified in the initial stages of hearing loss, which helps to identify patients with fewer working years. The statistical prediction method most commonly used in previous studies is the logistic regression model, which relates the probability of the outcome to a range of potential input factors. The logistic regression model requires data preprocessing at an early stage, and it has limitations in studying complex non-linear relationships between independent and dependent variables. By comparison, data preprocessing for the BPNN model was relatively simple and saved time and labour.

The results above show that the BPNN model is better at predicting NIHL. It outperformed the logistic regression model in terms of sensitivity (TPR), specificity (TNR), accuracy (ACC), positive predictive value (PPV) and negative predictive value (NPV), and its area under the ROC curve was also higher. We also optimised the logistic regression on the basis of lasso theory to establish a lasso-logistic model with higher prediction accuracy and faster running speed; the comparison again showed that the BPNN model predicted hearing loss more accurately. In the BPNN model, predictive factors can be transformed non-linearly at the hidden layer nodes and at the output nodes to construct complex non-linear relationships. According to the literature, the BPNN model can provide a closer fit when there are complex non-linear relationships among the input variables.45 This study found that the relatively important factors leading to NIHL include CNE, auditory symptoms, age, exposure to high temperature, listening to music or watching videos with headphones, and noise exposure time. Our results are consistent with those of Kim et al.8 46 47 The risk of NIHL increases with CNE and age, but there are many approaches to reducing the prevalence of hearing loss, such as comprehensive management that encourages the use of noise protection products and reduces the duration of headphone use. Companies should improve the production process, control and eliminate noise sources as much as possible, and increase the mechanisation and automation of production so that workers are kept away from heat.

We presented a BPNN model, a univariate-logistic regression model, a stepwise-logistic regression model and a lasso-logistic regression model to assess the risk of NIHL. The limitations of our study were the lack of longitudinal data and the restriction to participants from a single industry when verifying the model. Our model is based on the characteristics of historical data, and the previous research was mainly limited by manpower and resources. Our research team has established an occupational population health cohort, through which longitudinal data and participants from different industries will be added to further verify the performance of the model in the population. In follow-up research, other machine learning models can be added for a comprehensive comparative analysis of several models. We will include genetic factors among the predictors and add cohort studies to obtain more valuable information as input variables for our neural network model. We can also optimise the learning algorithm to improve the predictive ability of the model.
