The health data for this study come from the China Health and Retirement Longitudinal Study (CHARLS), a survey organized by the National Development Research Institute of Peking University (NDRI). CHARLS targets the middle-aged and elderly population aged 45 years and above in China and draws its sample by four-stage random sampling at the county, village, household, and individual levels. The survey covers more than 17,000 people in 150 county-level units and 450 village-level units across 28 provinces. The data contain multi-dimensional information on respondents, including personal health status, family situation, income and expenditure, and work or retirement, and are an important source for scientific research on aging. Informed consent was obtained from all respondents, and each round of CHARLS was approved by the Biomedical Ethics Review Board of Peking University (No. IRB00001052-11015). Considering data availability, consistency, and completeness, this study uses four waves of follow-up data from 2013, 2015, 2018, and 2020, and excludes respondents under the age of 45 and those with missing values in key variables. The prefecture-level city data are drawn from the China Statistical Yearbook, China City Statistical Yearbook, National Economic and Social Development Statistical Bulletin, China Environmental Statistical Yearbook, China Health Statistical Yearbook, and local government work reports. A few missing values in the key variables are filled in by linear interpolation or mean imputation. After matching the micro data with the city-level data, we obtain a balanced panel covering 111 prefecture-level cities and 22,736 valid observations.
The Hu Huanyong line (Hu line), also known as the “Aihui-Tengchong” line (Aihui is now named Heihe), is the boundary of China’s population distribution proposed by the Chinese geographer Hu Huanyong in the 1930s [40]. The areas on the two sides of the Hu line differ greatly in ecological carrying capacity, natural environment, economic development, and human and cultural conditions. Therefore, in this paper, the Hu line is used as the dividing line between the eastern and western regions of China, and a dummy variable is constructed on that basis. The area east of the Hu line is assigned a value of 1 and the area west of the Hu line a value of 0, in order to study the regional heterogeneity of the health effects of environmental pollution.
Variable selection
Dependent variable: health (Health)
The existing literature mainly uses self-rated health, mental health, disease incidence, mortality, and other indicators to characterize health status. Considering the availability and consistency of data, we follow the existing literature [41, 42] and measure the health status of middle-aged and elderly people with the Activity of Daily Living Scale (ADL). The ADL score is the sum of the scores for six basic activities: bathing, dressing, eating, toileting, getting in and out of bed, and controlling urination and defecation. The responses “no difficulty”, “difficulty but still able to do it”, “difficulty and need help”, and “unable to do it” are assigned scores of 3, 2, 1, and 0, respectively. ADL scores therefore range from 0 to 18, with higher scores representing better health.
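The scoring rule for the ADL scale can be sketched in a few lines of Python. This is a minimal illustration rather than the processing code used in the study: the item labels in `ADL_ITEMS` and the choice to return `None` when an item is unanswered are assumptions made for the example.

```python
# Points per response, following the 3-to-0 coding described in the text.
RESPONSE_POINTS = {
    "no difficulty": 3,
    "difficulty but still able to do it": 2,
    "difficulty and need help": 1,
    "unable to do it": 0,
}

# The six basic activities; these key names are hypothetical labels.
ADL_ITEMS = ("bathing", "dressing", "eating", "toileting",
             "bed_transfer", "continence")


def adl_score(responses):
    """Sum the six item scores; return None if any item is unanswered."""
    total = 0
    for item in ADL_ITEMS:
        answer = responses.get(item)
        if answer not in RESPONSE_POINTS:
            return None  # treated as a missing value, not as zero
        total += RESPONSE_POINTS[answer]
    return total


# Example respondent: needs help with bathing, no difficulty elsewhere.
respondent = {item: "no difficulty" for item in ADL_ITEMS}
respondent["bathing"] = "difficulty and need help"
print(adl_score(respondent))  # 5 items * 3 points + 1 point = 16
```

Returning `None` for incomplete answers keeps missing data distinct from a genuine score of 0, which matters because respondents with missing key variables are excluded from the sample.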
Independent variable: environmental pollution (POL)
Some studies measure the degree of environmental pollution comprehensively across multiple dimensions such as air, solid waste, and water [43], while others use a single pollution indicator as a proxy variable [44]. In China, industrial emissions such as industrial waste gas and waste water are the main sources of environmental pollution [4]. To evaluate the degree of environmental pollution more comprehensively and to avoid the bias introduced by subjective weighting, we take the data on industrial sulfur dioxide emissions, industrial smoke and dust emissions, and industrial wastewater emissions from the China City Statistical Yearbook, use the entropy weight method to assign a weight to each of the three indicators, and then calculate an environmental pollution index to characterize the level of environmental pollution in each city. The weights of the three indicators are calculated in the following steps:
Firstly, after determining that the three environmental pollution indicators are all negative indicators, we use the following formula to standardize them:
$$x'_{ij} = \frac{\max\left( x_j \right) - x_{ij}}{\max\left( x_j \right) - \min\left( x_j \right)} + 0.0001,\quad 1 \leqslant i \leqslant m,\ 1 \leqslant j \leqslant n$$
\(x'_{ij}\) represents the standardized value of index \(j\) in city \(i\), and \(x_{ij}\) is its original value.
Secondly, after calculating the proportion \(p_{ij}\) of index \(j\) in city \(i\), we calculate the entropy of index \(j\), \(e_j\).
$$p_{ij} = \frac{x'_{ij}}{\sum\nolimits_{i=1}^m x'_{ij}},\quad 0 \leqslant p_{ij} \leqslant 1$$
$$e_j = - \frac{1}{\ln m}\sum\nolimits_{i=1}^m p_{ij}\ln p_{ij},\quad 0 \leqslant e_j \leqslant 1$$
Thirdly, we calculate the information entropy redundancy \(d_j\).
$$d_j = 1 - e_j$$
Fourthly, we determine the weights of the indicators \(w_j\).
$$w_j = \frac{d_j}{\sum\nolimits_{j=1}^n d_j}$$
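The four steps of the entropy weight method can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the redundancy is taken as one minus the entropy and the weights as the normalized redundancies, which is the standard form of the method, and the input values in `sample` are invented for the example. The sketch stops at the weights because the text does not give the formula that combines them into the final pollution index.

```python
import math


def entropy_weights(data, shift=0.0001):
    """Entropy weights for negative indicators.

    data[i][j] is the raw value of indicator j in city i.
    Assumes every indicator varies across cities (max > min).
    """
    m, n = len(data), len(data[0])
    columns = [[row[j] for row in data] for j in range(n)]

    # Step 1: min-max standardization for negative indicators,
    # plus a small shift so that no standardized value is zero.
    standardized = []
    for col in columns:
        lo, hi = min(col), max(col)
        standardized.append([(hi - v) / (hi - lo) + shift for v in col])

    # Step 2: proportion of each city within an indicator, then entropy.
    entropies = []
    for col in standardized:
        total = sum(col)
        shares = [v / total for v in col]
        entropies.append(-sum(p * math.log(p) for p in shares) / math.log(m))

    # Step 3: redundancy (assumed: 1 - entropy).
    redundancy = [1.0 - e for e in entropies]

    # Step 4: weights (assumed: redundancy normalized to sum to one).
    total_redundancy = sum(redundancy)
    return [d / total_redundancy for d in redundancy]


# Hypothetical emissions for four cities: SO2, smoke and dust, wastewater.
sample = [
    [120.0, 30.0, 800.0],
    [80.0, 45.0, 650.0],
    [200.0, 10.0, 900.0],
    [50.0, 25.0, 300.0],
]
print(entropy_weights(sample))
```

Indicators whose standardized values are spread more unevenly across cities have lower entropy and therefore receive larger weights.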
Threshold variable: trade openness (OPEN)
Previous studies have used the ratio of total imports and exports to GDP, that is, foreign trade dependence [45], and foreign direct investment (FDI) [46] as proxy variables for trade openness. Since FDI only reflects the inflow of foreign capital, we use the more comprehensive index of foreign trade dependence to measure the degree of trade openness. The larger the value, the greater the degree of trade openness.
Control variables
Medical level (lnMED): The medical level of a city affects residents' access to medical resources and is closely related to their health [47]. Accordingly, we use the number of medical and health institutions to indicate the medical level of the city, taking its logarithm to reduce the effects of heteroscedasticity and non-stationarity.
Environmental regulation (ER): The frequency of environment-related words in government work reports reflects the importance that the local government attaches to environmental issues and the intensity of environmental regulation. Scholars generally believe that the intensity of environmental governance directly affects the degree of environmental pollution [48]. Therefore, following Chen et al. [49], this study counts the occurrences of environment-related words such as “environmental protection”, “pollution”, “energy consumption”, “emissions reduction”, “low carbon”, “ecology”, “green” and “PM2.5” in the government work reports of each city in 2013, 2015, 2018 and 2020, and uses the share of these occurrences in the total number of words to characterize environmental regulation.
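The word-frequency measure can be illustrated with a short Python sketch. It is a simplified example and not the procedure of the cited study: it works on English text split on whitespace, whereas the actual reports are in Chinese and would need a word segmenter, and it counts substrings, so “green” would also match “greenhouse”. The function name `regulation_intensity` and the sample sentence are invented for illustration.

```python
# Keywords listed in the text, lowercased for matching.
KEYWORDS = [
    "environmental protection", "pollution", "energy consumption",
    "emissions reduction", "low carbon", "ecology", "green", "pm2.5",
]


def regulation_intensity(report_text, keywords=KEYWORDS):
    """Share of keyword occurrences in the total number of words."""
    text = report_text.lower()
    total_words = len(text.split())  # assumption: whitespace tokenization
    if total_words == 0:
        return 0.0
    hits = sum(text.count(keyword) for keyword in keywords)
    return hits / total_words


sample_report = (
    "The city will cut pollution, promote green industry and "
    "low carbon transport, and monitor PM2.5 levels."
)
print(regulation_intensity(sample_report))  # 4 keyword hits / 16 words = 0.25
```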
Industrial structure (IND): Industrial structure is a typical indicator of environmental conditions and is related to environmental quality [50]. In addition, industrial structure may contribute to the development of healthcare and thereby promote public health. We therefore use the ratio of tertiary industry output to secondary industry output as a proxy variable for industrial structure.
Human capital (HC): Some researchers have verified the correlation between pollution and productivity from the perspective of human capital [51], and health human capital also affects public health. It is therefore reasonable to include human capital as a control variable, and we use the ratio of the number of people with a general college degree or higher to the total population to measure the level of regional human capital.
Descriptive statistics for the variables are shown in Table 1.
Table 1 Descriptive statistics of variables (N = 22736)
Models and methods
Two-way fixed effects regression model
The health production function theory proposed by Grossman regards health as both a consumption good and an investment good, integrating the effects of various factors on health, including the economy, healthcare, health human capital, and education [9, 43]. To explore the impact of environmental pollution on the health of middle-aged and elderly people, we build on the theoretical framework of the health production function, treat environmental pollution as an input factor of health production, and construct a baseline regression model. Since the data used in this study form a balanced panel, we use the Hausman test to choose between the random effects model (RE) and the fixed effects model (FE). According to the Hausman test results, the fixed effects model is more appropriate for this study. Therefore, we construct a two-way fixed effects regression model to eliminate the influence of unobserved factors at the city level and the time level. Previous studies exploring the health effect of a given factor have also used fixed effects regression models [52, 53]. The model is as follows:
$$Health_{ijt} = \beta_0 + \beta_1 POL_{jt} + \theta X_{jt} + \mu_j + \lambda_t + \varepsilon_{ijt}$$
(1)
In Eq. (1), \(Health_{ijt}\) is the dependent variable of this paper, which represents the health status of individual \(i\) in city \(j\) in year \(t\). \(POL_{jt}\) is the independent variable, indicating the degree of environmental pollution in city \(j\) in year \(t\). \(X_{jt}\) is a series of control variables, including medical level (lnMED), environmental regulation (ER), industrial structure (IND), and human capital (HC). \(\mu_j\) is the city fixed effect, \(\lambda_t\) denotes the year fixed effect, and \(\varepsilon_{ijt}\) represents the random disturbance term. \(\beta_0\) is the constant term, and \(\beta_1\) and \(\theta\) are the coefficients to be estimated.
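To show what the city and year fixed effects remove, the following Python sketch estimates a two-way fixed effects slope by double demeaning. It is a deliberately simplified setting and not the estimation used in the study: one observation per city-year cell, a single regressor, no control variables, and invented data generated with a true slope of -2.

```python
def two_way_demean(panel):
    """Subtract city means and year means and add back the grand mean.

    panel[j][t] is the value for city j in year t (balanced panel).
    """
    n_city, n_year = len(panel), len(panel[0])
    city_mean = [sum(row) / n_year for row in panel]
    year_mean = [sum(panel[j][t] for j in range(n_city)) / n_city
                 for t in range(n_year)]
    grand_mean = sum(city_mean) / n_city
    return [[panel[j][t] - city_mean[j] - year_mean[t] + grand_mean
             for t in range(n_year)] for j in range(n_city)]


def twfe_slope(y, x):
    """OLS slope of y on x after removing city and year fixed effects."""
    y_dm, x_dm = two_way_demean(y), two_way_demean(x)
    num = sum(a * b for row_y, row_x in zip(y_dm, x_dm)
              for a, b in zip(row_y, row_x))
    den = sum(b * b for row_x in x_dm for b in row_x)
    return num / den


# Hypothetical pollution index for 3 cities over 4 survey years.
x = [[0.1, 0.4, 0.2, 0.9],
     [0.5, 0.3, 0.8, 0.6],
     [0.7, 0.2, 0.1, 0.4]]
city_effect = [1.0, 2.5, 4.0]
year_effect = [0.0, 0.3, 0.6, 0.9]

# Outcome built with a true slope of -2 plus city and year effects.
y = [[city_effect[j] + year_effect[t] - 2.0 * x[j][t] for t in range(4)]
     for j in range(3)]

print(twfe_slope(y, x))  # close to -2 despite the unobserved effects
```

Double demeaning is equivalent to including city and year dummies only when the panel is balanced, which matches the balanced panel described in the data section.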
Threshold effect model
The threshold effect model is a standard method for describing jumps or structural breaks and for testing whether a threshold effect exists between variables [54]. A threshold effect refers to a change in the direction or magnitude of an independent variable's effect when the threshold variable reaches a certain inflection point [54]. The threshold variable can be the independent variable itself [55] or another variable [56]. This method was proposed by Hansen in 1999 [57]. When there is a nonlinear relationship between variables, the threshold regression model divides the intervals endogenously according to the characteristics of the data [56], avoiding the bias caused by dividing the intervals artificially. Threshold regression has been widely used to study the nonlinear relationship between economic development and environmental pollution [55, 56, 58].
Previous studies have shown that the impact of environmental pollution on the health of middle-aged and elderly people varies at different stages [30, 32]. In order to further investigate the nonlinear relationship between environmental pollution and health, we refer to Hansen’s panel threshold regression model and construct a threshold regression model with environmental pollution itself as the threshold variable:
$$\begin{aligned} Health_{ijt} = {} & \beta_0 + \beta_1 POL_{jt}\, I\left( POL_{jt} \leqslant Th1 \right) \\ & + \beta_2 POL_{jt}\, I\left( POL_{jt} > Th1 \right) \\ & + \theta X_{jt} + \mu_j + \lambda_t + \varepsilon_{ijt} \end{aligned}$$
(2)
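The threshold value in a single-threshold model is usually chosen by a grid search that minimizes the sum of squared residuals (SSR). The Python sketch below illustrates that idea under strong simplifications that are assumptions of the example, not features of the study: a pooled regression through the origin with no fixed effects or controls, and noise-free invented data in which the slope switches from 1 to -2 at a pollution level of 0.5.

```python
def ssr_origin(x, y):
    """SSR of the no-intercept regression y = b * x."""
    sxx = sum(v * v for v in x)
    sxy = sum(a * b for a, b in zip(x, y))
    syy = sum(v * v for v in y)
    return syy - (sxy * sxy / sxx if sxx > 0 else 0.0)


def find_threshold(y, x, q, min_obs=3):
    """Grid search over observed values of the threshold variable q.

    Returns (threshold, SSR) for the split with the smallest total SSR.
    Each regime must keep at least min_obs observations.
    """
    best_threshold, best_ssr = None, float("inf")
    for candidate in sorted(set(q)):
        low = [(a, b) for a, b, c in zip(x, y, q) if c <= candidate]
        high = [(a, b) for a, b, c in zip(x, y, q) if c > candidate]
        if len(low) < min_obs or len(high) < min_obs:
            continue
        # The two regimes do not overlap, so their SSRs simply add up.
        ssr = ssr_origin(*zip(*low)) + ssr_origin(*zip(*high))
        if ssr < best_ssr:
            best_threshold, best_ssr = candidate, ssr
    return best_threshold, best_ssr


# Hypothetical data: pollution is both regressor and threshold variable.
pollution = [i / 10 for i in range(1, 11)]
health = [(1.0 if p <= 0.5 else -2.0) * p for p in pollution]

threshold, ssr = find_threshold(health, pollution, pollution)
print(threshold, ssr)
```

In a panel setting the regime-specific regressors would be within-transformed before the search, so the two regimes could no longer be fitted separately as they are here.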
In addition, previous studies have confirmed that trade openness is associated with environmental pollution [33,34,35], and trade openness is also an influencing factor on health [38, 39]. However, research on the relationship between trade openness and health has not considered how trade openness shapes the impact of environmental pollution on health. To further explore the mechanism through which environmental pollution affects the health of middle-aged and elderly people, we build on Grossman’s theory of the health production function and introduce trade openness as a threshold variable into the health production function model with environmental pollution as the key input factor. We then estimate how the effect of environmental pollution on the health of middle-aged and elderly people changes with the degree of trade openness. The model is as follows:
$$\begin{aligned} Health_{ijt} = {} & \beta_0 + \beta_1 POL_{jt}\, I\left( OPEN_{jt} \leqslant Th1 \right) \\ & + \beta_2 POL_{jt}\, I\left( OPEN_{jt} > Th1 \right) \\ & + \theta X_{jt} + \mu_j + \lambda_t + \varepsilon_{ijt} \end{aligned}$$
(3)
Before estimating the model, it is necessary to determine the number of thresholds and the threshold values. Following the panel threshold approach and existing studies [54, 55], we use the Bootstrap sampling method [59] to test for the existence of thresholds in the panel threshold models with environmental pollution and trade openness as the threshold variables. That is, the following three sets of hypotheses are tested for each of the two threshold variables:
(i) \(H_0^{I}\): there is no threshold; \(H_1^{I}\): there is one threshold;
(ii) \(H_0^{II}\): there is only one threshold; \(H_1^{II}\): two thresholds exist;
(iii) \(H_0^{III}\): there are only two thresholds; \(H_1^{III}\): three thresholds exist.
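The first pair of hypotheses (no threshold against one threshold) can be illustrated with a bootstrap of an F-type statistic. The Python sketch below is a simplified illustration and not the test procedure of the cited studies. Its assumptions are: a pooled no-intercept regression without fixed effects or controls, residuals resampled independently rather than in blocks by individual, the error variance estimated as the SSR divided by the number of observations, and invented data.

```python
import random


def ssr_origin(x, y):
    """SSR of the no-intercept regression y = b * x."""
    sxx = sum(v * v for v in x)
    sxy = sum(a * b for a, b in zip(x, y))
    return sum(v * v for v in y) - (sxy * sxy / sxx if sxx > 0 else 0.0)


def fit_one_threshold(y, x, q, min_obs=3):
    """Return (threshold, SSR) minimizing the SSR over observed q values."""
    best_th, best_ssr = None, float("inf")
    for cand in sorted(set(q)):
        low = [(a, b) for a, b, c in zip(x, y, q) if c <= cand]
        high = [(a, b) for a, b, c in zip(x, y, q) if c > cand]
        if len(low) < min_obs or len(high) < min_obs:
            continue
        ssr = ssr_origin(*zip(*low)) + ssr_origin(*zip(*high))
        if ssr < best_ssr:
            best_th, best_ssr = cand, ssr
    return best_th, best_ssr


def f_statistic(y, x, q):
    """(S0 - S1) / (S1 / n): no threshold against one threshold."""
    s0 = ssr_origin(x, y)
    _, s1 = fit_one_threshold(y, x, q)
    return (s0 - s1) / (s1 / len(y))


def h1_residuals(y, x, q):
    """Residuals from the fitted single-threshold model."""
    th, _ = fit_one_threshold(y, x, q)
    slopes = {}
    for is_low in (True, False):
        pairs = [(a, b) for a, b, c in zip(x, y, q) if (c <= th) == is_low]
        slopes[is_low] = (sum(a * b for a, b in pairs)
                          / sum(a * a for a, _ in pairs))
    return [b - slopes[c <= th] * a for a, b, c in zip(x, y, q)]


def bootstrap_p_value(y, x, q, n_boot=200, seed=0):
    """Share of bootstrap F statistics at least as large as the observed one."""
    rng = random.Random(seed)
    observed = f_statistic(y, x, q)
    resid = h1_residuals(y, x, q)
    exceed = 0
    for _ in range(n_boot):
        # Under the null the statistic does not depend on the slope,
        # so the resampled residuals serve as the bootstrap outcome.
        y_star = rng.choices(resid, k=len(resid))
        if f_statistic(y_star, x, q) >= observed:
            exceed += 1
    return observed, exceed / n_boot


# Hypothetical data with a slope change at 0.5 and small noise.
noise = random.Random(1)
x_obs = [i / 40 for i in range(1, 41)]
y_obs = [(1.0 if v <= 0.5 else -2.0) * v + noise.gauss(0, 0.05) for v in x_obs]

f_obs, p_value = bootstrap_p_value(y_obs, x_obs, x_obs)
print(f_obs, p_value)
```

A small p-value rejects the null of no threshold. The tests for two and three thresholds follow the same logic, with the model under the null containing one more threshold each time.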
Equations (2) and (3) show the case of a single threshold effect. If the test supports a double threshold effect, Eqs. (2) and (3) are further extended to Eqs. (4) and (5), respectively. The models are as follows:
$$\begin{aligned} Health_{ijt} = {} & \beta_0 + \beta_1 POL_{jt}\, I\left( POL_{jt} \leqslant Th1 \right) \\ & + \beta_2 POL_{jt}\, I\left( Th1 < POL_{jt} \leqslant Th2 \right) \\ & + \beta_3 POL_{jt}\, I\left( POL_{jt} > Th2 \right) \\ & + \theta X_{jt} + \mu_j + \lambda_t + \varepsilon_{ijt} \end{aligned}$$
(4)
$$\begin{aligned} Health_{ijt} = {} & \beta_0 + \beta_1 POL_{jt}\, I\left( OPEN_{jt} \leqslant Th1 \right) \\ & + \beta_2 POL_{jt}\, I\left( Th1 < OPEN_{jt} \leqslant Th2 \right) \\ & + \beta_3 POL_{jt}\, I\left( OPEN_{jt} > Th2 \right) \\ & + \theta X_{jt} + \mu_j + \lambda_t + \varepsilon_{ijt} \end{aligned}$$
(5)
In Eqs. (2)-(5), \(I(\cdot)\) is the indicator function, \(POL_{jt}\) and \(OPEN_{jt}\) are the threshold variables, and \(Th1\) and \(Th2\) denote the threshold values to be estimated.