Due to demographic change and technological progress in medicine, the demand for skilled nurses has increased in industrialized countries over the past decades (German Employment Agency, 2020a). This trend will continue in the coming years and will further aggravate the lack of nurses. In Germany, new job offers were vacant for 118 days on average before they could be staffed in 2019. In nursing, however, more than 175 days were needed to fill a position, indicating a lack of skilled workers (German Employment Agency, 2020b). Although capacity limits were never reached in Germany, the lack of nurses has been more tangible than ever in the recent Covid-19 pandemic (Begerow et al., 2020). As a consequence, the quality of care decreased (Kaltwasser et al., 2021; Krebs et al., 2021). To counteract this development, it is important to analyze and understand the occupational behavior of nurses. The existing literature discusses a series of factors that might alleviate the lack of skilled workers. These include individual preferences of (future) nurses, improving working conditions and increasing wages, as well as the trade-off between monetary and non-monetary job characteristics (e.g., Eberth et al., 2016; Scott et al., 2015). I contribute to this discussion by analyzing the effect of young students' beliefs about a nurse's wage on the probability of becoming one. Wages are the most controversially discussed factor in the literature. Some authors identify them as a very important factor influencing the labor supply decisions of nurses (Doiron et al., 2014; Hanel et al., 2014). Others, however, suggest that the labor supply of nurses is relatively inelastic with respect to wages and that factors such as personal attitude and working conditions play a much larger role (McCabe et al., 2005; Shields, 2004).
Since there are large differences in earnings depending on the occupational choices, the economic literature on the effect of the expected wage is rich (Altonji et al., 2016). The majority of studies agree that the wage has a significant and positive effect on the career choice (e.g., Boudarbat, 2008; Montmarquette et al., 2002). Nonetheless, most studies find that preferences and interests play a larger role in career choice than the wage expectations (Arcidiacono, 2004; Beffy et al., 2012).
In line with the economic literature, the nursing literature suggests that preferences and interests are the most important factors influencing the decision to become a nurse. In particular, caring for people is identified as the key reason for choosing the profession (e.g., Matthes, 2019; Petrucci et al., 2016; Wilkes et al., 2015). Concerning the wage, several studies find that it only plays a minor role in the decision-making process (e.g., Bomball et al., 2010; Cho et al., 2010; McCabe et al., 2005). Based on these results, policy-makers might be tempted to focus on non-monetary factors to attract more young people into nursing. However, this contrasts with recent work by Hanel et al. (2014) and Schweri and Hartog (2017). Schweri and Hartog (2017) examine the effect of ex-ante wage expectations on the decision to pursue a nursing degree (tertiary education) by using data on healthcare trainees (upper-secondary education) in Switzerland. They thus analyze the decision on the intensive margin. Their results show that the higher the ex-ante wage expectations for a nursing degree, the higher the probability of pursuing such a degree later on. This indicates that higher wages may attract more students to become high-skilled nurses. Hanel et al. (2014) estimate a model of labor supply decisions using data on individuals who hold a nursing qualification. The model accounts for the intensive and the extensive margin by allowing individuals to enter and to exit occupations. As a result, they find a considerably high wage elasticity. This differs fundamentally from other work that detects very small elasticities (Andreassen et al., 2017; Shields, 2004). These differences can be fully explained by the frequent neglect of the extensive margin and the exclusive analysis of the intensive margin. Although Hanel et al. (2014) do not account for the choice of becoming a nurse, their results suggest that wages may heavily drive the career choice, that is, a decision on the extensive margin.
The focus of this paper is on beliefs about wages and how they influence the decision to become a nurse. Such beliefs may affect educational choices and could be changed by policy-makers more easily than other factors such as preferences. For example, Jensen (2010) analyzes perceived returns to secondary schooling of students in the Dominican Republic. He finds that students underestimate these returns and that students who were provided with information on the actual returns completed more years of education.
I use extensive data on former German ninth graders, 14- to 15-year-olds who were about to obtain a lower secondary degree and who have been followed since. The data contains information on the wage that young students think a nurse, a hairdresser, a motor vehicle mechanic, a bank clerk, a teacher and a physician earn, elicited before the occupational choice takes place. This information enables me to estimate the effect of the beliefs about a nurse's wage on the probability of becoming one. Moreover, I estimate the effect of other factors (e.g., social orientation) on the probability of choosing the profession of a nurse. This allows me to assess the magnitude of the impact of the beliefs about a nurse's wage and to fit my results into the recent literature.
In addition, the data contains extensive background information on the individuals measured in ninth grade, that is, before their occupational decision took place. This covers not only educational and parental background but also measures of personality, competencies, interests and attitudes. Overall, the data allows me to observe over 150 characteristics. By applying the lasso proposed by Tibshirani (1996), a method that shrinks coefficients toward zero or sets them exactly to zero, I am able to select the relevant controls and to model non-linearities in confounding. However, the lasso is tailored to choose variables such that an outcome is precisely predicted. Therefore, it cannot be applied directly for variable selection when the aim is to estimate a partial effect. As a solution, Belloni et al. (2012, 2014b) propose post-double-selection (PDS), a two-step procedure to identify relevant controls and their functional form. To interpret the estimated effect as causal, I need to assume that no factors affecting both the dependent variable and the variable of interest remain unobserved (unconfoundedness). Despite the ability to condition on a rich set of controls and to model their functional form flexibly, this assumption is very strict and likely to be violated in some way. To mitigate the concerns about omitted variable bias and to get an idea of its consequences, I follow a novel approach by Cinelli and Hazlett (2020). For linear models, they propose to assess the minimal strength that unobserved confounding needs to have on the wage beliefs and on the career choice in order to change the conclusion. To this end, Cinelli and Hazlett (2020) propose a procedure for benchmarking based on observed covariates. Knowledge about the main predictors of career choice or of the wage beliefs is the crucial premise for the benchmarking to be valuable. Fortunately, the literature on determinants of wage expectations and on factors driving young people into nursing is rich. Thus, credible benchmarking on observed covariates is possible.
My results show that higher beliefs about a nurse's wage increase the probability of becoming a nurse. In line with recent literature, individual preferences play a larger role than the beliefs. Since the career choice is a decision on the extensive margin, my results are also consistent with those of Hanel et al. (2014). Further, the results show that the effects are driven by young people who do not become a nurse and underestimate the wage. This is consistent with recent literature. Although information is publicly available, educational choices are made under uncertainty and with incomplete or incorrect information (Finger et al., 2020; Hastings et al., 2016; Oreopoulos & Dunn, 2013). With regard to nursing, Dante et al. (2013) find that especially students who do not become a nurse know hardly anything about the profession (e.g., they understate initial wages). The results of this paper indicate that the public perception of wages in nursing is too low. Therefore, nursing is less attractive than other occupations for which wages are not systematically understated. To combat the lack of skilled nurses, policy-makers can make the profession more attractive by raising the beliefs about a nurse's (relative) wage.
The remainder of the paper is structured as follows. Section 2 outlines the methods applied in the empirical analysis and briefly describes the data, the wage belief measures and the control variables. In Section 3, I present and discuss the main results of my analysis. Section 4 concludes.
2 METHODS

2.1 Empirical strategy

2.1.1 Post-double-selection

The partial effect of the wage belief on the probability to become a nurse is estimated by a partially linear model

$$Y_i = \tau D_i + g(X_i) + \varepsilon_i, \tag{1}$$

where $Y_i$ denotes the binary choice to become a nurse, $D_i$ the wage belief and $X_i$ the vector of controls. The function $g(\cdot)$ is unknown and potentially complicated. I approximate it by a linear combination that may include higher order polynomials and interactions

$$g(X_i) = X_i'\beta + r_i, \tag{2}$$

where $r_i$ is an approximation error. The aim is to estimate $\tau$. However, it is a difficult task to define a set of variables to be included in the model and to model their functional form (i.e., what polynomials and interactions to include). Therefore, I rely on data-driven variable selection and follow the PDS approach proposed by Belloni et al. (2012, 2014b). The lasso is a shrinkage method that imposes a penalty on the size of the coefficients, that is, shrinks them toward zero or exactly to zero. This prevents models with many variables that are correlated with each other from overfitting (Hastie et al., 2009). The lasso is defined as

$$\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n} \left(Y_i - X_i'\beta\right)^2 + \lambda \lVert \beta \rVert_1, \tag{3}$$

where $\lVert \beta \rVert_1 = \sum_j |\beta_j|$ imposes the penalty on the size of the coefficients and the parameter $\lambda$ controls the magnitude of the punishment. A naive approach to estimate $\tau$ would be to apply the lasso estimator to Equation (1) and to exclude $\tau$ from the penalty term such that $D_i$ is forced to stay in the model. Afterward one might use a least-squares regression of the outcome on $D_i$ and the controls with non-zero coefficients. However, this approach leads to biased estimates because of omitted variables. The lasso is designed to learn a forecasting rule for $Y_i$ given $D_i$ and $X_i$ and not to learn about the relationship between $Y_i$ and $D_i$ given controls (Belloni et al., 2014a). Therefore, the lasso cannot be used off the shelf for the estimation of partial effects. As a solution, Belloni et al. (2012, 2014b) propose an intuitive and easy-to-implement procedure.
First, the lasso is used to estimate a model predicting the outcome $Y_i$ given in Equation (4) and a further model predicting the wage belief $D_i$ given in Equation (5)

$$Y_i = X_i'\pi_Y + u_i, \tag{4}$$

$$D_i = X_i'\pi_D + v_i. \tag{5}$$

Subsequently, all variables with non-zero coefficients in either of the two models are kept as control variables in order to estimate the effect $\tau$ of the wage belief in Equation (1) by an ordinary least-squares regression. This step is known as the “post-lasso.” The crucial assumption under which PDS works is approximate sparsity. It states that the wage belief and the career choice can be approximated by Equations (4) and (5) using only a small number of covariates relative to the sample size. Note that approximate sparsity is also implicitly assumed in conventional ordinary least squares (OLS) analysis where no double selection by lasso takes place. Additional variables that are considered important for ensuring robustness can be included (amelioration set). The condition is that the amelioration set is not substantially larger than the number of variables chosen via the lasso (Belloni et al., 2014b).
The choice of the penalty level $\lambda$ is of importance. With the aim of prediction, standard lasso applications choose $\lambda$ by cross-validation. However, this analysis aims to estimate a partial effect. If $\lambda$ is too large, only a few variables are selected and omitted variable bias may occur. If $\lambda$ is too small, the number of selected variables is very large such that overfitting may become an issue. Therefore, I follow Urminsky et al. (2016) and use their choice of $\lambda$, which depends on $n$, the number of observations, $p$, the number of potential controls, $\Phi^{-1}$, the inverse cumulative distribution function of the standard normal distribution, and $\sigma$, the standard deviation of the residuals of the model. Finally, it is important to note that the chosen variables are not interpretable since the selection depends on the sample (Mullainathan & Spiess, 2017). Hence, when presenting the results, the chosen control variables are not reported.
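The two-step logic of PDS can be illustrated with a minimal sketch. It is not the implementation used in this paper: the coordinate-descent lasso, the function names `lasso_cd` and `post_double_selection`, the simulated data and the fixed penalty levels are all assumptions made for illustration, and the penalty is not the plug-in rule of Urminsky et al. (2016). For simplicity the simulated outcome is continuous rather than binary.

```python
import numpy as np


def lasso_cd(X, y, lam, n_iter=200):
    """Lasso by cyclic coordinate descent on centred data.

    Minimises (1 / (2n)) * ||y - X b||^2 + lam * ||b||_1, which is the
    usual lasso objective up to a rescaling of the penalty level.
    """
    X = X - X.mean(axis=0)
    y = y - y.mean()
    n, p = X.shape
    b = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0) / n
    resid = y.copy()
    for _ in range(n_iter):
        for j in range(p):
            if col_ss[j] == 0.0:
                continue
            # Correlation of column j with the partial residual.
            rho = X[:, j] @ resid / n + col_ss[j] * b[j]
            # Soft-thresholding sets small coefficients exactly to zero.
            new_bj = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_ss[j]
            resid += X[:, j] * (b[j] - new_bj)
            b[j] = new_bj
    return b


def post_double_selection(y, d, X, lam_y, lam_d, keep=()):
    """Select controls that predict y or d, then run OLS of y on d and them."""
    sel_y = np.flatnonzero(lasso_cd(X, y, lam_y))  # outcome equation
    sel_d = np.flatnonzero(lasso_cd(X, d, lam_d))  # treatment equation
    # Union of both selections plus an optional amelioration set.
    selected = sorted({int(j) for j in sel_y} | {int(j) for j in sel_d} | set(keep))
    Z = np.column_stack([np.ones(len(y)), d, X[:, selected]])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coef[1], selected


# Hypothetical data: control 0 confounds the effect of d on y.
rng = np.random.default_rng(0)
n, p = 500, 30
X = rng.normal(size=(n, p))
d = 0.8 * X[:, 0] + rng.normal(size=n)
y = 0.5 * d + 1.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)

tau_hat, selected = post_double_selection(y, d, X, lam_y=0.15, lam_d=0.15)
print(f"estimated effect: {tau_hat:.3f}, selected controls: {selected}")
```

Taking the union of both selections is the point of the procedure: a control that strongly predicts the wage belief but only weakly predicts the outcome would be dropped by a single outcome lasso, yet omitting it would bias the estimated effect.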
2.1.2 Sensitivity

In order to interpret the partial effect as causal, I need to rely on the assumption of unconfoundedness. It states that all factors that affect the choice and the wage belief at the same time must be contained in the controls $X$. Even though I have access to an extensive set of potential controls $X$, bias due to unobserved confounders may be likely. For example, covariates measuring the interests of the individuals might not fully capture all relevant aspects but only a share of them. Further, it cannot be ruled out that some factors remain fully unobserved. Moreover, the assumption of approximate sparsity may be violated. There may exist covariates that are not selected by the lasso but affect both the wage belief and the decision to become a nurse. To analyze the sensitivity of the results to potentially unobserved (non-)linear confounding factors $Z$, I make use of a procedure proposed by Cinelli and Hazlett (2020). In a nutshell, they propose to assess the sensitivity of the estimates by analyzing whether a confounder is strong enough to change the conclusion if it is as strong as a very good predictor of the outcome $Y$ or the wage belief $D$.
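Conventionally, the omitted variable bias can be written as $\widehat{\text{bias}} = \hat{\gamma}\hat{\delta}$. Here, $\hat{\gamma}$ describes the difference in the linear expectation of the outcome if the confounder $Z$ changes by one unit, holding everything else constant, and $\hat{\delta}$ describes the difference in the linear expectation of the confounder if the variable of interest changes by one unit, holding everything else constant (Cinelli & Hazlett, 2020). Arguing that both quantities $\hat{\gamma}$ and $\hat{\delta}$ are hard to grasp, Cinelli and Hazlett (2020) write the conventional omitted variable bias formula in terms of partial $R^2$ measures. Those are easier to interpret and can be exploited for further analysis. Denote $\hat{\tau}_{res}$ as the observed estimated effect and $\hat{\tau}$ as the estimated effect from a model controlling for unobserved confounding factors, that is, $\widehat{\text{bias}} = \hat{\tau}_{res} - \hat{\tau}$. Then, they show that

$$|\widehat{\text{bias}}| = se(\hat{\tau}_{res}) \sqrt{\frac{R^2_{Y \sim Z \mid D, X}\, R^2_{D \sim Z \mid X}}{1 - R^2_{D \sim Z \mid X}}\, df}, \tag{6}$$

where $df$ defines the degrees of freedom, $R^2_{Y \sim Z \mid D, X}$ stands for the partial $R^2$ of regressing $Y$ on $Z$ after controlling for $X$ and $D$, and $R^2_{D \sim Z \mid X}$ denotes the partial $R^2$ of regressing $D$ on $Z$ after controlling for $X$. Further, the standard error of $\hat{\tau}$ can be written as

$$se(\hat{\tau}) = se(\hat{\tau}_{res}) \sqrt{\frac{1 - R^2_{Y \sim Z \mid D, X}}{1 - R^2_{D \sim Z \mid X}} \cdot \frac{df}{df - 1}} \tag{7}$$

and the adjusted t-statistic is defined as $t_{adj} = \hat{\tau} / se(\hat{\tau})$. Applying these definitions, $\hat{\tau}$ and $se(\hat{\tau})$ can be computed by substituting reasonable values for $R^2_{Y \sim Z \mid D, X}$ and $R^2_{D \sim Z \mid X}$, that is, the strength of confounding, into Equations (6) and (7). However, actual knowledge about the absolute strength is seldom available. As a solution, Cinelli and Hazlett (2020) argue that the researcher is often able to make a statement on the relative strength of potential unobserved confounding, for example, that $Z$ cannot account for as much variation of the outcome as some observed covariate $X_j$. There are several ways to formalize such claims. I follow Cinelli and Hazlett (2020) and claim that I measure the key determinant of $Y$ and $D$ such that the omitted variable cannot explain as much residual variance in $Y$ or $D$ as this determinant. Define

$$k_D := \frac{R^2_{D \sim Z \mid X_{-j}}}{R^2_{D \sim X_j \mid X_{-j}}}, \tag{8}$$

$$k_Y := \frac{R^2_{Y \sim Z \mid X_{-j}, D}}{R^2_{Y \sim X_j \mid X_{-j}, D}}, \tag{9}$$

where $X_{-j}$ is a vector including all variables contained in $X$, excluding $X_j$. The ratios $k_D$ and $k_Y$ show how much of the variance in $D$ or $Y$ is explained by $Z$ relative to the explanatory power of $X_j$, conditional on all other covariates.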
In this paper $k_D = k_Y = 1$, that is, I consider the impact of a confounder that is as strong as the benchmark covariate $X_j$. Given $k_D$ and $k_Y$, Cinelli and Hazlett (2020) show that

$$R^2_{D \sim Z \mid X} = k_D\, f^2_{D \sim X_j \mid X_{-j}}, \qquad R^2_{Y \sim Z \mid D, X} \le \eta^2 f^2_{Y \sim X_j \mid X_{-j}, D}, \tag{10}$$

where $\eta$ is a scalar that depends on $k_Y$, $k_D$, and $R^2_{D \sim X_j \mid X_{-j}}$. Furthermore, $f^2_{D \sim X_j \mid X_{-j}}$ denotes partial Cohen's $f^2$ of $D$ on $X_j$ and $f^2_{Y \sim X_j \mid X_{-j}, D}$ denotes partial Cohen's $f^2$ of $Y$ on $X_j$. Cinelli and Hazlett (2020) have shown that these robustness results are exact for a single linear confounder and conservative for multiple, possibly nonlinear, confounding factors.

It is important to emphasize that this bounding procedure heavily relies on the choice of the benchmark variable $X_j$. If it is not true that $X_j$ is a key predictor of the outcome or treatment, the bounding is pointless. Hence, domain knowledge is necessary (Cinelli & Hazlett, 2020). In the following, I choose observed covariates that are often discussed in the literature. First, bounding is based on social orientation, that is, preferring activities to inform, train, educate, cure, or advise other people (Wohlkinger et al., 2011). It is the key characteristic of those who become a nurse (e.g., Matthes, 2019), while preferences are generally a decisive factor in career choice (e.g., Arcidiacono, 2004). In addition, interests also play an important role in the formation of expected wages (Wiswall & Zafar, 2015). Second, the professions of the parents play an important role in the occupational choice (e.g., Knoll et al., 2017). Therefore, the results are bounded by an indicator for whether at least one of the parents is a nurse. Moreover, parents in nursing might inform their children about the wages in nursing. Third, an indicator for gender is considered. Females become nurses much more often than males (Speer, 2019). Moreover, gender also plays a crucial role in wage expectations: females expect lower wages than males (e.g., Brunello et al., 2004; Fernandes et al., 2020). Fourth, (perceived) ability determines the expected wages (Brunello et al., 2004).
Therefore, a measure of ability, namely metacognition, is used to bound the results. Note that these variables have to be part of the model in order to use them as benchmark variables. Hence, the amelioration set contains these four variables to ensure that they are not excluded by the data-driven variable selection.
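As a rough illustration of how the sensitivity formulas of Cinelli and Hazlett (2020) are applied, the following sketch computes the bias bound, the adjusted estimate, its standard error and the adjusted t-statistic for a hypothesized strength of confounding. The function name `adjust_for_confounder` and all numerical inputs are hypothetical and are not results of this paper. The sketch assumes the adverse case in which the confounder pulls the estimate toward zero, and it takes the two partial $R^2$ values as given rather than deriving them from a benchmark covariate.

```python
import math


def adjust_for_confounder(tau_res, se_res, df, r2_yz, r2_dz):
    """Adjusted estimate, standard error and t-value under confounding.

    tau_res, se_res: observed estimate and its standard error
    df:              residual degrees of freedom of the observed model
    r2_yz:           partial R^2 of the confounder with the outcome
    r2_dz:           partial R^2 of the confounder with the treatment
    """
    # Absolute bias implied by the hypothesized confounder strength.
    bias = se_res * math.sqrt(r2_yz * r2_dz / (1.0 - r2_dz) * df)
    # Standard error of the estimate once the confounder is controlled for.
    se_adj = se_res * math.sqrt(
        (1.0 - r2_yz) / (1.0 - r2_dz) * df / (df - 1.0)
    )
    # Assumption: the confounder moves the estimate toward zero.
    tau_adj = tau_res - math.copysign(bias, tau_res)
    return bias, tau_adj, se_adj, tau_adj / se_adj


# Hypothetical inputs, chosen only to show the mechanics.
bias, tau_adj, se_adj, t_adj = adjust_for_confounder(
    tau_res=0.05, se_res=0.01, df=100, r2_yz=0.04, r2_dz=0.04
)
print(f"bias={bias:.5f}, tau_adj={tau_adj:.5f}, se_adj={se_adj:.5f}, t_adj={t_adj:.2f}")
```

In a benchmarking exercise, the two partial $R^2$ inputs would be replaced by the bounds implied by an observed covariate such as social orientation or gender.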
2.2 Data

This study uses Starting Cohort Four (SC4) of the German National Educational Panel Study (NEPS). The survey collects data on 14- to 15-year-olds who attended the ninth grade in German regular schools in 2010 and have been followed since (Blossfeld & von Maurice, 2011). This includes grammar schools, middle secondary schools, lower secondary schools, comprehensive schools, and schools offering all tracks of secondary education except the grammar school track. Since becoming a nurse requires vocational training, any degree that can be obtained at these schools is sufficient for admission. For several reasons the data is highly suitable for investigating the role of beliefs about a nurse's wage in the decision to become one. Since the data is available from 2010 to 2016, the transition from school to further education can be observed in great detail and no retrospective information has to be used. The following analysis is based on a cross section of the panel and focuses on the choice of the first occupational training, which certainly has a relevant impact on the further life course. The German education system is highly complex. Although it is a sequential system, there are many different paths through it that lead to the same degree, that is, educational paths can be down- or upgraded (Biewen & Tapalaga, 2017). Upon completion of ninth grade, individuals obtain the lowest educational degree. Afterward, individuals may start vocational training or, alternatively, stay in the educational system, which in turn may affect the individual characteristics and the occupational choice later on.
In order not to condition on future outcomes, the cross section contains information on individuals that is measured in ninth grade. Beyond that, the individuals are asked to state their beliefs about the monthly salary of a nurse, a hairdresser, a motor vehicle mechanic, a bank clerk, a teacher and a physician: “Now we are also interested to know how high you think the income is in certain professions. What do you think the monthly income as a […] is?” Consequently, the question at hand captures knowledge about average wages, knowledge of wages according to collective agreements, but also wrong beliefs due to a lack of information or wrong perceptions of wages. Since a central principle of human capital theory suggests that schooling decisions are made by comparing the benefits with the (opportunity) costs of the decision, the analysis focuses on the relative wage beliefs, that is, the stated nurse's wage relative to the stated wages of other occupations (Boudarbat, 2008; Carneiro et al., 2011). To this end, the stated wages of all six occupations are ranked from lowest to highest. If the rank cannot be assigned unambiguously due to ties, the mean rank is assigned such that the sum of ranks is preserved. Formally, I define the individual's rank of a nurse's wage as

$$\text{rank}_i^{nurse} = \sum_{w \in W_i} \mathbb{1}\left(w < w_i^{nurse}\right) + \frac{1}{2}\left(\sum_{w \in W_i} \mathbb{1}\left(w = w_i^{nurse}\right) + 1\right), \tag{11}$$

where $\mathbb{1}(\cdot)$ denotes an indicator function that takes the value 1 if the expression in the parentheses is true, $W_i$ is the set of surveyed wage beliefs and $w_i^{nurse}$ is the belief about a nurse's wage. Two further measures are defined as the ratio between the individual's belief about a nurse's wage and the maximum as well as the minimum stated wage

$$\text{ratio}_i^{max} = \frac{w_i^{nurse}}{\max(W_i)}, \tag{12}$$

$$\text{ratio}_i^{min} = \frac{w_i^{nurse}}{\min(W_i)}. \tag{13}$$

Further, I also analyze the belief about a nurse's absolute wage.
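The rank and ratio measures described above can be sketched in a few lines. The stated wages below are invented for illustration and are not taken from the NEPS data; the helper name `mid_rank` is likewise an assumption. The example includes a tie to show that assigning the mean rank preserves the sum of ranks.

```python
def mid_rank(beliefs, occupation):
    """Rank of one occupation's stated wage, using the mean rank for ties."""
    w = beliefs[occupation]
    below = sum(v < w for v in beliefs.values())
    tied = sum(v == w for v in beliefs.values())  # counts the occupation itself
    return below + (tied + 1) / 2


# Hypothetical monthly wage beliefs of one student (in euros).
beliefs = {
    "hairdresser": 1200,
    "motor vehicle mechanic": 2000,
    "nurse": 2000,
    "bank clerk": 2500,
    "teacher": 3000,
    "physician": 5000,
}

rank_nurse = mid_rank(beliefs, "nurse")
ratio_max = beliefs["nurse"] / max(beliefs.values())
ratio_min = beliefs["nurse"] / min(beliefs.values())
rank_sum = sum(mid_rank(beliefs, o) for o in beliefs)

print(rank_nurse, ratio_max, ratio_min, rank_sum)
```

Here the nurse and the mechanic share ranks 2 and 3, so each receives 2.5, and the ranks of all six occupations still sum to 21.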
Based on the ranking measure in Equation (11), I can easily assess how close the relative wage beliefs are to reality by computing the deviation from the true ranks. The median wages reported by German Employment Agency (2018) provide the basis for the true rank. According to this source of information, the following true ranking from the lowest to the highest wage was established: (1) hairdresser, (2) motor vehicle mechanic, (3) nurse, (4) bank clerk, (5) teacher and (6) physician. The ranking is utilized to construct a measure that captures the knowledge about relative wages by adding the absolute deviations of the stated rank of each occupation

$$\text{knowledge}_i = \sum_{o \in O} \left|\text{rank}_i^{o} - \text{rank}_{true}^{o}\right|, \tag{14}$$

where $O$ is the set of the six occupations, $\text{rank}_{true}^{o}$ is the true rank of occupation $o$ and the stated ranks $\text{rank}_i^{o}$ of each occupation are computed the same way as the rank of a nurse's wage. Additionally, I can construct indicators that show whether the wage rank of a nurse is overestimated, correctly estimated or underestimated

$$\text{over}_i = \mathbb{1}\left(\text{rank}_i^{nurse} > 3\right), \tag{15}$$

$$\text{correct}_i = \mathbb{1}\left(\text{rank}_i^{nurse} = 3\right), \tag{16}$$

$$\text{under}_i = \mathbb{1}\left(\text{rank}_i^{nurse} < 3\right). \tag{17}$$

Besides information on wage beliefs, there are other potentially important factors available that may drive young people into or out of nursing (see Wohlkinger et al., 2011). This enables me to assess the importance of the wage belief by comparing the effect with other effects estimated in the literature. A large share of recent work analyzes the effect of working conditions. Hence, I look at the effect of desired working conditions using the survey instrument provided by MOW International Research Team (1987). Since the literature finds that those who become nurses rate economic factors as less important than those who choose another profession, I use a measure of the importance of economic factors (i.e., risk of unemployment and financial aspects) in choosing a career. In addition, I estimate a model that uses an indicator of the desired comfort of the job (i.e., physical working conditions and good working hours). Finally, helping others is considered to be one of the main driving forces i