The skin is no barrier to mixtures: Air pollutant mixtures and reported psoriasis or eczema in the Personalized Environment and Genes Study (PEGS)

Cohort description

The Personalized Environment and Genes Study (PEGS) is a diverse cohort with extensive health and exposure data at the National Institute of Environmental Health Sciences (NIEHS) (https://www.niehs.nih.gov/research/clinical/studies/pegs/index.cfm). Most participants live in North Carolina (NC), with the remainder scattered across the contiguous United States.

The cohort began as a research registry in 2002 with recruitment from university campuses, health clinics and fairs, and volunteer study drives. From 2013 to 2020, subjects were administered the PEGS Health and Exposure Survey and two NIEHS PEGS Exposome Surveys (A and B).

The PEGS Health and Exposure Survey was based on validated surveys such as the National Health Interview Survey and the National Health and Nutrition Examination Survey (NHANES) [20]. The Health and Exposure Survey and interactive tools for inspecting summary data can be found at https://www.niehs.nih.gov/research/clinical/studies/pegs/index.cfm.

The Health and Exposure Survey asks participants whether they have had a physician diagnosis of any autoimmune disease, including multiple sclerosis, hyperthyroidism, hypothyroidism, celiac disease, Crohn’s disease, ulcerative colitis, scleroderma, lupus, Sjögren’s syndrome, Raynaud’s phenomenon, pernicious anemia, myositis, rheumatoid arthritis, unspecified arthritis, psoriasis, and eczema. Subjects who responded affirmatively were considered to have autoimmune disease.

Table 1 describes detailed characteristics of the PEGS cohort subjects.

Table 1 Characteristics of Cohort.

Our analysis included 9060 subjects with complete addresses. They were mostly middle-aged, with a mean (SD) age of 42 (15.8) years, and their racial backgrounds reflected the demographics of North Carolina residents [21]. The cohort is mostly female (67%) and relatively high-income. Almost 40% of subjects report smoking. Autoimmune diseases are relatively common, with 35% reporting at least one of the above diseases. The prevalence of psoriasis and eczema is 4.2% and 9.8%, respectively; together they represent 1128 subjects (12.5%). Our outcome definition is a diagnosis of psoriasis and/or eczema.

Air pollutant exposure estimation

We used two sources of existing air pollutant data: one from the Center for Air, Climate, and Energy Solutions (CACES; https://www.caces.us/) and the other from the Atmospheric Composition Analysis Group (ACAG; http://fizz.phys.dal.ca/~atmos/martin/?page_id=140). CACES derives annual mean concentrations of the six criteria air pollutants by census tract in the contiguous United States from monitoring sites and approximately 350 geographic covariates in an integrated empirical geographic regression model [22]. The criteria air pollutants (NO2, SO2, O3, PM2.5, PM10, and CO) are monitored by the U.S. Environmental Protection Agency because of their known detrimental effects on health and are managed nationwide to comply with air quality standards. Because the CACES data are reported by census tract, we linked each participant address to the nearest population-weighted census tract centroid [23].
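The nearest-centroid linkage can be illustrated with a minimal Python sketch. This is not the authors' code (their workflow was in R); the tract identifiers, coordinates, and PM2.5 values are made up for illustration, and great-circle (haversine) distance is an assumed choice of metric.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_tract(address, centroids):
    """Return the id of the tract whose centroid is closest to the address.

    address: (lat, lon); centroids: dict mapping tract id -> (lat, lon).
    """
    lat, lon = address
    return min(centroids, key=lambda t: haversine_km(lat, lon, *centroids[t]))


# Hypothetical population-weighted centroids and tract-level estimates.
centroids = {
    "tract_A": (35.78, -78.64),
    "tract_B": (35.99, -78.90),
    "tract_C": (36.07, -79.79),
}
pm25_by_tract = {"tract_A": 9.1, "tract_B": 8.7, "tract_C": 8.9}

home = (35.91, -79.05)  # hypothetical participant address
tract = nearest_tract(home, centroids)
exposure = pm25_by_tract[tract]  # tract-level estimate assigned to the participant
```

A linear scan is fine for a handful of tracts; for a national tract file a spatial index would be the more practical choice.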

The ACAG estimated annual mean fine particulate matter (PM2.5) mass and compositional mass (total mass, µg/m3) across North America using aerosol optical depth (AOD) estimates and the GEOS-Chem chemical transport model. PM2.5 components include ammonium (NH4), sulfate (SO4), black carbon, nitrate, organic matter, soil, and sea salt. The estimates were calibrated to ground level using geographically weighted regression (GWR) [24]. The ACAG values were matched to participant addresses using the extract function of the raster R package [25].
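Extracting a gridded value at a point amounts to finding the cell that contains the point. The Python sketch below shows that lookup for a regular latitude/longitude grid; it is an illustration of the idea, not the raster package's implementation, and the grid origin, cell size, and values are hypothetical.

```python
import numpy as np


def extract_cell_value(grid, lat_top, lon_left, cell_deg, lat, lon):
    """Return the value of the grid cell containing (lat, lon), or NaN if outside.

    grid[0, 0] is the north-west cell; rows run north to south and
    columns run west to east, each cell spanning cell_deg degrees.
    """
    row = int(np.floor((lat_top - lat) / cell_deg))
    col = int(np.floor((lon - lon_left) / cell_deg))
    if not (0 <= row < grid.shape[0] and 0 <= col < grid.shape[1]):
        return float("nan")
    return float(grid[row, col])


# Hypothetical 3 x 4 grid with its north-west corner at (36.0 N, 80.0 W).
grid = np.arange(12, dtype=float).reshape(3, 4)
inside = extract_cell_value(grid, 36.0, -80.0, 0.1, lat=35.85, lon=-79.75)
outside = extract_cell_value(grid, 36.0, -80.0, 0.1, lat=37.00, lon=-79.75)
```

Points that fall outside the grid return NaN rather than raising, so missing exposures can be handled in a later step.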

Because our subjects reported only one address, with no information on the timeframe of residency and no temporal linkage to our outcome of interest, we used the mean value across the years 2000–2015 for both the CACES and the ACAG data (see Fig. 1). Averaging over this period gave a reasonable estimate of typical air pollutant exposure following the establishment of the PEGS cohort registry. While both the CACES and ACAG models use a wide range of variable inputs, their output is generalized to a mid-range spatial resolution (census tract and a 0.1° latitude/longitude grid).

Fig. 1: Maps of pollution intensity for each modeled constituent of the mixture.

For comparability, the pollutants have been normalized, and eight quantiles were determined for each pollutant. Quantile 4 represents the mean exposure in the cohort. Each dot represents an approximate home address for subjects living in and around North Carolina, although there are subjects living across the United States. Please see Table 2 for ranges of the original exposure values. Maps were developed using the leaflet package in R.

To better represent pollutant exposure at the individual level, we included two more exposure estimates: the density of major roads from the United States Geological Survey (USGS; https://pubs.usgs.gov/dds/dds-059/export/metadata/akrds2mg.htm) and the concentration of volatile organic compounds (VOCs) from the Toxic Release Inventory (TRI; https://www.epa.gov/toxics-release-inventory-tri-program). The VOCs included were benzene, ethylbenzene, toluene, and xylene (BTEX), which are commonly monitored constituents of petroleum products and representative of a diversity of VOC sources [26, 27]. The density of major roads was defined as the total length of roadways in a 5 km buffer around the participant address, computed using the sf package in R [23]. For the same 5 km buffer, we estimated the sum of the exponentially decayed mass (lb) from each TRI site based on the distance from the site to the participant’s address, see Eq. (1). The initial mass value is the mean annual concentration at a site from 2000–2016.

$$X_i^k = \sum_{j = 1}^{n_k} C_{0j}\exp\left( - \frac{d_{ij}}{a_r} \right)$$

(1)

where \(X_i^k\) is variable X for location i and source type k, \(C_{0j}\) is the initial concentration at source j, assumed to be the annual air release in kg as reported by the TRI, \(d_{ij}\) is the Euclidean distance between location i and source j, \(a_r\) is the exponential decay range, and \(n_k\) is the number of sources of type k [28].
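As a worked illustration of the sum of exponentially decayed sources, the Python sketch below adds up the contributions of all sources within a buffer of an address. The functional form exp(-d/a_r), the decay range of 5 km, and all coordinates and release values are assumptions chosen for the example; the text does not state the decay range used.

```python
import math


def decayed_source_sum(address_xy, sources, decay_range_km, buffer_km=5.0):
    """Sum c0 * exp(-d / decay_range_km) over sources within buffer_km of the address.

    address_xy: (x, y) in km in a projected coordinate system.
    sources: iterable of (x_km, y_km, c0), where c0 is the initial value at the source.
    """
    x0, y0 = address_xy
    total = 0.0
    for x, y, c0 in sources:
        d = math.hypot(x - x0, y - y0)  # Euclidean distance in km
        if d <= buffer_km:              # sources beyond the buffer contribute nothing
            total += c0 * math.exp(-d / decay_range_km)
    return total


# Hypothetical sources: one at the address, one 5 km away, one 6 km away.
sources = [(0.0, 0.0, 100.0), (3.0, 4.0, 50.0), (6.0, 0.0, 1000.0)]
x_voc = decayed_source_sum((0.0, 0.0), sources, decay_range_km=5.0)
```

Here the third source is excluded by the 5 km buffer despite its large release, which shows how the buffer and the decay range jointly control which sources dominate the estimate.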

Mean, standard deviations, and ranges for the exposures included can be found in Table 2, while comparative quantiles of pollutants are mapped in Fig. 1.

Table 2 Exposure quantity ranges in cohort with units specified.

We note in Fig. 1 that many pollutants, such as the BTEX chemicals, SO4, and NO2, cluster around the urban corridors in North Carolina. Of further interest, the nitrate and ammonium (NH4) constituents of PM2.5 have spatial patterns that follow the density of concentrated animal feeding operations, where animal waste is highly concentrated [29]. We also note a wide range of exposures for home locations in the cohort.

Covariates

We used a directed acyclic graph (DAG) to represent the hypothesized relationships among covariates, following an approach modeled after a causal inference framework; however, our dataset does not meet all of the assumptions required to infer causal estimates [30]. Nonetheless, the approach allows explicit representation of the factors we believe modify our outcome.

The covariates that we included in the model describe potential pathways of exposure that influence our outcome. Additionally, there are likely unmeasured confounders that impact known exposures included in the model. We included the postulated relationships of exposures and other covariates to the outcome in the DAG [31], see Fig. 2.

Fig. 2: Directed acyclic graph showing hypothesized relationships between air pollution, our included covariates, and our outcome of autoimmune skin diseases.

The green circle with an arrowhead represents our primary mixture exposure; the green circles represent the air pollution exposures; grey circles represent air pollution sources or mitigation factors; the blue circle with an I is our outcome of interest; and the plain blue circles are biological covariates. Pink represents confounders and shows that the minimally sufficient set of covariates is the subject’s age.

The selected exposures describe the influence of population density, which drives traffic intensity, and of industrial activity, which drives the production of VOCs; both in turn drive the quantity of criteria air pollutants. The CACES estimates also account for greenspaces, which have been shown to modify the intensity of air pollutants in neighborhoods [24].

Socioeconomic status is an important determinant of home location, which is used as a proxy for pollutant exposure. Socioeconomic status is also linked with career and therefore with career-related exposures. While the PEGS cohort does have substantial data on career-related exposures, they are numerous and poorly defined for the scope of this analysis. For sensitivity analysis, we included a dichotomous variable indicating whether subjects had incomes of more than $30,000. This cutoff approximated living wage calculations in NC as of 2019 [32] (see Table S2 and https://livingwage.mit.edu/pages/about) and poverty levels across the U.S. for households of 4–5 people [21] (https://www.census.gov/data/tables/time-series/demo/income-poverty/historical-poverty-thresholds.html).

Autoimmune diseases have strong genetic linkages. To reflect this, we included whether subjects had one or more parents with rheumatoid arthritis or reported other autoimmune conditions; inclusion of these variables was also examined in sensitivity analyses (see Table S1). Age and gender were included to adjust for the sample distribution. Age is also a confounder, as it is associated with both the overall amount of exposure to the pollutant mixtures and the probability of diagnosis; it constitutes the minimally sufficient set of covariates needed to elucidate the pathway of interest.

Smoking status and history are associated with autoimmune disease and severity as well as most health conditions. Smoking exposure is also a major contributor to oxidative stress and therefore was adjusted for in this analysis [33]. PEGS subjects were asked about current smoking and smoking history. We included an indicator of whether subjects had reported smoking more than 100 cigarettes in their lifetime to adjust primarily for habitual smokers.

Statistical analysis

To evaluate the joint association of a mixture of air pollutants with autoimmune skin disease, we used quantile g-computation as described by Keil et al. 2020 [19]. The outcome, self-reported diagnosis of psoriasis and/or eczema, is a binary time-fixed variable; hence we estimated the mixture effect within a logistic regression structure.

G-computation approximates potential counterfactual situations by estimating the outcome under all possible values of exposure through resampling the dataset. Under the assumptions of no unmeasured confounders and exchangeability, g-computation fits the actual data to a model and then predicts outcomes for potential counterfactual data [34]. Once the counterfactual dataset is created, a marginal structural model is estimated by regressing on the data including the counterfactuals [34]. The model based on collected data is therefore adjusted for data-estimated counterfactual situations. Quantile g-computation is a mixtures-motivated extension of g-computation that estimates the effect of simultaneously increasing every exposure j by one quantile (q) for a subject i. In Eq. (2), \(X_{ij}^q\) represents the geospatially linked quantity of air pollutant j estimated at subject i’s home address, expressed in quantiles, and \(Z_{ik}\) represents the k other covariates, including gender, age, smoking history, and family history of autoimmune conditions (one or more parents with rheumatoid arthritis), with \(\alpha_k\) the coefficient estimates for these non-exposure covariates.

$$\mathrm{Logit}\left( P\left( Y_i = 1 \mid Z_i, X_i^q \right) \right) = \beta_0 + \sum_{j = 1}^{d} \beta_j X_{ij}^q + \alpha_k Z_{ik} + \varepsilon_i$$

(2)

The final regression estimates the marginal effect of the mixture, ψ, for a one-quantile increase in the combined exposure variables, where \(\psi = \sum_{j = 1}^{d} \beta_j\) [19]. To establish consistent and comparable quantiles, all exposures were first scaled to a zero mean and then divided into 8 quantiles. Octiles are a comprehensible unit and allow adequate variability to be expressed between quantiles without losing substantial information. The test statistics and confidence intervals were then derived from 1000 bootstrap samples. All models were run using the R package qgcomp 2.8.6 [35].
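The core of this procedure can be sketched in a few lines of Python: score each exposure into octiles, fit a logistic regression on the scores plus covariates, and sum the exposure coefficients to obtain ψ. This is a simplified illustration on simulated data, not the qgcomp package's implementation; the simulated effect sizes, the single covariate, and the hand-written Newton-Raphson fit are all assumptions, and the bootstrap step for confidence intervals is omitted.

```python
import numpy as np


def to_quantile_scores(x, q=8):
    """Map a continuous exposure to integer scores 0..q-1 using empirical quantile cut points."""
    cuts = np.quantile(x, np.linspace(0, 1, q + 1)[1:-1])
    return np.searchsorted(cuts, x, side="right")


def fit_logistic(X, y, n_iter=50, tol=1e-10):
    """Maximum-likelihood logistic regression by Newton-Raphson; X must contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (y - p)             # score vector
        hess = X.T @ (X * w[:, None])    # observed information matrix
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def qgcomp_psi(exposures, covariates, y, q=8):
    """Return (psi, betas): the exposure coefficients on quantile scores and their sum."""
    d = exposures.shape[1]
    Xq = np.column_stack([to_quantile_scores(exposures[:, j], q) for j in range(d)])
    design = np.column_stack([np.ones(len(y)), Xq, covariates])
    coef = fit_logistic(design, y)
    betas = coef[1:1 + d]
    return betas.sum(), betas


# Simulated cohort: three exposures and one covariate (all values hypothetical).
rng = np.random.default_rng(0)
n = 8000
exposures = rng.normal(size=(n, 3))
age = rng.normal(size=n)
scores = np.column_stack([to_quantile_scores(exposures[:, j]) for j in range(3)])
true_betas = np.array([0.10, 0.05, 0.0])  # simulated log-odds per octile
logit = -2.0 + scores @ true_betas + 0.3 * age
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

psi, betas = qgcomp_psi(exposures, age[:, None], y)
```

In this simulation the estimated ψ should land near the sum of the simulated coefficients (0.15 on the log-odds scale); exponentiating ψ gives the odds ratio for raising every exposure by one octile at once.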
