Comparing effects of Euclidean buffers and network buffers on associations between built environment and transport walking: the Multi-Ethnic Study of Atherosclerosis

Below, we first describe the empirical dataset and the empirical analysis. Next, we describe how the empirical dataset was used to derive the simulated dataset, and we describe the simulation analysis.

Empirical study

Multi-Ethnic Study of Atherosclerosis

Transport walking data and personal sociodemographic information came from the Multi-Ethnic Study of Atherosclerosis (MESA), in which 6814 adults aged 45–84 years participated between July 2000 and August 2002 at six study sites in the U.S. (Los Angeles, California; Chicago, Illinois; Saint Paul, Minnesota; New York City, New York; Baltimore, Maryland; Forsyth County, North Carolina) [3]. MESA was approved by the institutional review boards at all participating institutions, and all participants gave written informed consent. Although MESA is a longitudinal study, we focused on the data at Exam 1 (between July 2000 and August 2002) because we were interested in how estimated associations between built environment exposures and transport walking differ when exposure metrics are based on Euclidean buffers rather than network buffers; that is, the longitudinal aspect of MESA did not provide additional information on the question of interest in this study.

Outcome: transport walking minutes per week

The survey asked participants whether they had engaged in walking activities (walking to get to places, e.g., to the bus, car, work, or stores) during a typical week in the past month. If yes, they were asked to report how many days per week and how much time per day they spent walking. The outcome variable, transport walking minutes per week, was computed by multiplying the number of days of transport walking per week by the number of minutes of transport walking per day. We log-transformed the outcome variable because it was right-skewed.
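The outcome construction above can be sketched as follows. This is a minimal illustration, not the study's code: the function names are hypothetical, and the use of log(1 + x) is an assumption made here so that a report of zero minutes stays defined; the text does not state how zeros were handled.

```python
import math


def transport_walking_minutes(days_per_week, minutes_per_day):
    """Weekly transport walking minutes = days per week x minutes per day."""
    return days_per_week * minutes_per_day


def log_outcome(minutes_per_week):
    """Log-transform the right-skewed outcome.

    Assumption: log(1 + x) is used so that zero minutes maps to zero;
    the source does not say which variant of the log transform was applied.
    """
    return math.log1p(minutes_per_week)


# Hypothetical participant: walks for transport 5 days a week, 30 minutes a day.
weekly = transport_walking_minutes(days_per_week=5, minutes_per_day=30)
print(weekly)  # 150
print(log_outcome(weekly))
```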

Exposure: built environment

Built environment data came from the National Establishment Time Series (NETS) database, obtained through the MESA Neighborhoods study and Retail Environments for Cardiovascular Disease (RECVD) project (https://sites.google.com/view/recvd-team-project-site/home). Population density came from the 2000 U.S. Census (https://www.census.gov/). Road network data were obtained from ESRI Business Analyst 2016.

Built environment features included a broad category of walkable destinations, consisting of common and popular destinations for daily life (e.g., food stores, restaurants, drug stores and pharmacies, department stores, post offices, banks/credit unions, libraries, beauty shops and barbers, social/entertainment destinations, museums, schools). We also examined one subset of this broad category, frequent social destinations, to capture variation in density across different types of built environment destinations. Frequent social destinations consisted of destinations that facilitated social interaction and promoted social engagement (e.g., beauty shop/barber, libraries, non-physical activity recreation clubs, religion). The detailed list of all the destinations is shown in Additional file 1: Table S1 (ST1). The exposure was defined as the number of destinations within the corresponding spatial context (either network or Euclidean buffer).

Table 1 Descriptive statistics for the MESA sample

Covariates

Covariates were selected based on prior MESA studies that examined associations between transport walking and built environment features [13, 19]. Person-level covariates included age, sex, race/ethnicity, education, income-wealth index, employment status, household car ownership, body mass index (BMI), self-rated health compared with others of the same age, and arthritis flare-up in the past 2 weeks. The income-wealth index was specified as a 9-point scale (0 being the lowest level of income and no assets and 8 being the highest level of income and all 4 assets). Details about the index are shown in the note of Table 1, and the index was described previously in depth [12]. Area-level covariates included population density in a 1-mile Euclidean buffer, street network ratio in a 1-mile Euclidean buffer, and region (from census categories: Northeast, Midwest, South, West). To calculate population density in a 1-mile Euclidean buffer around each residence, we first used the 'intersect' geoprocessing tool in ArcGIS [GIS software] (Version 10.5. Redlands, CA: Environmental Systems Research Institute, Inc., 2016) to estimate the area-weighted population for block groups or pieces of block groups within a 1-mile Euclidean buffer of each participant; we then divided this total population by the area of the buffer. Street network ratio in a 1-mile buffer was calculated as the ratio of the area of a 1-mile network buffer to the area of a 1-mile Euclidean buffer around each participant’s residence. The ratio varies between 0 and 1, with 0 meaning that none of the area can be reached through the street network and 1 meaning that the entire area can be reached through the street network (i.e., the highest level of network ratio).
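The two area-level calculations described above can be illustrated with plain arithmetic. The sketch below is not the ArcGIS workflow itself: it assumes the intersect step has already produced, for each block group piece, the block group population, the block group's total area, and the area falling inside the buffer. All function names and input values are hypothetical.

```python
import math


def area_weighted_population(pieces):
    """Sum each block group's population scaled by the share of its area in the buffer.

    pieces: iterable of (population, block_group_area, area_inside_buffer),
    with both areas in the same units (square miles here).
    """
    return sum(pop * inside / area for pop, area, inside in pieces)


def population_density(pieces, buffer_area):
    """Area-weighted population divided by the area of the buffer."""
    return area_weighted_population(pieces) / buffer_area


def street_network_ratio(network_buffer_area, euclidean_buffer_area):
    """Area of the network buffer relative to the Euclidean buffer (between 0 and 1)."""
    return network_buffer_area / euclidean_buffer_area


# Hypothetical inputs for one participant with a 1-mile Euclidean buffer.
euclidean_area = math.pi * 1.0 ** 2  # square miles
pieces = [(1200, 0.8, 0.4), (900, 1.5, 1.5), (2000, 2.0, 0.5)]

print(round(area_weighted_population(pieces)))
print(population_density(pieces, euclidean_area))
print(street_network_ratio(1.9, euclidean_area))  # hypothetical 1.9 sq mi network buffer
```

In this toy case half of the first block group and a quarter of the third fall inside the buffer, so they contribute half and a quarter of their populations, respectively.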

Sample inclusion

There were 6191 MESA participants who agreed to participate in the MESA Neighborhood Study at Exam 1. We retained 5839 participants who had historical addresses and had geocoded addresses with accuracy at street and ZIP + 4 level. We excluded 15 participants who did not report transport walking minutes. We also removed 68 participants who did not have complete sociodemographic variables. The final analytical sample consisted of 5756 participants.

Simulation study

Overview

Simulations were used to provide systematic evidence regarding how the two buffer methods (true exposure in network buffers and exposure in Euclidean buffers) perform on average. Because researchers have full control of how the data are simulated, the correct/true answer is known by design [1]. Thus, our simulations assessed which of the two methods came closest to recovering the ‘correct’ answer.

To add realism to the simulated dataset, simulations were based on the locations of MESA study participants and the spatial locations of amenities near participants. The outcome, transport walking minutes per week, was generated according to a linear regression model (described below). To generate the outcome, we used the observed built environment exposures in network buffers as the true exposure, because buffers delineated through the street network may be assumed to represent access by walking or other active travel more precisely [22].

Simulation design

We simulated transport walking minutes per week under a variety of settings in which the outcome was dependent upon spatial accessibility and a binary covariate. Simulations were designed to examine bias in built environment-outcome association estimates resulting from using Euclidean buffer counts as the observed predictor (denoted \(X^{E}\)), but outcomes were generated using network buffer counts as the true predictor (denoted \(X^{N}\)). That is, for a subject i, we generated data from a model: \(Y_{i}=\alpha +\gamma Z_{i}+\beta X_{i}^{N}+\varepsilon_{i}\), and sought to determine the degree of bias in estimates of the \(\beta\) coefficient when \(X_{i}^{E}\) is used instead of \(X_{i}^{N}\). To obtain generalized understanding of patterns in the degree of bias, we used 72 simulation settings, which arose from the combinations of: three spatial scales (0.25 km, 1 km, 5 km), two types of built environment features (BEF), two effect sizes (smaller/larger), and six geographic contexts (the six MESA sites).

We chose 0.25 km, 1 km, and 5 km to represent a small, a medium, and a large spatial scale, respectively. These distances were selected because each aligns with prior work in this field or is otherwise justified, as follows. We employed spatial-temporal aggregated predictor (STAP) modeling [24] to detect how associations between walkable destinations and transport walking varied across distances, and found that associations were negligible for distances larger than 0.25 km, which was in line with findings in a previous MESA study that a smaller spatial scale of 0.2 km had stronger effects than larger ones [25]. Prior work among adults has widely used 1 km (equivalent to a 10–15 min walking distance) to represent the size of a residential neighborhood [8, 11, 20, 23], and 5 km (approximately 3.11 miles) represents the maximum distance because most US residents are unwilling to walk for transport farther than this [31].

We examined two classes of overlapping BEFs: Walking Destinations (WD) and Frequent Social Destinations (FSD, a subset of the former) to ensure differences in results from a dense (WD) vs. less dense (FSD) BEF were not due to differences in the spatial distribution of the features. We used a smaller (0.05) and larger (0.1) built environment effect, \(\beta\), to examine if biases depended on the magnitude of the effect size. For each of the 72 simulation settings, we simulated 5000 datasets.
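The count of 72 settings follows from crossing the four design factors (3 scales × 2 feature types × 2 effect sizes × 6 sites). The short sketch below enumerates that grid; the variable names and dictionary keys are hypothetical labels chosen for illustration, not identifiers from the study.

```python
from itertools import product

SCALES_KM = (0.25, 1, 5)          # small, medium, large buffers
FEATURES = ("WD", "FSD")          # walking destinations, frequent social destinations
EFFECT_SIZES = (0.05, 0.1)        # smaller and larger beta
SITES = (
    "Los Angeles", "Chicago", "Saint Paul",
    "New York City", "Baltimore", "Forsyth County",
)
N_DATASETS_PER_SETTING = 5000

# One dictionary per simulation setting (full factorial crossing).
settings = [
    {"scale_km": scale, "feature": feature, "beta": beta, "site": site}
    for scale, feature, beta, site in product(SCALES_KM, FEATURES, EFFECT_SIZES, SITES)
]

print(len(settings))                           # 72
print(len(settings) * N_DATASETS_PER_SETTING)  # 360000
```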

Analysis of simulated data

For each simulated dataset, we employed the count of built environment destinations in the network or Euclidean buffers as predictors in separate models, and estimated their association with the generated outcome. For example, to estimate the association between WD in 1 km network buffers and the outcome, we fitted \(E\left[Y_{i}\right]=\alpha +\gamma Z_{i}+\beta_{N}X_{i}^{N}\). We then fitted the corresponding model using the WD counts within 1 km Euclidean buffers, \(E\left[Y_{i}\right]=\alpha +\gamma Z_{i}+\beta_{E}X_{i}^{E}\). We evaluated the performance of the two buffer metrics in estimating associations by comparing estimates to the true value and averaging across simulations, \(\frac{1}{S}\sum_{s=1}^{S}(\hat{\beta}_{s}-\beta )\), as well as by comparing the differences in associations, namely \(\frac{1}{S}\sum_{s=1}^{S}(\hat{\beta}_{E,s}-\hat{\beta}_{N,s})\), where \(s\) indexes the \(S=5000\) simulated datasets in a given setting.

We obtained an estimate of the bias in the coefficient estimates by comparing the estimate obtained in each dataset to the true value and averaging across the 5000 simulated datasets for a given scenario. We standardized the bias as a percentage by dividing the average bias by the true coefficient, and plotted the percent bias to visualize the results. For each simulation scenario, we also estimated the power to detect significant associations as the percentage of the 5000 datasets in which the confidence interval for β did not contain zero.
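The bias, percent bias, and power calculations can be illustrated with a small self-contained simulation. This is a sketch under stated assumptions, not the study's code: the exposures are toy random counts rather than MESA data, the Euclidean count is assumed to equal the network count plus some extra destinations, the intercept, covariate coefficient, and error variance are arbitrary, the confidence interval is a 95% normal approximation (the text does not state the level), and only 200 datasets are simulated to keep the run short. All names are hypothetical.

```python
import math
import random


def fit_slope(y, x, z):
    """OLS slope and standard error for x in the model y ~ intercept + z + x.

    With an intercept and a binary z, centering y and x within each z group
    (Frisch-Waugh) yields the same slope as fitting the full model.
    """
    def center(values):
        means = {}
        for group in (0, 1):
            members = [v for v, zi in zip(values, z) if zi == group]
            means[group] = sum(members) / len(members)
        return [v - means[zi] for v, zi in zip(values, z)]

    xc, yc = center(x), center(y)
    sxx = sum(a * a for a in xc)
    slope = sum(a * b for a, b in zip(xc, yc)) / sxx
    rss = sum((b - slope * a) ** 2 for a, b in zip(xc, yc))
    se = math.sqrt(rss / (len(y) - 3) / sxx)  # 3 estimated coefficients
    return slope, se


def simulate(beta=0.1, alpha=1.0, gamma=0.5, n=500, n_sims=200, seed=1):
    rng = random.Random(seed)
    z = [i % 2 for i in range(n)]                          # binary covariate
    x_net = [rng.randint(0, 20) for _ in range(n)]         # toy "true" network counts
    x_euc = [xn + rng.randint(0, 10) for xn in x_net]      # toy Euclidean counts
    exposures = {"network": x_net, "euclidean": x_euc}
    estimates = {name: [] for name in exposures}
    detections = {name: 0 for name in exposures}
    for _ in range(n_sims):
        # Outcomes are always generated from the network (true) exposure.
        y = [alpha + gamma * zi + beta * xn + rng.gauss(0, 1)
             for zi, xn in zip(z, x_net)]
        for name, x in exposures.items():
            slope, se = fit_slope(y, x, z)
            estimates[name].append(slope)
            if abs(slope) > 1.96 * se:  # approximate 95% CI excludes zero
                detections[name] += 1
    summary = {}
    for name, values in estimates.items():
        bias = sum(values) / n_sims - beta
        summary[name] = {
            "bias": bias,
            "percent_bias": 100 * bias / beta,
            "power": detections[name] / n_sims,
        }
    summary["mean_difference"] = sum(
        e - m for e, m in zip(estimates["euclidean"], estimates["network"])
    ) / n_sims
    return summary


result = simulate()
for key, value in result.items():
    print(key, value)
```

In this toy setup the Euclidean count behaves like the true exposure plus noise, so its coefficient is attenuated toward zero, while the network-based estimate is approximately unbiased. The direction and size of bias in the actual study depend on the real spatial data.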

Empirical analysis of MESA data

We calculated descriptive statistics for all MESA study variables. We focused on quantifying the extent of differences in the built environment exposures when they were measured using Euclidean or network buffers of varying sizes (0.25 km, 1 km, and 5 km) among MESA participants. Differences in exposure metrics were summarized using the median, the first quartile, and the third quartile. We estimated associations between transport walking and built environment exposures assessed using network buffers and Euclidean buffers. Although we used three spatial scales for the buffers (0.25 km, 1 km, and 5 km), we ultimately focused on the 0.25 km buffer models because the health effects of walkable destinations beyond 0.25 km (in network distance) were weak.
