Estimating the population variance, standard deviation, and coefficient of variation: Sample size and accuracy

Small sample sizes are not uncommon in human and nonhuman primate evolutionary research. Such samples are often relied upon to estimate population parameters such as the mean, variance, and standard deviation. Accurate estimates of population parameters are essential for interpreting species representation in fossil assemblages, and making meaningful inferences regarding genetic, morphological, behavioral, biomechanical, hormonal, ecological, and isotopic variation, both within and among fossil and extant human and nonhuman primate populations, lineages, and taxa. Here, we refer to accuracy as closeness of the sample estimate to the true population parameter. To date, surprisingly little research has been conducted within biological anthropology on the accuracy of sample estimates of population parameters based on small sample sizes, and there have been few attempts to apply established relationships between accuracy and small sample sizes. Instead, biological anthropologists have typically relied on conventional 95% confidence intervals constructed post hoc from the sample estimate. Previously, research has shown that, assuming a random sample, the population mean can be estimated confidently with sample sizes of fewer than 10 observations (Schillaci and Schillaci, 2009). The accuracy of estimating the population variance and standard deviation with small sample sizes, however, has not been similarly evaluated.

Metric variation is often an important component of what is analyzed quantitatively in studies of taxonomy and evolutionary process. Metric variation in phenotypic characters such as tooth or bone dimensions has been used for identifying how many species comprise a fossil assemblage, and species identification has in the past relied on the sample standard deviation (SD), which is the square root of the sample variance ($s^2$), to calculate the coefficient of variation (cv) as a measure of the relative amount of phenotypic variation ($cv = SD/\bar{x}$). As discussed in detail by Plavcan and Cope (2001: 210–215), the coefficient of variation has been a popular measure of relative variation in paleontology. Because the cv calculated from small sample sizes can be biased to underestimate relative variation, a bias adjustment is often applied (i.e., $cv = (1 + 1/(4n)) \times SD/\bar{x}$; Plavcan and Cope, 2001). Based on an extant comparative sample, Simpson et al. (1960) proposed a threshold value of cv ≥ 10 for identifying when a fossil assemblage comprises multiple species. Although this threshold has not been commonly used in more recent research, sample estimates of the cv from extant and fossil samples are still sometimes used to identify when multiple species may be present in a given assemblage, fossil or extant (e.g., Cuozzo et al., 2013). Accurate sample estimates of SD, and by extension the cv, are therefore needed to infer reliably whether an assemblage comprises one or multiple species.
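The two cv formulas above can be illustrated with a minimal Python sketch. The function names and the example measurements are hypothetical and are not taken from the study. The cv is returned here as a ratio; a threshold such as cv ≥ 10 presumably refers to the cv expressed as a percentage, so the ratio would need to be multiplied by 100 before such a comparison.

```python
import statistics


def coefficient_of_variation(values):
    """Sample cv as a ratio: sample SD (n - 1 denominator) over the sample mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("cv is undefined when the sample mean is zero")
    return statistics.stdev(values) / mean


def bias_adjusted_cv(values):
    """Small-sample adjustment: (1 + 1/(4n)) * SD / mean."""
    n = len(values)
    return (1 + 1 / (4 * n)) * coefficient_of_variation(values)


# Hypothetical tooth-dimension measurements (mm), for illustration only.
measurements = [9.0, 10.0, 11.0]
print(coefficient_of_variation(measurements))
print(bias_adjusted_cv(measurements))
```

Because the adjustment factor is 1 + 1/(4n), it matters most for very small samples and approaches 1 as n grows.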

Quantifying phenotypic variation is also important in quantitative genetic research on microevolutionary process and population structure. It is well recognized that the additive genetic variance is reflected in phenotypic variance, making it possible to study evolutionary processes in polygenic phenotypic traits (Williams-Blangero and Blangero, 1990). The univariate sample variance ($s^2$) is typically used to estimate the population variance ($\sigma^2$) of a phenotypic trait, which serves as a proxy estimate of the population additive genetic variance for that trait. The univariate population variance for a single polygenic phenotypic trait ($\sigma_P^2$) comprises additive genetic ($\sigma_G^2$) and environmental ($\sigma_E^2$) variance components ($\sigma_P^2 = \sigma_G^2 + \sigma_E^2$; Williams-Blangero and Blangero, 1989). The heritability of a trait, which is the degree of phenotypic resemblance between parents and offspring, is defined as either the ratio of the additive genetic variance to the phenotypic variance (narrow-sense heritability: $h^2 = \sigma_G^2/\sigma_P^2$), or the ratio of the total genetic variance ($\sigma_T^2$) to phenotypic variance (total heritability: $H = \sigma_T^2/\sigma_P^2$; Lande, 1976, 1979). For most studies that use the phenotypic variance as a proxy measure of the additive genetic variance, the sample variance $s^2$ serves as the best estimate of the population parameter $\sigma_P^2$. Accurate sample estimates of the population phenotypic variance, therefore, are needed to estimate both $\sigma_P^2$ and heritability, as well as the amount of heritable variation ($h^2 \times \sigma_P^2$), subject to natural selection (e.g., Lande, 1976).

Multivariate extensions of the univariate examples presented above rely on the phenotypic sample variance-covariance matrix P to estimate the additive genetic variance-covariance matrix G, with G being equal to the difference of the phenotypic and environmental variance-covariance matrices (G = P − E; P = G + E; Lande, 1979; Williams-Blangero and Blangero, 1990), or the additive genetic variance-covariance matrix G being equal to P after adjusting for trait heritability (i.e., $G = h^2 \times P$; Cheverud, 1988), assuming no or equivalent environmental effects (Lande, 1979). The diagonal of matrix P comprises sample estimates of the population variance for each phenotypic trait, whereas the off-diagonal elements of P are the pair-wise sample estimates of the covariances among traits. Accurate estimates of P, therefore, depend in part on sample estimates of the population variance. Given the importance of phenotypic variance in quantitative genetic studies of microevolutionary process, accurate sample estimates of the population variance are therefore needed. The accuracy of such estimates is dependent on sample size, among other factors relating to sampling bias. Although sampling effort and the accuracy of phenotypic–genetic correlations and covariances have received attention in the literature (e.g., Cheverud, 1988; Roseman et al., 2010; Grabowski and Porto, 2017), sample size and accuracy in univariate estimates of variance, e.g., the diagonal elements of P, have not.

In the following report, we use a conclusion from Cochran's theorem to determine the theoretical probability (confidence) that the sample variance and standard deviation are accurate estimates of the population parameters, which are unknown, based on a range of sample sizes. To demonstrate the method, we choose an arbitrary probability of 95% as a threshold for acceptable confidence. This threshold is analogous to the commonly used 95% confidence interval, with the difference being that the conventional 95% confidence interval is constructed post hoc from the sample estimate, whereas our determination of confidence is made a priori based on sample size and a specified level of accuracy. We note that the conventional 95% confidence interval commonly used in statistics is also arbitrary. We choose an arbitrary threshold for accuracy of ±20% of the population parameter. As a percentage range (±%) from the true population parameter, this threshold for accuracy can also be viewed in terms of inaccuracy, where $\text{inaccuracy} = \text{imprecision} + \text{bias}^2$ (Grabowski and Porto, 2017). So, as accuracy increases, the sum of imprecision and the square of bias decreases. Although ±20% could be considered to represent an upper limit for reasonable (in)accuracy, to our knowledge there are no established or conventional standards within biological anthropology, or even within biology more generally, for acceptable accuracy. We stress that these are merely arbitrarily chosen levels of confidence and accuracy that we use to demonstrate our method; the analyst could choose any values of accuracy and confidence that are meaningful to their study.
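A standard consequence of Cochran's theorem is that, for a random sample of size n from a normally distributed population, the quantity (n − 1)s²/σ² follows a chi-square distribution with n − 1 degrees of freedom. Assuming this is the result relied upon here, the probability that the sample variance lies within ±20% of the population variance can be sketched in Python as below. The function names are hypothetical, normality is an assumption of the sketch, and the chi-square distribution function is computed with a simple series expansion from the standard library that is intended only for modest sample sizes.

```python
import math


def regularized_lower_gamma(a, x, max_terms=100_000, eps=1e-15):
    """Regularized lower incomplete gamma function P(a, x), by power series."""
    if x <= 0:
        return 0.0
    term = 1.0 / a
    total = term
    denom = a
    for _ in range(max_terms):
        denom += 1.0
        term *= x / denom
        total += term
        if abs(term) < abs(total) * eps:
            break
    value = total * math.exp(-x + a * math.log(x) - math.lgamma(a))
    return min(1.0, value)


def chi2_cdf(x, df):
    """Chi-square cumulative distribution function with df degrees of freedom."""
    return regularized_lower_gamma(df / 2.0, x / 2.0)


def prob_variance_within(n, accuracy=0.20):
    """P((1 - accuracy) <= s^2 / sigma^2 <= (1 + accuracy)), assuming normality."""
    df = n - 1
    return chi2_cdf(df * (1 + accuracy), df) - chi2_cdf(df * (1 - accuracy), df)


def prob_sd_within(n, accuracy=0.20):
    """Same for the SD: the bounds on s / sigma are squared to bound s^2 / sigma^2."""
    df = n - 1
    upper = chi2_cdf(df * (1 + accuracy) ** 2, df)
    lower = chi2_cdf(df * (1 - accuracy) ** 2, df)
    return upper - lower


def smallest_n(prob_fn, accuracy=0.20, confidence=0.95, n_max=5000):
    """First sample size whose probability reaches the chosen confidence."""
    for n in range(2, n_max + 1):
        if prob_fn(n, accuracy) >= confidence:
            return n
    return None


for n in (5, 10, 30, 100):
    print(n, round(prob_variance_within(n), 3), round(prob_sd_within(n), 3))
print(smallest_n(prob_sd_within), smallest_n(prob_variance_within))
```

For a given n and accuracy, the SD always has the higher probability, because ±20% on the SD scale corresponds to the wider interval from 0.64 to 1.44 on the variance scale.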

We use random resampling with replacement of a single craniometric variable from a commonly used large data set comprising modern human population samples from around the world to validate our methods for estimating the probability that a sample estimate is within a specified fraction of the population parameter. We extend this method of validation to include the cv. In the following report, we are not concerned with effect size, formal comparisons of sample estimates, or test power. Instead, we are concerned only with assessing how well small sample estimates of univariate variation approximate the true population parameter.
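The resampling check described above can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the "population" is a synthetic, normally distributed stand-in for the craniometric variable, the function names are hypothetical, and the full data set is treated as the population (so its parameters use the n denominator) while each resample uses the usual n − 1 sample estimators.

```python
import random
import statistics


def sample_cv(values):
    """Sample cv as a ratio (sample SD over sample mean)."""
    return statistics.stdev(values) / statistics.fmean(values)


def population_cv(values):
    """cv of the full data set, treated as the population."""
    return statistics.pstdev(values) / statistics.fmean(values)


def proportion_within(population, n, sample_stat, true_value,
                      accuracy=0.20, n_resamples=2000, seed=0):
    """Share of resamples (with replacement) whose estimate falls within
    +/- accuracy of the population value; assumes true_value > 0."""
    rng = random.Random(seed)
    lower = true_value * (1 - accuracy)
    upper = true_value * (1 + accuracy)
    hits = 0
    for _ in range(n_resamples):
        sample = rng.choices(population, k=n)  # draw n values with replacement
        if lower <= sample_stat(sample) <= upper:
            hits += 1
    return hits / n_resamples


# Synthetic stand-in for one craniometric variable (mm); not real data.
data_rng = random.Random(42)
population = [data_rng.gauss(180.0, 7.0) for _ in range(5000)]

true_var = statistics.pvariance(population)
true_cv = population_cv(population)
for n in (5, 10, 50):
    p_var = proportion_within(population, n, statistics.variance, true_var)
    p_cv = proportion_within(population, n, sample_cv, true_cv)
    print(n, p_var, p_cv)
```

The empirical proportions can then be compared with the theoretical probabilities for the same sample size and accuracy.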
