From means to meaning in the study of sex/gender differences and similarities

«It is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail».

Maslow, AH. The Psychology of science: A reconnaissance. 1966

«Statistics offers a toolbox of methods, not just a single hammer […] Statistical thinking involves analyzing the problem at hand and then selecting the best tool in the statistical toolbox or even constructing such a tool».

Gigerenzer et al. The null ritual, 2004

Sex and gender (S/G)-related factors contribute to individual variability in physiology and behavior, and S/G-biased prevalence, manifestation, and progression have been reported for many somatic, psychiatric, and neurological diseases and disorders (Altemus et al., 2014, Mauvais-Jarvis et al., 2020, Pinares-Garcia et al., 2018). Therefore, it is currently thought that incorporating S/G-related factors in research and data analysis may be crucial for understanding these diseases, advancing precision medicine, and refining diagnostic and treatment strategies in healthcare (Bartz et al., 2020, Stachenfeld and Mazure, 2022). Accordingly, funding agencies in the EU, US, Canada, and other geographical regions have implemented recommendations and mandates to promote or ensure the incorporation of S/G-related factors in biomedical preclinical and clinical research (White et al., 2021).

However, the effectiveness of these policies in improving current knowledge about the role of S/G-related factors in health and disease will critically depend not only on the number of studies addressing these factors, but also on their quality and methodological soundness (Rich-Edwards et al., 2018, Rich-Edwards and Maney, 2023). Specifically, ensuring rigorous research practices is imperative for drawing reliable conclusions about the exact role and quantitative contribution of S/G-related factors. In this regard, several recent studies (Galea et al., 2020, Garcia-Sifuentes and Maney, 2021, Rechlin et al., 2022) have confirmed a progressive increase in the number of studies including both male and female research subjects, but they have also identified some important methodological deficiencies, such as the omission of sample size, an imbalanced use of males and females, or a failure to test formally for S/G effects. This commentary draws attention to another methodological concern of these studies: the overreliance on mean comparisons using classic parametric tests (e.g., Student’s t-tests and ANOVAs).

The overreliance on mean comparisons is not exclusive to S/G-related studies, but is observed across most domains of the social, behavioral, and biological sciences. For example, recent systematic reviews indicate that t-tests and/or ANOVAs are used in 84.5 % of physiology studies (Weissgerber et al., 2018) and up to 12.8 times more frequently than their nonparametric counterparts in psychological research (Blanca et al., 2018). These classic parametric methods are widely used because they are the methods most frequently taught (Aiken et al., 2008, Cobb, 2007, Kline, 2013), and they are so frequently taught because they are the most commonly used. This is problematic for at least four reasons: 1) These methods are usually taught, learned, and put into practice dogmatically, thereby replacing statistical thinking with an automatized testing strategy that pays little attention to the tests’ assumptions and that frequently misinterprets the tests’ results (Gigerenzer et al., 2004, Hoekstra et al., 2012, Kline, 2013); 2) ANOVAs and t-tests operate under assumptions that are rarely met, exhibiting low power and providing unsatisfactory or misleading results when these assumptions are violated (Rousselet et al., 2017, Wilcox, 1998); 3) Even when their assumptions are met, parametric methods comparing means can be of limited informative value (Rousselet et al., 2017, Wilcox and Keselman, 2003); 4) Statistics has much more to offer researchers than simple average comparisons (Gigerenzer et al., 2004, Wilcox, 2023, Wilcox, 2022), but these newer methods have not been incorporated into the statistics curriculum of most researchers (Cobb, 2007).
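As a small illustration of why the mean can be a fragile summary when distributional assumptions fail, the following sketch compares the mean with two robust alternatives (the median and a 20 % trimmed mean) before and after a single extreme value is introduced. The data are invented for illustration only, and the helper `trimmed_mean` is a hypothetical convenience function written for this example, not part of any cited method or library.

```python
import statistics


def trimmed_mean(values, proportion=0.2):
    """Mean after discarding the given proportion of values from each tail.

    Hypothetical helper written for this illustration.
    """
    ordered = sorted(values)
    cut = int(proportion * len(ordered))  # number of values dropped per tail
    kept = ordered[cut:len(ordered) - cut]
    return statistics.fmean(kept)


def summarize(values):
    """Return the mean, median, and 20 % trimmed mean of a sample."""
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "trimmed_mean_20": trimmed_mean(values, 0.2),
    }


# Invented scores; the second sample differs only in its largest value.
scores = [4.0, 5.0, 5.0, 6.0, 6.0, 6.0, 7.0, 7.0, 8.0, 9.0]
contaminated = scores[:-1] + [90.0]

clean_summary = summarize(scores)
contaminated_summary = summarize(contaminated)

for label, summary in (("clean", clean_summary), ("contaminated", contaminated_summary)):
    print(label, {name: round(value, 2) for name, value in summary.items()})
```

In this toy sample a single outlying observation moves the mean from 6.3 to 14.4, whereas the median and the trimmed mean are unchanged. This is only one of the ways in which a comparison of means can mislead.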

While the overreliance on mean comparisons pervades scientific research, its impact is particularly pronounced in S/G-related studies. This commentary highlights three levels of limitations associated with mean comparisons in this research domain: first, general issues stemming from the assumptions and misinterpretations of classic methods comparing means (Section 2.1, Normality and the mean; Section 2.2, When mean comparisons are meaningless (an example and a brief description of some robust alternatives); Section 2.3.1, Common misinterpretations of p-values in the context of mean comparisons); second, challenges related to the representativeness of means, which are especially pertinent in the case of large, non-randomly-assigned groups such as S/G-related categories (Section 2.3.2); and third, the categorical model imposed by means and mean comparisons, which hinders the goal of incorporating S/G-related factors into the understanding of disorders and diseases and the development of individualized treatments (Section 3.1). In response to these problems and limitations, alternative analytical strategies are also briefly introduced. Particular attention is paid to a statistical method (the shift function; Section 3.2) that allows a non-binary treatment of S/G-related information, even when this information is collected in two categories, and that seems more promising for achieving the goals of S/G-related biobehavioral research.
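To give a concrete sense of what a shift function reports, the following is a minimal sketch that compares two groups decile by decile instead of through a single mean difference. It is a simplified illustration under stated assumptions, not the implementation used in the cited work: the quantile estimator (linear interpolation via `statistics.quantiles`) and the percentile bootstrap intervals were chosen here for brevity, the function names `deciles` and `shift_function` are hypothetical, and the two groups are artificial data constructed to have equal means but different spreads.

```python
import random
import statistics


def deciles(values):
    """Nine cut points (10th to 90th percentile), using linear interpolation."""
    return statistics.quantiles(values, n=10, method="inclusive")


def shift_function(group_a, group_b, n_boot=1000, seed=0):
    """Decile-by-decile differences (a minus b) with bootstrap intervals.

    Simplified sketch: each group is resampled independently with replacement,
    and the 2.5th and 97.5th percentiles of the bootstrap differences are
    taken as an approximate 95 % interval for each decile.
    """
    rng = random.Random(seed)
    observed = [qa - qb for qa, qb in zip(deciles(group_a), deciles(group_b))]

    boot = [[] for _ in observed]
    for _ in range(n_boot):
        resampled_a = rng.choices(group_a, k=len(group_a))
        resampled_b = rng.choices(group_b, k=len(group_b))
        for store, qa, qb in zip(boot, deciles(resampled_a), deciles(resampled_b)):
            store.append(qa - qb)

    intervals = []
    for store in boot:
        store.sort()
        lower = store[int(0.025 * n_boot)]
        upper = store[int(0.975 * n_boot) - 1]
        intervals.append((lower, upper))
    return observed, intervals


# Artificial groups: same mean (zero), but group_b is twice as spread out.
group_a = [i / 50 - 1 for i in range(101)]
group_b = [2 * x for x in group_a]

observed, intervals = shift_function(group_a, group_b)

print("mean difference:", round(statistics.fmean(group_a) - statistics.fmean(group_b), 6))
for k, (diff, (lower, upper)) in enumerate(zip(observed, intervals), start=1):
    print(f"decile {k}: difference = {diff:+.2f}, interval = [{lower:+.2f}, {upper:+.2f}]")
```

In this artificial example the mean difference is zero, yet the decile differences run from positive in the lower tail, through zero at the median, to negative in the upper tail. A comparison of means would report "no difference" between groups that in fact differ systematically across most of their distributions, which is the kind of information a shift function is meant to expose.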

In conclusion, by unveiling the methodological and conceptual limitations of mean comparisons and proposing alternative strategies, this commentary aims to inspire a more nuanced statistical approach in S/G-related biomedical and behavioral studies. From our viewpoint, embracing appropriate, diverse, and informatively rich analytical tools is a key step to unlocking the full potential of S/G-related factors in disease understanding, treatment refinement, and individualized healthcare.
