What Research Says About Gender Differences

August 10, 2017

The Google Memo: What Does the Research Say About Gender Differences?

This is our main post on the “Google memo.” We have also put up two supplemental posts: 1) The Most Authoritative Review Paper on Gender Differences, and 2) The Greater Male Variability Hypothesis. But start here.

The recent Google Memo on diversity, and the immediate firing of its author, James Damore, have raised a number of questions relevant to the mission of Heterodox Academy. Large corporations deal with many of the same issues that we wrestle with at universities, such as how to seek truth and achieve the kinds of diversity we want, being cognizant that we are tribal creatures often engaged in motivated reasoning, operating within organizations that are at risk of ideological polarization.

Eventually we’ll write a separate post on the broader issue of the value of viewpoint diversity at Google, and in corporations in general. But first, in this post, we address the central empirical claim of Damore’s memo, which is contained in its second sentence. Let us quote the first three sentences:

I value diversity and inclusion, am not denying that sexism exists, and don’t endorse using stereotypes. When addressing the gap in representation in the population, we need to look at population level differences in distributions. If we can’t have an honest discussion about this, then we can never truly solve the problem.

The heart of Damore’s memo is a section titled “Possible non-bias causes of the gender gap in tech.” Damore argues that there are “population level differences” between men and women in some psychological or behavioral traits that might influence people’s career choices, and their success in those careers. He illustrates his basic framework for looking at potential “population differences” with this figure:

Damore challenges the way that Google is currently pursuing diversity–with a heavy emphasis on implicit bias training–and its assumption that gender gaps necessarily show the existence of some form of bias. Damore argues that a company that was completely free of bias and discrimination would not end up with a 50/50 gender split in all job functions because there are population differences in some traits that might influence the jobs men and women seek out and succeed at. His memo is structured as an argument against a position he refers to as “the extreme stance that all differences in outcome are due to differential treatment.”

Is Damore correct that such “population level differences” exist? It’s very hard to evaluate empirical claims about politicized topics because everyone can “cherry pick” the studies that support their side (for longer discussions, see here and here). The best way to establish the truth in such cases is to examine meta-analyses, which are studies that integrate the findings from many other studies.

We list all the relevant meta-analyses and large sample studies we have found so far in section 2, below, along with their abstracts. But first, in section 1, we collect all the commentary we can find from experts who are writing about the Google memo specifically. And finally, in section 3, we give our own views about how to make sense of the complicated and conflicting set of research findings. If you think we have left out any major experts or meta-analyses, please let us know in the comments at the end, and if appropriate we will add it to this list. We intend this post to be a living document that brings together in one place the best empirically grounded arguments on all sides. It will be updated regularly.

We focus here on research on sex differences in interests, traits, and abilities that might be related to coding/engineering/STEM. We do not address Damore’s claims about sex differences in traits said to be related to leadership abilities. Leadership is a messy topic, in part because there are many styles of leadership. See Eagly & Johnson, 1990, for a review of sex differences in that literature, and Eagly, Johannesen-Schmidt, & van Engen, 2003, for a meta-analysis of gender and leadership style.

In this review, we also do not address Damore’s claims that some gender differences are rooted in biological factors, such as the effect of prenatal hormones on brain development. Meta-analyses cannot tell us the origins of differences. Most researchers studying these questions assume that biology, childhood socialization, and current context interact in complex ways, and most psychologists know that pointing to a biological contribution (such as a genetic or hormonal influence) does not mean that an effect is “hard wired,” unmalleable, or immune to contextual variables (see Eagly & Wood, 2012; this is a point that Damore did not acknowledge). In this review we focus only on whether “population level differences” exist. (See this essay on why it is mostly claims other than this one that have generated most of the outrage.) A company like Google must hire from the existing population of adults. Google and other tech companies can surely take steps that will influence the next generation of boys and girls, but to make progress toward its diversity goals Google must have an accurate understanding of the current population of men and women from which it is trying to recruit. Do population level differences exist between men and women?

1) CURRENT COMMENTARY ON DAMORE’S MEMO

A) GENERALLY SUPPORTIVE

Here are the experts who have said that Damore’s main assertions about gender differences are, for the most part, correct (and backed up their arguments with citations).

Lee Jussim, David Schmitt (see also here), Geoffrey Miller, and Deborah Soh (see also here), at Quillette: Google Memo: Four scientists respond.
David Geary: Straight talk about sex differences in occupational choices and work-family tradeoffs.
Gregg Henriques: An in-depth analysis of the crisis at Google.
[email us to direct us to more]

B) GENERALLY CRITICAL

Here are the experts who have written that the memo’s assertions about gender differences are, for the most part, wrong (and backed up their arguments with citations):

Adam Grant: Differences between Men and Women are Vastly Exaggerated [But see this critique from Scott Alexander at Slate Star Codex (a psychiatrist who often writes on social science topics); and then see Grant’s response here.]
Suzanne Sadedin: A scientist’s take on the biological claims from the infamous Google anti-diversity manifesto.
Agustin Fuentes: The “Google manifesto”: Bad biology, ignorance of evolutionary processes, and privilege.
Stefanie Johnson: What the Science Actually Says About Gender Gaps in the Workplace
[email us to direct us to more]

C) IN BETWEEN

Here are the experts who have written that the memo’s assertions (that gender differences exist and that biology plays a role) are correct, but are interpreted overly simplistically to reach incorrect or premature conclusions.

Alice Eagly: Does biology explain why men outnumber women in tech?

2) META-ANALYSES AND LARGE SAMPLE STUDIES OF GENDER DIFFERENCES

Meta-analysis is a method of examining the effects found (or not found) in dozens or hundreds of studies, converting the effect sizes to a common scale, and then finding the average across all the studies. It’s a very powerful technique that allows researchers to examine questions such as: Does the effect get larger or smaller as we limit our analysis to only the best-done studies? What broad statements can be made about a body of literature?
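The averaging step can be made concrete with a short sketch. The Python below is only an illustration, not the procedure used by any of the meta-analyses listed in this post. It assumes a fixed-effect model in which each study's d is weighted by the inverse of its approximate sampling variance (one common choice among several), and the function names and example numbers are hypothetical.

```python
import math


def d_variance(d, n1, n2):
    """Approximate sampling variance of a standardized mean difference d
    computed from two groups of sizes n1 and n2."""
    return (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))


def fixed_effect_mean(studies):
    """Inverse-variance weighted mean of d across studies.

    `studies` is a list of (d, n1, n2) tuples.
    Returns (weighted mean d, standard error of that mean).
    """
    weights = [1.0 / d_variance(d, n1, n2) for d, n1, n2 in studies]
    total_weight = sum(weights)
    mean_d = sum(w * d for w, (d, _, _) in zip(weights, studies)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return mean_d, standard_error


# Made-up studies for illustration only: (d, n in group 1, n in group 2).
EXAMPLE_STUDIES = [(0.10, 200, 200), (0.30, 50, 50), (0.05, 1000, 1000)]

if __name__ == "__main__":
    mean_d, se = fixed_effect_mean(EXAMPLE_STUDIES)
    print(f"weighted mean d = {mean_d:.3f} (SE = {se:.3f})")
```

Because larger studies estimate d more precisely, they receive more weight, so the pooled estimate in the made-up example sits much closer to the 2,000-person study than to the 100-person one. Published meta-analyses often use random-effects models and moderator analyses instead, which this sketch does not attempt.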

Although meta-analysis is a powerful technique, it is not perfect (for an overview of strengths and weaknesses see Rosenthal & DiMatteo, 2001). It would be ideal if a researcher could not only identify, but also obtain all of the relevant data on the phenomenon of interest. However, this is an impossible task for any single meta-analysis to achieve. Statistically significant findings are more likely to be published (see Rosenthal, 1979), and thus included in meta-analyses, compared to null findings which often remain unpublished. No single meta-analysis will be able to identify all of the relevant studies. This is why we have decided to bring together many meta-analyses in one place.

We have included relevant meta-analyses on sex differences in interests, personality traits, behaviors and abilities that might be related to coding/engineering from 1990 to the present. We also included large cross-national empirical investigations (N > 15,000) and large sample empirical investigations (N > 10,000) of gender differences. Again, we acknowledge that the evidence we present is incomplete; this is a first pass, which we will update with the input and help of others. (Please add citations in the comments section, or email them to stevens at heterodoxacademy dot org).

We show findings that generally support Damore’s claims in green, and findings that generally oppose his claims (or support his critics) in red. Effect sizes (d) are measures of how far apart two group means are, expressed as a proportion of the standard deviation (averaged between the two groups). By convention, an effect is considered trivially small if d is below .20, small if d is greater than or equal to .20, medium if d is greater than or equal to .50, and large if d is greater than or equal to .80.
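For readers who want to see how these numbers work, here is a rough sketch in Python. It computes d from two samples using a pooled standard deviation, applies the size labels given above, and converts a standardized distance into a percentage overlap between two distributions. The overlap formula is an assumption on our part: it treats both groups as normal with equal variance and measures the shared area relative to the joint (combined) distribution. That convention happens to reproduce the overlap percentages quoted in the multivariate abstracts below, but the abstracts themselves do not spell out the formula. All function names are hypothetical.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Difference between group means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


def label_effect(d):
    """Conventional size label, ignoring the sign of d."""
    size = abs(d)
    if size < 0.20:
        return "trivial"
    if size < 0.50:
        return "small"
    if size < 0.80:
        return "medium"
    return "large"


def overlap_from_distance(distance):
    """Share of the joint distribution that two equal-variance normal
    groups have in common, given their standardized distance (d or D).
    Assumed convention: OVL / (2 - OVL), where OVL = 2 * Phi(-distance / 2)."""
    ovl = 2 * statistics.NormalDist().cdf(-abs(distance) / 2)
    return ovl / (2 - ovl)


if __name__ == "__main__":
    for value in (0.13, 0.56, 1.18):
        print(value, label_effect(value))
    for distance in (2.71, 1.71, 1.61):
        print(distance, f"{overlap_from_distance(distance):.0%}")
```

Under these assumptions, distances of 2.71, 1.71, and 1.61 correspond to roughly 10%, 24%, and 27% overlap. A different definition of overlap (for example, the plain overlapping coefficient OVL) would give larger percentages for the same distance, so overlap figures are only comparable when the same convention is used.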

Byrnes, J.P., Miller, D.C., & Schafer, W.D. (1999). Gender differences in risk-taking: A meta-analysis. Psychological Bulletin, 125(3), 367-383.	The authors conducted a meta-analysis of 150 studies in which the risk-taking tendencies of male and female participants were compared. Studies were coded with respect to type of task (e.g., self-reported behaviors vs. observed behaviors), task content (e.g., smoking vs. sex), and 5 age levels. Results showed that the average effects for 14 out of 16 types of risk-taking were significantly larger than 0 (indicating greater risk taking in male participants) and that nearly half of the effects were greater than .20. However, certain topics (e.g., intellectual risk-taking and physical skills) produced larger gender differences than others (e.g., smoking). In addition, the authors found that (a) there were significant shifts in the size of the gender gap between successive age levels, and (b) the gender gap seems to be growing smaller over time. The discussion focuses on the meaning of the results for theories of risk-taking and the need for additional studies to clarify age trends. [HxA note: the weighted mean effect size was very small (d = .13). Other studies have shown that women take more risks than men on some kinds of risk, e.g., living organ donation; see Becker & Eagly, 2004. Thanks to Alice Eagly for this note]
Costa Jr., P.T., Terracciano, A., & McCrae, R.R. (2001). Gender differences in personality traits across cultures: Robust and surprising findings. Journal of Personality and Social Psychology, 81(2), 322-331.	Secondary analyses of Revised NEO Personality Inventory data from 26 cultures (N = 23,031) suggest that gender differences are small relative to individual variation within genders; differences are replicated across cultures for both college-age and adult samples, and differences are broadly consistent with gender stereotypes: Women reported themselves to be higher in Neuroticism, Agreeableness, Warmth, and Openness to Feelings, whereas men were higher in Assertiveness and Openness to Ideas. Contrary to predictions from evolutionary theory, the magnitude of gender differences varied across cultures. Contrary to predictions from the social role model, gender differences were most pronounced in European and American cultures in which traditional sex roles are minimized. Possible explanations for this surprising finding are discussed, including the attribution of masculine and feminine behaviors to roles rather than traits in traditional cultures.
Del Giudice, M., Booth, T., & Irwing, P. (2012). The distance between Mars and Venus: Measuring global sex differences in personality. PLoS ONE, 7(1): e29265. https://doi.org/10.1371/journal.pone.0029265	Background: Sex differences in personality are believed to be comparatively small. However, research in this area has suffered from significant methodological limitations. We advance a set of guidelines for overcoming those limitations: (a) measure personality with a higher resolution than that afforded by the Big Five; (b) estimate sex differences on latent factors; and (c) assess global sex differences with multivariate effect sizes. We then apply these guidelines to a large, representative adult sample, and obtain what is presently the best estimate of global sex differences in personality. Methodology/Principal Findings: Personality measures were obtained from a large US sample (N = 10,261) with the 16PF Questionnaire. Multigroup latent variable modeling was used to estimate sex differences on individual personality dimensions, which were then aggregated to yield a multivariate effect size (Mahalanobis D). We found a global effect size D = 2.71, corresponding to an overlap of only 10% between the male and female distributions. Even excluding the factor showing the largest univariate ES, the global effect size was D = 1.71 (24% overlap). These are extremely large differences by psychological standards. Significance: The idea that there are only minor differences between the personality profiles of males and females should be rejected as based on inadequate methodology. [HxA NOTE: This study computes multivariate effect sizes. For more information on multivariate effect sizes please see here, here, and here]
Else-Quest, N.M., Hyde, J.S., Goldsmith, H.H., Van Hulle, C.A. (2006). Gender differences in temperament: A meta-analysis. Psychological Bulletin, 132(1), 33-72.	The authors used meta-analytical techniques to estimate the magnitude of gender differences in mean level and variability of 35 dimensions and 3 factors of temperament in children ages 3 months to 13 years. Effortful control showed a large difference favoring girls and the dimensions within that factor (e.g., inhibitory control: d = 0.41, perceptual sensitivity: d = 0.38) showed moderate gender differences favoring girls, consistent with boys’ greater incidence of externalizing disorders. Surgency showed a difference favoring boys, as did some of the dimensions within that factor (e.g., activity: d = 0.33, high-intensity pleasure: d = 0.30), consistent with boys’ greater involvement in active rough-and-tumble play. Negative affectivity showed negligible gender differences. [HxA note: Damore’s memo is not about male superiority; it is about population differences that might explain why we find gender gaps in many occupations, sometimes favoring women, and that might point to ways to make coding more attractive to women]
Else-Quest, N.M, Hyde, J.S., & Linn, M.C. (2010). Cross-national patterns of gender differences in mathematics: A meta-analysis. Psychological Bulletin, 136(1), 103-127.	A gender gap in mathematics achievement persists in some nations but not in others. In light of the underrepresentation of women in careers in science, technology, mathematics, and engineering, increasing research attention is being devoted to understanding gender differences in mathematics achievement, attitudes, and affect. The gender stratification hypothesis maintains that such gender differences are closely related to cultural variations in opportunity structures for girls and women. We meta-analyzed 2 major international data sets, the 2003 Trends in International Mathematics and Science Study and the Programme for International Student Assessment, representing 493,495 students 14-16 years of age, to estimate the magnitude of gender differences in mathematics achievement, attitudes, and affect across 69 nations throughout the world. Consistent with the gender similarities hypothesis, all of the mean effect sizes in mathematics achievement were very small (d < 0.15); however, national effect sizes showed considerable variability (ds = -0.42 to 0.40). Despite gender similarities in achievement, boys reported more positive math attitudes and affect (ds = 0.10 to 0.33); national effect sizes ranged from d = -0.61 to 0.89. In contrast to those of previous tests of the gender stratification hypothesis, our results point to specific domains of gender equity responsible for gender gaps in math. 
Gender equity in school enrollment, women’s share of research jobs, and women’s parliamentary representation were the most powerful predictors of cross-national variability in gender gaps in math. Results are situated within the context of existing research demonstrating apparently paradoxical effects of societal gender equity and highlight the significance of increasing girls’ and women’s agency cross-nationally. [HxA NOTE: “paradoxical” refers to the finding that the more gender-equal societies, such as in Scandinavia, show LARGER gender differences in math attitudes, consistent with the idea that freedom allows men and women to express differing desires]
Hyde, J.S. (2005). The gender similarities hypothesis. American Psychologist, 60(6), 581-592.	The differences model, which argues that males and females are vastly different psychologically, dominates the popular media. Here, the author advances a very different view, the gender similarities hypothesis, which holds that males and females are similar on most, but not all, psychological variables. Results from a review of 46 meta-analyses support the gender similarities hypothesis. Gender differences can vary substantially in magnitude at different ages and depend on the context in which measurement occurs. Overinflated claims of gender differences carry substantial costs in areas such as the workplace and relationships. [HxA NOTE: Men and women are clearly similar on most psychological constructs, but Hyde’s table of results shows there are a few domains with medium to large differences that could be relevant to the Damore memo, including mechanical reasoning, mental rotation, and spatial visualization]
Hyde, J.S., Fennema, E., & Lamon, S.J. (1990). Gender differences in mathematics performance: A meta-analysis. Psychological Bulletin, 107(2), 139-155.	Reviewers have consistently concluded that males perform better on mathematics tests than females do. To make a refined assessment of the magnitude of gender differences in mathematics performance, we performed a meta-analysis of 100 studies. They yielded 254 independent effect sizes, representing the testing of 3,175,188 Ss. Averaged over all effect sizes based on samples of the general population, d was -0.05, indicating that females outperformed males by only a negligible amount. For computation, d was -0.14 (the negative value indicating superior performance by females). For understanding of mathematical concepts, d was -0.03; for complex problem solving, d was 0.08. An examination of age trends indicated that girls showed a slight superiority in computation in elementary school and middle school. There were no gender differences in problem-solving in elementary or middle school; differences favoring men emerged in high school (d = 0.29) and in college (d = 0.32). Gender differences were smallest and actually favored females in samples of the general population, grew larger with increasingly selective samples, and were largest for highly selected samples and samples of highly precocious persons. The magnitude of the gender difference has declined over the years; for studies published in 1973 or earlier d was 0.31, whereas it was 0.14 for studies published in 1974 or later. We conclude that gender differences in mathematics performance are small. Nonetheless, the lower performance of women in problem-solving that is evident in high school requires attention.
Hyde, J.S., Lindberg, S.M., Linn, M.C., Ellis, A.B., & Williams, C.C. (2008). Gender similarities characterize math performance. Science, 321(5888), 494-495.	[HxA Note: There is no abstract for this paper; we therefore present the conclusion] Our analysis shows that, for grades 2 to 11, the general population no longer shows a gender difference in math skills, consistent with the gender similarities hypothesis (19). There is evidence of slightly greater male variability in scores, although the causes remain unexplained. Gender differences in math performance, even among high scorers, are insufficient to explain lopsided gender patterns in participation in some STEM fields. An unexpected finding was that state assessments designed to meet NCLB requirements fail to test complex problem-solving of the kind needed for success in STEM careers, a lacuna that should be fixed.
Lindberg, S.M., Hyde, J.S., Petersen, J.L., & Linn, M.C. (2010). New trends in gender and mathematics performance: A meta-analysis. Psychological Bulletin, 136(6), 1123-1135.	In this paper, we use meta-analysis to analyze gender differences in recent studies of mathematics performance. First, we meta-analyzed data from 242 studies published between 1990 and 2007, representing the testing of 1,286,350 people. Overall, d = .05, indicating no gender difference, and VR = 1.08, indicating nearly equal male and female variances. Second, we analyzed data from large data sets based on probability sampling of U.S. adolescents over the past 20 years: the NLSY, NELS88, LSAY, and NAEP. Effect sizes for the gender difference ranged between −0.15 and +0.22. Variance ratios ranged from 0.88 to 1.34. Taken together these findings support the view that males and females perform similarly in mathematics.
Lippa, R.A. (2010). Sex differences in personality traits and gender-related occupational preferences across 53 nations: Testing evolutionary and social-environmental theories. Archives of Sexual Behavior 39(3), 619-636.	Using data from over 200,000 participants from 53 nations, I examined the cross-cultural consistency of sex differences for four traits: extraversion, agreeableness, neuroticism, and male-versus-female-typical occupational preferences. Across nations, men and women differed significantly on all four traits (mean ds = -.15, -.56, -.41, and 1.40, respectively, with negative values indicating women scoring higher). The strongest evidence for sex differences in SDs was for extraversion (women more variable) and for agreeableness (men more variable). United Nations indices of gender equality and economic development were associated with larger sex differences in agreeableness, but not with sex differences in other traits. Gender equality and economic development were negatively associated with mean national levels of neuroticism, suggesting that economic stress was associated with higher neuroticism. Regression analyses explored the power of sex, gender equality, and their interaction to predict men’s and women’s 106 national trait means for each of the four traits. Only sex predicted means for all four traits, and sex predicted trait means much more strongly than did gender equality or the interaction between sex and gender equality. These results suggest that biological factors may contribute to sex differences in personality and that culture plays a negligible to small role in moderating sex differences in personality. [HxA NOTE: the correlation of agreeableness with gender equality was “paradoxical” — larger in more gender-equal societies]
Lippa, R.A. (2010). Gender differences in personality and interests: When, where, and why? Social and Personality Psychology Compass, 4, 1098-1110.	How big are gender differences in personality and interests, and how stable are these differences across cultures and over time? To answer these questions, I summarize data from two meta-analyses and three cross-cultural studies on gender differences in personality and interests. Results show that gender differences in Big Five personality traits are ‘small’ to ‘moderate,’ with the largest differences occurring for agreeableness and neuroticism (respective ds = 0.40 and 0.34; women higher than men). In contrast, gender differences on the people–things dimension of interests are ‘very large’ (d = 1.18), with women more people-oriented and less thing-oriented than men. Gender differences in personality tend to be larger in gender-egalitarian societies than in gender-inegalitarian societies, a finding that contradicts social role theory but is consistent with evolutionary, attributional, and social comparison theories. In contrast, gender differences in interests appear to be consistent across cultures and over time, a finding that suggests possible biologic influences.
Lippa, R.A., Collaer, M.L., & Peters, M. (2010). Sex differences in mental rotation and line angle judgments are positively associated with gender equality and economic development across 53 nations. Archives of Sexual Behavior, 39(4), 990-997.	Mental rotation and line angle judgment performance were assessed in more than 90,000 women and 111,000 men from 53 nations. In all nations, men’s mean performance exceeded women’s on these two visuospatial tasks. Gender equality (as assessed by United Nations indices) and economic development (as assessed by per capita income and life expectancy) were significantly associated, across nations, with larger sex differences, contrary to the predictions of social role theory. For both men and women, across nations, gender equality and economic development were significantly associated with better performance on the two visuospatial tasks. However, these associations were stronger for the mental rotation task than for the line angle judgment task, and they were stronger for men than for women. Results were discussed in terms of evolutionary, social role, and stereotype threat theories of sex differences.
Lytton, H. & Romney, D.M. (1991). Parents’ differential socialization of boys and girls: A meta-analysis. Psychological Bulletin, 109(2), 267-296.	A meta-analysis of 172 studies attempted to resolve the conflict between previous narrative reviews on whether parents make systematic differences in their rearing of boys and girls. Most effect sizes were found to be nonsignificant and small. In North American studies, the only socialization area of 19 to display a significant effect for both parents is encouragement of sex-typed activities. In other Western countries, physical punishment is applied significantly more to boys. Fathers tend to differentiate more than mothers between boys and girls. Over all socialization areas, effect size is not related to sample size or year of publication. Effect size decreases with child’s age and increases with higher quality. No grouping by any of these variables changes a nonsignificant effect to a significant effect. Because little differential socialization for social behavior or abilities can be found, other factors that may explain the genesis of documented sex differences are discussed.
Morris, M.L. (2016). Vocational interests in the United States: Sex, age, ethnicity, and year effects. Journal of Counseling Psychology, 63(5), 604-615.	Vocational interests predict educational and career choices, job performance, and career success (Rounds & Su, 2014). Although sex differences in vocational interests have long been observed (Thorndike, 1911), an appropriate overall measure has been lacking from the literature. Using a cross-sectional sample of United States residents aged 14 to 63 who completed the Strong Interest Inventory assessment between 2005 and 2014 (N = 1,283,110), I examined sex, age, ethnicity, and year effects on work related interest levels using both multivariate and univariate effect size estimates of individual dimensions (Holland’s Realistic, Investigative, Artistic, Social, Enterprising, and Conventional). Men scored higher on Realistic (d = -1.14), Investigative (d = -.32), Enterprising (d = -.22), and Conventional (d = -.23), while women scored higher on Artistic (d = .19) and Social (d = .38), mostly replicating previous univariate findings. Multivariate, overall sex differences were very large (disattenuated Mahalanobis’ D = 1.61; 27% overlap). Interest levels were slightly lower and overall sex differences larger in younger samples. Overall sex differences have narrowed slightly for 18-22 year-olds in more recent samples. Generally very small ethnicity effects included relatively higher Investigative and Enterprising scores for Asians, Indians, and Middle Easterners, lower Realistic scores for Blacks and Native Americans, higher Realistic, Artistic, and Social scores for Pacific Islanders, and lower Conventional scores for Whites. Using Prediger’s (1982) model, women were more interested in people (d = 1.01) and ideas (d = .18), while men were more interested in things and data. 
These results, consistent with previous reviews showing large sex differences and small year effects, suggest that large sex differences in work related interests will continue to be observed for decades.
Richard, F.D., Bond, Jr., C.F., & Stokes-Zoota, J.J. (2003). One hundred years of social psychology quantitatively described. Review of General Psychology, 7(4), 331-363.	This article compiles results from a century of social psychological research, more than 25,000 studies of 8 million people. A large number of social psychological conclusions are listed alongside meta-analytic information about the magnitude and variability of the corresponding effects. References to 322 meta-analyses of social psychological phenomena are presented, as well as statistical effect-size summaries. Analyses reveal that social psychological effects typically yield a value of r equal to .21 and that, in the typical research literature, effects vary from study to study in ways that produce a standard deviation in r of .15. Uses, limitations, and implications of this large-scale compilation are noted. [HxA NOTE: Richard et al. used meta-analysis to investigate a wide range of social psychological effects. They reported a d = 0.26 for sex differences, an effect size that was smaller than the average effect size for social psychology as a whole (d = 0.46)]
Schmitt, D.P., Realo, A., Voracek, M., & Allik, J. (2008). Why can’t a man be more like a woman? Sex differences in Big Five personality traits across 55 cultures. Journal of Personality and Social Psychology, 94(1), 168-182.	Previous research suggested that sex differences in personality traits are larger in prosperous, healthy, and egalitarian cultures in which women have more opportunities equal with those of men. In this article, the authors report cross-cultural findings in which this unintuitive result was replicated across samples from 55 nations (N = 17,637). On responses to the Big Five Inventory, women reported higher levels of neuroticism, extraversion, agreeableness, and conscientiousness than did men across most nations. These findings converge with previous studies in which different Big Five measures and more limited samples of nations were used. Overall, higher levels of human development–including long and healthy life, equal access to knowledge and education, and economic wealth–were the main nation-level predictors of larger sex differences in personality. Changes in men’s personality traits appeared to be the primary cause of sex difference variation across cultures. It is proposed that heightened levels of sexual dimorphism result from personality traits of men and women being less constrained and more able to naturally diverge in developed nations. In less fortunate social and economic conditions, innate personality differences between men and women may be attenuated.
Stoet, G. & Geary, D.C. (2013). Sex differences in mathematics and reading achievement are inversely related: Within- and across-nation assessment of 10 years of PISA data. PLoS ONE 8(3): e57988. https://doi.org/10.1371/journal.pone.0057988	We analyzed one decade of data collected by the Programme for International Student Assessment (PISA), including the mathematics and reading performance of nearly 1.5 million 15 year olds in 75 countries. Across nations, boys scored higher than girls in mathematics, but lower than girls in reading. The sex difference in reading was three times as large as in mathematics. There was considerable variation in the extent of the sex differences between nations. There are countries without a sex difference in mathematics performance, and in some countries girls scored higher than boys. Boys scored lower in reading in all nations in all four PISA assessments (2000, 2003, 2006, 2009). Contrary to several previous studies, we found no evidence that the sex differences were related to nations’ gender equality indicators. Further, paradoxically, sex differences in mathematics were consistently and strongly inversely correlated with sex differences in reading: Countries with a smaller sex difference in mathematics had a larger sex difference in reading and vice versa. We demonstrate that this was not merely a between-nation, but also a within-nation effect. This effect is related to relative changes in these sex differences across the performance continuum: We did not find a sex difference in mathematics among the lowest performing students, but this is where the sex difference in reading was largest. In contrast, the sex difference in mathematics was largest among the higher performing students, and this is where the sex difference in reading was smallest. The implication is that if policy makers decide that changes in these sex differences are desired, different approaches will be needed to achieve this for reading and mathematics. Interventions that focus on high-achieving girls in mathematics and on low achieving boys in reading are likely to yield the strongest educational benefits.
Su, R., Rounds, J., & Armstrong, P.I. (2009). Men and things, women and people: A meta-analysis of sex differences in interests. Psychological Bulletin, 135(6), 859-884.	The magnitude and variability of sex differences in vocational interests were examined in the present meta-analysis for Holland’s (1959, 1997) categories (Realistic, Investigative, Artistic, Social, Enterprising, and Conventional), Prediger’s (1982) Things-People and Data-Ideas dimensions, and the STEM (science, technology, engineering, and mathematics) interest areas. Technical manuals for 47 interest inventories were used, yielding 503,188 respondents. Results showed that men prefer working with things and women prefer working with people, producing a large effect size (d = 0.93) on the Things-People dimension. Men showed stronger Realistic (d = 0.84) and Investigative (d = 0.26) interests, and women showed stronger Artistic (d = -0.35), Social (d = -0.68), and Conventional (d = -0.33) interests. Sex differences favoring men were also found for more specific measures of engineering (d = 1.11), science (d = 0.36), and mathematics (d = 0.34) interests. Average effect sizes varied across interest inventories, ranging from 0.08 to 0.79. The quality of interest inventories, based on professional reputation, was not differentially related to the magnitude of sex differences. Moderators of the effect sizes included interest inventory item development strategy, scoring method, theoretical framework, and sample variables of age and cohort. Application of some item development strategies can substantially reduce sex differences. The present study suggests that interests may play a critical role in gendered occupational choices and gender disparity in the STEM fields.
Su, R. & Rounds, J. (2015). All STEM fields are not created equal: People and things interests explains gender disparities across fields. Frontiers in Psychology, 6: 189.	The degree of women’s underrepresentation varies by STEM fields. Women are now overrepresented in social sciences, yet only constitute a fraction of the engineering workforce. In the current study, we investigated the gender differences in interests as an explanation for the differential distribution of women across sub-disciplines of STEM as well as the overall underrepresentation of women in STEM fields. Specifically, we meta-analytically reviewed norm data on basic interests from 52 samples in 33 interest inventories published between 1964 and 2007, with a total of 209,810 male and 223,268 female respondents. We found gender differences in interests to vary largely by STEM field, with the largest gender differences in interests favoring men observed in engineering disciplines (d = 0.83–1.21), and in contrast, gender differences in interests favoring women in social sciences and medical services (d = −0.33 and −0.40, respectively). Importantly, the gender composition (percentages of women) in STEM fields reflects these gender differences in interests. The patterns of gender differences in interests and the actual gender composition in STEM fields were explained by the people-orientation and things-orientation of work environments, and were not associated with the level of quantitative ability required. These findings suggest potential interventions targeting interests in STEM education to facilitate individuals’ ability and career development and strategies to reform work environments to better attract and retain women in STEM occupations.
Uttal, D.H., Meadow, N.G., Tipton, E., Hand, L.L., Alden, A.R., Warren, C., & Newcombe, N.S. (2013). The malleability of spatial skills: A meta-analysis of training studies. Psychological Bulletin, 139(2), 352-402.	Having good spatial skills strongly predicts achievement and attainment in science, technology, engineering, and mathematics fields (e.g., Shea, Lubinski, & Benbow, 2001; Wai, Lubinski, & Benbow, 2009). Improving spatial skills is therefore of both theoretical and practical importance. To determine whether and to what extent training and experience can improve these skills, we meta-analyzed 217 research studies investigating the magnitude, moderators, durability, and generalizability of training on spatial skills. After eliminating outliers, the average effect size (Hedges’s g) for training relative to control was 0.47 (SE = 0.04). Training effects were stable and were not affected by delays between training and posttesting. Training also transferred to other spatial tasks that were not directly trained. We analyzed the effects of several moderators, including the presence and type of control groups, sex, age, and type of training. Additionally, we included a theoretically motivated typology of spatial skills that emphasizes 2 dimensions: intrinsic versus extrinsic and static versus dynamic (Newcombe & Shipley, in press). Finally, we consider the potential educational and policy implications of directly training spatial skills. Considered together, the results suggest that spatially enriched education could pay substantial dividends in increasing participation in mathematics, science, and engineering.
Voyer, D., & Voyer, S. D. (2014). Gender differences in scholastic achievement: A meta-analysis. Psychological Bulletin, 140(4), 1174-1204.	A female advantage in school marks is a common finding in education research, and it extends to most course subjects (e.g., language, math, science), unlike what is found on achievement tests. However, questions remain concerning the quantification of these gender differences and the identification of relevant moderator variables. The present meta-analysis answered these questions by examining studies that included an evaluation of gender differences in teacher-assigned school marks in elementary, junior/middle, or high school or at the university level (both undergraduate and graduate). The final analysis was based on 502 effect sizes drawn from 369 samples. A multilevel approach to meta-analysis was used to handle the presence of nonindependent effect sizes in the overall sample. This method was complemented with an examination of results in separate subject matters with a mixed-effects meta-analytic model. A small but significant female advantage (mean d = 0.225, 95% CI [0.201, 0.249]) was demonstrated for the overall sample of effect sizes. Noteworthy findings were that the female advantage was largest for language courses (mean d = 0.374, 95% CI [0.316, 0.432]) and smallest for math courses (mean d = 0.069, 95% CI [0.014, 0.124]). Source of marks, nationality, racial composition of samples, and gender composition of samples were significant moderators of effect sizes. Finally, results showed that the magnitude of the female advantage was not affected by year of publication, thereby contradicting claims of a recent “boy crisis” in school achievement. The present meta-analysis demonstrated the presence of a stable female advantage in school marks while also identifying critical moderators. Implications for future educational and psychological research are discussed.
Voyer, D., Voyer, S., & Bryden, M.P. (1995). Magnitude of sex differences in spatial abilities: A meta-analysis and consideration of critical variables. Psychological Bulletin, 117(2), 250-270.	In recent years, the magnitude, consistency, and stability across time of cognitive sex differences have been questioned. The present study examined these issues in the context of spatial abilities. A meta-analysis of 286 effect sizes from a variety of spatial ability measures was conducted. Effect sizes were partitioned by the specific test used and by a number of variables related to the experimental procedure in order to achieve homogeneity. Results showed that sex differences are significant in several tests but that some intertest differences exist. Partial support was found for the notion that the magnitude of sex differences has decreased in recent years. Finally, it was found that the age of emergence of sex differences depends on the test used. Results are discussed with regard to their implications for the study of sex differences in spatial abilities.
Weinburgh, M. (1995). Gender differences in student attitudes towards science: A meta-analysis of the literature from 1970 to 1991. Journal of Research in Science Teaching, 32(4), 387-398.	A meta-analysis covering the literature between 1970 and 1991 was conducted using an approach similar to that suggested by Glass, McGaw, and Smith (1981) and Hedges, Shymansky, and Woodworth (1989). This analysis examined gender differences in student attitudes toward science, and correlations between attitudes toward science and achievement in science. Thirty-one effect sizes and seven correlations representing the testing of 6,753 subjects were found in 18 studies. The mean of the unweighted effect sizes was .20 (SD = .50) and the mean of the weighted effect size was .16 (SD = .50), indicating that boys have more positive attitudes toward science than girls. The mean correlation between attitude and achievement was .50 for boys and .55 for girls, suggesting that the correlations are comparable. Results of the analysis of gender differences in attitude as a function of science type indicate that boys show a more positive attitude toward science than girls in all types of science. The correlation between attitude and achievement for boys and girls as a function of science type indicates that for biology and physics the correlation is positive for both, but stronger for girls than for boys. Gender differences and correlations between attitude and achievement by gender as a function of publication date show no pattern. The results for the analysis of gender differences as a function of the selectivity of the sample indicate that general level students reflect a greater positive attitude for boys, whereas the high-performance students indicate a greater positive attitude for girls. The correlation between attitude and achievement as a function of selectivity indicates that in all cases a positive attitude results in higher achievement. This is particularly true for low-performance girls. The implications of these findings are discussed and further research suggested.
Zell, E., Krizan, Z., & Teeter, S.R. (2015). Evaluating gender similarities and differences using metasynthesis. American Psychologist, 70(1), 10-20.	Despite the common lay assumption that males and females are profoundly different, Hyde (2005) used data from 46 meta-analyses to demonstrate that males and females are highly similar. Nonetheless, the gender similarities hypothesis has remained controversial. Since Hyde’s provocative report, there has been an explosion of meta-analytic interest in psychological gender differences. We utilized this enormous collection of 106 meta-analyses and 386 individual meta-analytic effects to reevaluate the gender similarities hypothesis. Furthermore, we employed a novel data-analytic approach called metasynthesis (Zell & Krizan, 2014) to estimate the average difference between males and females and to explore moderators of gender differences. The average, absolute difference between males and females across domains was relatively small (d = 0.21, SD = 0.14), with the majority of effects being either small (46%) or very small (39%). Magnitude of differences fluctuated somewhat as a function of the psychological domain (e.g., cognitive variables, social and personality variables, well-being), but remained largely constant across age, culture, and generations. These findings provide compelling support for the gender similarities hypothesis, but also underscore conditions under which gender differences are most pronounced. [HxA NOTE: Metasynthesis differs from meta-analysis; please see Zell & Krizan (2014) for a longer discussion of metasynthesis.]

3) OUR CONCLUSIONS

The research findings are complicated, as you can see from the many abstracts containing both red and green text, and from the presence on both sides of the debate of some of the top researchers in psychology. Nonetheless, we think that the situation can be greatly clarified by distinguishing abilities from interests. We think the following three statements are supported by the research reviewed above. [We have put in bold the text that has changed since our initial version of this post.]

1. Gender differences in math/science ability, achievement, and performance are small or nil. (See especially the studies by Hyde; see also this review paper by Spelke, 2005). There are two exceptions to this statement:
A) Men (on average) score higher than women on most tests of spatial abilities, but the size of this advantage depends on the task and varies from small to large (e.g., Lindberg et al., 2010). There is at least one spatial task that favors females (spatial location memory; see e.g., Galea & Kimura, 1993; Kimura, 1996; Vandenberg & Kuse, 1978). Men also (on average) score higher on mechanical reasoning and tests of mathematical ability, although this latter advantage is small. Women get better grades at all levels of schooling and score higher on a few abilities that are relevant to success in any job (e.g., reading comprehension, writing, social skills). Thus, we assume that this one area of male superiority is not likely to outweigh areas of male inferiority to become a major source of differential outcomes.
B) There is good evidence that men are more variable on a variety of traits, meaning that they are over-represented at both tails of the distribution (i.e., more men at the very bottom, and at the very top), even though there is no gender difference on average. Thus, the pool of potentially qualified applicants for a company like Google is likely to contain more males than females. To be clear, this does not mean that males are more “suited” for STEM jobs. Anyone located in the upper tail of the distributions valued in the hiring process possesses the requisite skills. Although there may be fewer women in that upper tail, the ones who are found there are likely to have several advantages over the men, particularly because they likely have better verbal skills.

2. Gender differences in interest and enjoyment of math, coding, and highly “systemizing” activities are large. The difference on traits related to preferences for “people vs. things” is found consistently and is very large, with some effect sizes exceeding 1.0. (See especially the meta-analyses by Su and her colleagues, and also see this review paper by Ceci & Williams, 2015).

3. Culture and context matter, in complicated ways. Some gender differences have decreased over time as women have achieved greater equality, showing that these differences are responsive to changes in culture and environment. But the cross-national findings sometimes show “paradoxical” effects: progress toward gender equality in rights and opportunities sometimes leads to larger gender differences in some traits and career choices. Nonetheless, it seems that actions taken today by parents, teachers, politicians, and designers of tech products may increase the likelihood that girls will grow up to pursue careers in tech, and this is true whether or not biology plays a role in producing any particular population difference. (See this review paper by Eagly and Wood, 2013).
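
For readers who want a feel for what the effect sizes quoted in the abstracts mean in practice, here is a minimal sketch in Python. It assumes that both groups are normally distributed with equal variances, which is a textbook simplification and not something the cited meta-analyses claim; the function names are illustrative only. It converts a Cohen's d into two quantities: how much the two distributions overlap, and how often a randomly chosen member of the higher-scoring group outscores a randomly chosen member of the other group.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def overlap(d: float) -> float:
    """Overlapping area of two equal-variance normal curves whose means differ by d SDs."""
    return 2.0 * normal_cdf(-abs(d) / 2.0)


def prob_superiority(d: float) -> float:
    """Chance that a random draw from the higher-mean group exceeds a random draw from the other."""
    return normal_cdf(abs(d) / math.sqrt(2.0))


# Effect sizes quoted in the abstracts above (labels are shorthand).
examples = [
    ("average across domains (Zell et al., 2015)", 0.21),
    ("Things-People interests (Su et al., 2009)", 0.93),
    ("engineering interests (Su et al., 2009)", 1.11),
]

for label, d in examples:
    print(f"{label}: d = {d:.2f}, overlap = {overlap(d):.0%}, "
          f"P(superiority) = {prob_superiority(d):.0%}")
```

Under these simplifying assumptions, even d = 0.93 leaves the two distributions overlapping by roughly 64 percent, so a large average difference in interests is compatible with many individual women scoring higher on "things" interests than many individual men.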

In conclusion, based on the meta-analyses we reviewed and the research on the Greater Male Variability Hypothesis, Damore is correct that there are “population level differences in distributions” of traits that are likely to be relevant for understanding gender gaps at Google and other tech firms. The differences are much larger and more consistent for traits related to interest and enjoyment, rather than ability. This distinction between interest and ability is important because it may address one of the main fears raised by Damore’s critics: that the memo itself will cause Google employees to assume that women are less qualified, or less “suited” for tech jobs, and will therefore lead to more bias against women in tech jobs. But the empirical evidence we have reviewed should have the opposite effect. Population differences in interest and population differences in variability of abilities may help explain why there are fewer women in the applicant pool, but the women who choose to enter the pool are just as capable as the larger number of men in the pool. This conclusion does not deny that various forms of bias, harassment, and discouragement exist and may contribute to outcome disparities, nor does it imply that the differences in interest are biologically fixed and cannot be changed in future generations.
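
The variability point can be made concrete with a small sketch. It assumes two normally distributed groups with identical means, where one group's standard deviation is larger by a fixed ratio. The ratio of 1.1 used here is a hypothetical value chosen purely for illustration and is not taken from any study cited in this post; the function names are likewise illustrative. The sketch computes how over-represented the more variable group is above a given cutoff.

```python
import math


def upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def tail_ratio(sd_ratio: float, cutoff: float) -> float:
    """Ratio of group A to group B in the region above `cutoff`.

    Both groups share the same mean (0). Group B has SD 1 and group A has
    SD `sd_ratio`; `cutoff` is measured in group B's SD units.
    """
    return upper_tail(cutoff / sd_ratio) / upper_tail(cutoff)


SD_RATIO = 1.1  # hypothetical value for illustration, not an empirical estimate

for cutoff in (1.0, 2.0, 3.0):
    ratio = tail_ratio(SD_RATIO, cutoff)
    print(f"cutoff = +{cutoff:.0f} SD: ratio above cutoff = {ratio:.2f}")
```

In this toy model the ratio grows as the cutoff moves further into the tail, and by symmetry the same ratios apply at the bottom of the distribution. The sketch says nothing about the causes of any variance difference or about the ability of the individuals who are in the tail.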

If our three conclusions are correct then Damore was drawing attention to empirical findings that seem to have been previously unknown or ignored at Google, and which might be helpful to the company as it tries to improve its diversity policies and outcomes.

NOTE: This is a living blog post. We are updating it every few days in August, as commenters and colleagues guide us to new studies and offer us thoughtful criticisms. Our conclusions have changed slightly since our initial post. To reduce confusion, we have created a Google Doc that gives the original version of the post and then shows the major substantive changes we have made, particularly to the conclusions.

=======================

For further reading:

See this 2005 debate between Harvard professors Steven Pinker and Elizabeth Spelke, hosted by Edge.org, on “The Science of Gender and Science.”
Pinker, S. (2002). The blank slate: The modern denial of human nature. New York, NY: Viking.
Wood, W. & Eagly, A.H. (2012). Biosocial construction of sex differences and similarities in behavior. Advances in Experimental Social Psychology, 46.
Lee (2017), I’m a woman in computer science. Let me ladysplain the Google memo to you. (a good explanation of what Damore’s defenders are missing when they take the memo’s hedges and qualifications at face value.)
Kliff (2017), The truth about the gender wage gap. (an in-depth look at the gender wage gap.)
Miller, D.I. & Halpern, D.F. (2014). The new science of cognitive sex differences. Trends in Cognitive Sciences, 18(1), 37-45.
Hall, J.A. (1978). Gender effects in decoding nonverbal cues. Psychological Bulletin, 85(4), 845-857.
Diekman, A.B., Clark, E.K., Johnston, A.M., Brown, E.R., & Steinberg, M. (2011). Malleability in communal goals and beliefs influences attraction to STEM careers: Evidence for a goal congruity perspective. Journal of Personality and Social Psychology, 101(5), 902-918.
Diekman, A.B., Steinberg, M., Brown, E.R., Belanger, A.L., & Clark, E.K. (2016). A goal congruity model of role entry, engagement, and exit: Understanding communal goal processes in STEM gender gaps. Personality and Social Psychology Review, 21(2), 142-175.

=======================

Notes and Responses to Reader Comments:

1. The authors thank Alice Eagly for helpful comments and criticisms on our first draft.

2. We will soon address the issue of different variances in test scores between men and women, which was the key point of controversy in Larry Summers’ remarks in 2005. See the Pinker/Spelke debate for clear and conflicting presentations on that question. For a more recent analysis see Machin & Pekkarinen, 2008 and, especially, the supplementary materials (for an ungated summary click here).

3. The Morris (2016) study was added after a suggestion by Marco Del Giudice.

4. The most critical reactions to Damore often focus on his invocations of biology, particularly this line: “I’m simply stating that the distribution of preferences and abilities of men and women differ in part due to biological causes and that these differences may explain why we don’t see equal representation of women in tech and leadership.” We cannot review the enormous literature on biology, culture, and gender in this post; we will link to appraisals by biologists when we find them. But we do think it important to make one comment: Nearly all academic psychologists who study personality, cognitive abilities, and interests, including gender differences, say that nature (biology) and nurture (childhood socialization, social norms, social roles) are both essential for explaining development, even if most researchers tend to focus their own work on one or the other (see Halpern, 1997; Halpern & LaMay, 2000; Neisser et al., 1996; Nisbett et al., 2012). Here, for example, is Eagly and Wood (2013):

“Is nature or nurture the stronger influence on sex differences and similarities? If asked, most psychologists would probably reply that the question is misguided. Obviously, both are influential. (p. 1)…” “We believe that the future of science pertaining to gender and sex differences lies in overcoming ideological and identity biases and formulating theories that effectively integrate principles of nature and nurture into interactionist approaches.” (p. 12)

5. Zell, Krizan, & Teeter (2015) metasynthesis was added after a suggestion by Elena Zinova.

6. Del Giudice, Booth, & Irwing (2012) was added after reviewing literature on multivariate effect sizes.

7. Hyde et al. (2008) was added to the research table after a suggestion by Alice Eagly.

8. Stoet & Geary (2013) was added to the research table after suggestions by Alice Eagly and David Geary.

9. Miller, D.I. & Halpern, D.F. (2014). The new science of cognitive sex differences. Trends in Cognitive Sciences, 18(1), 37-45, was added to the further reading section after a suggestion by Alice Eagly.

10. Hall, J.A. (1978). Gender effects in decoding nonverbal cues. Psychological Bulletin, 85(4), 845-857, was added to the further reading section after a suggestion by Alice Eagly.

11. Stefanie Johnson: What the Science Actually Says About Gender Gaps in the Workplace was added to the Generally Critical section after a suggestion by Adam Grant.

12. Links (highlighted) added to this sentence: Furthermore, because women get better grades at all levels of schooling and score higher on a few abilities that are relevant to success in any job (e.g., reading comprehension, writing, social skills)… .

13. Both Diekman et al. (2011; 2016) papers added to the further reading section.

14. Uttal et al. (2013) added to the research table after a suggestion by Alice Eagly.

15. Our follow-up post was published September 4, 2017: The Greater Male Variability Hypothesis – An Addendum to our post on the Google Memo. This post explores the greater male variability hypothesis (see note 2).

16. Conclusions revised September 4, 2017.
