Gender Bias in College Admissions Tests

Approximately 1.3 million high school students annually take the Educational Testing Service's SAT I, America's oldest and most widely used college entrance exam. It is composed of two sections, Verbal and Math, each scored on a 200-800 point scale. Test questions are almost exclusively multiple-choice; a few "student-produced response" questions require the student to "grid in" the answer.

The SAT I is designed solely to predict students' first-year college grades. Yet, despite the fact that females earn higher grades throughout both high school and college, they consistently receive lower scores on the exam than do their male counterparts. In 2001, females averaged 35 points lower than males on the Math section of the test, and 3 points lower on the Verbal section.

A gender gap favoring males persists across all other demographic characteristics, including family income, parental education, grade point average, course work, rank in class, size of high school, size of city, etc.

Contrary to the test-maker's assertions, the gender gap does not merely reflect differences in academic preparation. ETS researchers Howard Wainer and Linda Steinberg found that on average, males score 33 points higher on the SAT-Math than females who earn the same grades in the same college math courses. The authors state that the "consistent under prediction of women's performance in college mathematics courses provides evidence that the SAT-M, used alone, is mismeasuring the profile of proficiencies that contribute to success in college."


American College Testing Program Assessment (ACT)
An alternative college entrance exam to the SAT is the American College Testing Program's ACT Assessment. This test is taken by approximately one million students each year, predominantly in the Midwest, Southwest and South. The ACT is composed entirely of multiple-choice questions and is divided into four sections: English, Mathematics, Reading, and Science Reasoning. The test is scored on a scale that ranges from 1 to 36.

Females also score lower than males on the ACT, although in recent years the gender gap has narrowed significantly. In 2001, women's ACT composite scores averaged 0.2 points lower than men's.

Although the ACT gender gap is smaller than that of the SAT, it is likely that this test also underpredicts the abilities of young women. For example, despite the fact that identical percentages of male and female ACT-takers take Algebra II and Chemistry, females' scores on the Mathematics and Science Reasoning sections of the test are significantly lower than males'.

Graduate School Exams
Like the SAT and ACT, graduate school admissions exams show score gaps between males and females. On the 1999-2000 Graduate Record Exam (GRE), the most widely used graduate school exam, females scored lower than males on all three sections of the test (each with a range of 200 to 800 points): 9 points lower on the Verbal portion, 97 points lower on the Quantitative section, and 25 points lower on the Analytic section.

The exam widely used in medical school admissions, the Medical College Admissions Test (MCAT), also shows a persistent edge for male test-takers. In 2000, males outscored females by 0.1 points on Verbal Reasoning, 1.0 points on Physical Sciences, and 0.7 points on Biological Sciences, on a 1-15 point scale. Both groups received comparable scores on the Writing Sample.

The Graduate Management Admissions Test (GMAT), used by most business schools in the United States, also disadvantages females. The average scores for 1999-2000 test-takers showed women 34 points below their male peers on the 200-800 point scale.

The gender gaps on graduate school admissions exams take a particularly heavy toll on educational equity given the strict score cut-offs many programs employ. In undergraduate admissions, high school grades and test scores are generally (though not always) considered in conjunction with one another. Graduate schools, by contrast, more often set score minimums that adversely affect the admission of females and students of color.


Why the Gender Gap?

Although it is clear that university admissions tests underpredict females' abilities, there is no definitive answer to what causes this bias. It appears that several factors contribute to the gender gap.


Biased test questions
A 1989 study by Phyllis Rosser, The SAT Gender Gap: Identifying the Causes, found that the vast majority of questions exhibiting large gender differences in correct answer rates are biased in favor of males, despite females' superior academic performance. Rosser found that females generally did better on questions about relationships, aesthetics and the humanities, while males did better on questions about sports, the physical sciences and business.

This conclusion is supported by an earlier study by ETS researcher Carol Dwyer, who provides some historical perspective on the gender gap in her 1976 report. She notes that it is common knowledge among test-makers that gender differences can be manipulated by simply selecting different test items. Dwyer cites as an example the fact that, for the first several years the SAT was offered, males scored higher than females on the Math section but females achieved higher scores on the Verbal section. ETS policy-makers determined that the Verbal test needed to be "balanced" more in favor of males, and added questions pertaining to politics, business and sports to the Verbal portion. Since that time, males have outscored females on both the Math and Verbal sections. Dwyer notes that no similar effort has been made to "balance" the Math section, and concludes that, "It could be done, but it has not been, and I believe that probably an unconscious form of sexism underlies this pattern. When females show the superior performance, 'balancing' is required; when males show the superior performance, no adjustments are necessary."


Multiple-choice format
A joint study by the Educational Testing Service and the College Board concluded that the multiple-choice format itself is biased against women. The study examined a variety of question types on Advanced Placement tests (like the SAT, made by ETS for the College Board and administered to college-bound seniors) and found that the gender gap narrowed or disappeared on all types of questions (e.g. short answer, essay, constructed response) except multiple choice. Similar results were also found with the California Bar Exam and the SAT's English Composition Test with Essay. The researchers conclude, "The better relative performance of females on constructed-response tests has important implications for high-stakes standardized testing... If both types of tests measure important education outcomes, equity concerns would dictate a mix of the two types of assessment instruments."

The SAT is scored with a "guessing penalty," which deducts one-quarter point for every incorrect answer. Questions left blank are simply scored as zero. The intent of this policy is to make random guessing inadvisable. However, since one or two answer choices can usually be eliminated as obviously incorrect, it is often in the test-taker's best interest to make an educated guess.
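The arithmetic behind educated guessing can be sketched in a few lines of Python. This is only an illustration: it assumes five answer choices per multiple-choice question (a figure not stated above), and the function name and parameters are hypothetical.

```python
from fractions import Fraction


def expected_guess_score(num_choices=5, eliminated=0, penalty=Fraction(1, 4)):
    """Expected raw-score change from guessing at random among the
    answer choices that remain after some have been ruled out.

    num_choices=5 is an assumption for illustration; the one-quarter
    point penalty is the figure given in the text.
    """
    remaining = num_choices - eliminated
    if remaining < 1:
        raise ValueError("at least one answer choice must remain")
    p_correct = Fraction(1, remaining)
    # +1 point if the guess is right, -penalty if it is wrong.
    return p_correct * 1 - (1 - p_correct) * penalty


if __name__ == "__main__":
    for eliminated in range(4):
        gain = expected_guess_score(eliminated=eliminated)
        print(f"{eliminated} choices eliminated: expected gain {gain} point(s)")
```

Under these assumptions a purely random guess has an expected value of zero, the same as a blank. Ruling out one choice raises the expected gain to 1/16 of a point per question, and ruling out two raises it to 1/6, which is why leaving such questions blank tends to cost points over a whole test.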

Research indicates that males are more likely to take risks on the test and guess when they do not know the answer, whereas females tend to answer the question only if they are sure they are correct. Unwillingness to make educated guesses on this exam has been shown to have a significant negative impact on scores. 

The ACT does not have a guessing penalty, which may be one reason why the gender gap on that test is much smaller.

Another factor that contributes to the gender gap is the fast-paced, or "speeded" nature of the test. On some sections of the exam, students must answer as many as 35 questions (some of them requiring lengthy reading passages) in 30 minutes--an average of only 51 seconds per question.
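The per-question time budget follows directly from the section length. A minimal Python sketch (the function name is hypothetical) reproduces the figure:

```python
def seconds_per_question(num_questions, minutes):
    """Average time available per question, in seconds."""
    if num_questions <= 0:
        raise ValueError("num_questions must be positive")
    return minutes * 60 / num_questions


if __name__ == "__main__":
    # 35 questions in 30 minutes, the example given in the text.
    budget = seconds_per_question(35, 30)
    print(f"{budget:.1f} seconds per question")
```

This average includes any time spent reading passages, so the time left for the questions themselves is shorter still.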

Substantial evidence exists that females approach problem-solving differently than males; they are more likely to work a problem out completely, to consider more than one possible answer, and to check their answers. While these are desirable traits in school and in life, they work against females on an exam that is supposed to predict their ability to do academic work.

Numerous studies have found that when the time constraint is lifted from the test, females' scores improve markedly, while males' remain the same or increase slightly. Untimed administrations of the test still show a small score difference between males and females, suggesting that speededness is only one of several factors that bias the exam against young women.

The Test-Makers' Excuse
Some test company officials have suggested that the gender gap is caused by the fact that more females take the tests than males. They argue that the larger pool of females includes more low-scoring students, which in turn reduces the average score for women.

In fact, research shows that the size of the female testing pool does not explain the gap. A study by L.M. Sharp, for example, finds no evidence that females' lower scores can be attributed to the larger number of women taking the exam, and concludes that the causes of the gap lie elsewhere than in the demographic makeup of the male and female testing populations.

Twice as many males as females achieve SAT scores over 700. If the scoring gap were caused solely by the larger pool of females taking the exam, females should still attain the same percentage of high scores as males. In fact, the opposite is true: the gender gap is largest in the highest score ranges.

Federal District Judge Walker, in a 1989 decision barring New York's use of SAT scores alone to award scholarships, concluded that "...under the most conservative studies presented in evidence, even after removing the effect of [factors such as ethnicity, parental education, high school classes, and proposed college major], at least a 30 point combined differential [out of approximately 60 points] remains unexplained."
