MAY 2, 2006
Good morning. I am Robert Schaeffer, Public Education Director of the National Center for Fair & Open Testing, based in Cambridge, Massachusetts. FairTest, as the organization is popularly known, was founded in 1985 by leaders of major civil rights, education reform and student activist groups to serve as a non-profit monitor of the standardized testing industry.
For more than two decades, FairTest has advanced quality education and equal opportunity by promoting fair, open, valid and educationally beneficial evaluation of students, teachers and schools. At the same time, we work to end the misuses and flaws of testing practices that impede those goals, placing special emphasis on eliminating the racial, class, gender, and cultural barriers to equal opportunity posed by standardized tests. There’s detailed information about our history, mission, goals and agenda on our website, www.fairtest.org
It is a privilege to be invited to testify before the New York State Senate Committee on Higher Education. As every assessment reformer knows, this committee was the creator of the landmark university admissions Truth-in-Testing laws which have forced many of the historically secret activities of the testing business into the public limelight. It is also a personal pleasure to be here today, since the committee chairman, Sen. Kenneth LaValle, the “father” of truth-in-testing, represents Port Jefferson on Long Island, where I graduated from Earl L. Vandermeulen High School.
New York’s excellent truth-in-testing law – requiring periodic disclosure of admissions test questions and answers, as well as publication of internal research reports about the exams’ validity and fairness – is, unfortunately, an anomaly. For the most part, as the SAT scoring fiasco which brings us here today demonstrates, the industry whose products are used to enforce educational standards and accountability has no enforceable quality control standards and lacks basic accountability to students, teachers, and the public.
The truth is that there is stronger public oversight and control over the food we feed our pets than for the tests administered to our children.
To see how unaccountable the testing industry can be even with regard to its flagship standardized exam, let’s briefly review the timeline of what the Chronicle of Higher Education, the trade paper of record for colleges and universities, called the SAT scoring “Debacle”:
– On October 8, 2005, 495,000 high school students from across the nation took the SAT I exam. These test-takers paid a basic fee of $41.50 each and expected accurate scores to be delivered within three to four weeks.
– Somehow in the next few days, through a process that has still not been adequately explained, some of the test papers were allegedly contaminated with water. We have heard a variety of explanations for this problem; with all due respect, none of them “holds water.” If the root cause was heavy rain in the northeast, how were answer sheets from more than 30 states impacted? And, if the issue was humidity at the Austin test-scanning center, why was this a problem during a period of record drought in the area? Moreover, has it not been rainy or humid at hundreds of test centers around the country on every Saturday when the SAT has been administered?
– Soon after receiving their scores in late October, some students began complaining that the results were inaccurate. Several were so sure that errors had occurred that they were willing to request a little-known special form from the College Board, wait days to have it delivered, fill it out and pay another $50 to have their tests scored by hand.
– Additional weeks passed before hand scoring took place. The College Board claims the “industry standard” for this process is three to five weeks, hardly an accelerated pace when an important test result is in question.
– Once hand scoring revealed a systematic error, the College Board and Pearson say they began rescanning answer sheets. Somehow this process took another full month, even though the answer sheets were already in their hands.
– During this entire period, as the 2006 admissions season moved into high gear, no warning about this problem was given to test-takers, guidance counselors or college officials. Not a word was mentioned at any of the College Board’s regional meetings with its members, nor to the news media.
– Finally, on March 6 and 7, 2006, five months after the test was administered, the College Board told its stakeholders about the problem. But it did not tell them the full truth. The initial set of news stories reported that the errors were “less than 100 points.”
– Later that week, the College Board changed its story. Rather than “100 points,” errors were as large as 400 points.
– Then the next week, another update: 1,600 answer sheets involved in a still unexplained “special exceptions process” at the Educational Testing Service (ETS) had not been rescanned.
– And four days later, yet another correction. Approximately 27,000 more answer sheets had been found by Pearson still not rescanned five and a half months after the tests were administered and at least six weeks after the College Board claims that it first noticed “something odd” in the scoring process.
Reviewing this chronology in a memo to his association’s members, College Board president Gaston Caperton likened this series of events to a “sharp rock in my shoe.” I’d venture that students, guidance counselors and college admissions officers have a number of more painful analogies in mind.
The SAT scoring fiasco is hardly the only high profile error made by the testing industry in recent years.
In 2001, the Educational Testing Service admitted that the Graduate Management Admissions Test (GMAT) scores of nearly one thousand business school applicants were wrong by as much as 80 points. ETS explained that the problem developed when “some questions in the Quantitative section were incorrectly counted as not having been answered.” Test-takers were not informed of the problem until ten months after some had taken the exam. That means college seniors taking the GMAT to enter business schools in the fall semester had not been notified of the mis-scoring until after admissions decisions were made and classes had begun. In fact, some test-takers may never have received notice of the error because they had long since moved from their undergraduate addresses. At least one business school has taken action to correct errors in its admissions process caused by the GMAT scoring error. Brigham Young University invited seven applicants it had rejected in the fall 2000 admissions cycle to reapply for 2001 because their scores were 40 to 60 points higher than originally reported. In addition, a lawsuit has been filed in Federal court seeking damages, and ETS has lost the GMAT contract.
Even while the SAT scoring error story was unfolding, ETS tried to quietly settle another case in which an egregious scoring mistake damaged thousands of prospective teachers. More than 27,000 PRAXIS licensing exams were mis-scored; 4,100 test-takers were falsely told they had failed when they had actually posted passing scores. Jobs were lost, degrees denied, and lives disrupted. The financial settlement in this case totals $11.1 million.
Pearson, the third partner with the College Board and ETS in processing the SAT, is a test-scoring multiple offender. In 2000, Pearson’s NCS subsidiary made a scoring error on the Minnesota high school graduation test that wrongly denied 50 students the diplomas they had earned. After a Minnesota court concluded that “the error was preceded by years of quality control problems at NCS” due to a “culture, emphasizing profitability and cost-cutting,” Pearson created a settlement pool of $7 million for test-takers and paid $16,000 to each of the seniors who missed graduation.
Despite the Minnesota sanction, Pearson mis-scored another high-stakes test last year, this time the on-line graduation exam in Virginia. Sixty students were told they had failed when they actually had passed. Pearson ended up offering $5,000 college scholarships to five students who had been kept from graduation.
These are not isolated incidents. The SAT scoring problem is only the most recent in a long, growing chain of errors. Rather than go on, let me refer you to two excellent compilations of testing industry problems: Errors in Standardized Tests: a Systematic Problem by the Boston College-based National Board on Educational Testing and Public Policy and Margins of Error: The Education Testing Industry in the No Child Left Behind Era by the group Education Sector. I am providing copies of both for the Committee’s reference.
The testing industry problem examined here today and the many others documented by independent experts demand your immediate action. Here are five fundamental steps toward reform:
– Continue investigating to determine precisely the root cause of the SAT scoring error. Even with today’s testimony, we still do not know with certainty: how SAT answer sheets absorbed water; when and where this contamination occurred; why it never took place previously, given that rain and/or high humidity during the test administration period is hardly a unique phenomenon; how the problem was detected; who was in charge of tracking down its source and impact; and, most importantly, why no one was notified of the scoring error until five full months after the SAT was administered and why it took several more weeks before the College Board and Pearson got their stories straight.
– Strengthen Truth-in-Testing by requiring manufacturers of college admissions exams to automatically return scored answer sheets and copies of questions to test-takers as a check on grading accuracy. This is already the practice with the Law School Admissions Test. We can predict the test-makers’ reaction: “it costs too much and will be burdensome to implement.” That’s precisely what they said nearly two decades ago when the New York State Legislature first debated Truth-in-Testing. Their excuses then and now are inaccurate: how complicated is it to photocopy each student’s response form and mail it back? Surely the true cost of this process is nowhere close to the $34 now charged to obtain a set of questions, wanted answers and a copy of the actual answer sheet. Note that money is hardly an object when it comes to other test-maker activities: according to the returns filed with the IRS by their firms, College Board president Caperton receives total annual compensation in excess of two-thirds of a million dollars; ETS president Landgraf gets over one million per year.
– Extend Truth-in-Testing protections for disclosure and research publication to cover K-12 tests, such as the Regents and other exams administered in elementary, middle and high schools as well as occupational licensing tests, such as the Bar exam, because potential errors in these exams can also have profound impacts on human lives. When a test is misused as an absolute hurdle that must be met for such high-stakes outcomes as grade promotion, gifted-and-talented program admissions, graduation, financial aid qualification or job certification, independent checks for accuracy and fairness are mandatory.
– Bar the use of “cut-off” requirements – absolute score minimums – to award scholarships at New York’s public universities. Both SUNY and CUNY campuses offer dozens of such scholarships. The SAT scoring debacle demonstrates that test results are too imprecise, too subject to error to be used as the sole factor to make such high-stakes decisions. This restriction would be consistent with the College Board’s own guidelines for proper test score use.
– Push for regulation of the standardized testing industry. There needs to be a public agency along the lines of the Food and Drug Administration that requires tests to be proven the equivalent of “safe and effective” before they can be administered. Independent oversight would also include a system to receive and investigate claims of “defective products” and bar repeat offenders from the marketplace. The notion that standardized testing companies can regulate themselves is ludicrous. Just look at how the College Board and Pearson mismanaged the SAT scoring error fiasco. Even in its final stage, the test-makers continue to avoid any external oversight: the firm hired to “investigate” the incident, Booz Allen Hamilton, is the same company that received more than $3 million for regular consulting work with the College Board last year. Oversight independent of the testing industry and its regular consultants is mandatory.
We simply cannot trust the testing industry to oversee itself. So long as standardized exams are used, and often misused, to make high-stakes decisions in areas such as college admissions, scholarship qualification, grade promotion, teacher competency, and school quality ranking, test-takers and the public need to be assured that these tools are accurate, fair and educationally sound.
The demand for high standards and accountability must include the testing industry itself.