National Research Council Criticizes High-Stakes Testing

K-12 Testing

"An educational decision that will have a major impact on a test taker should not be made solely or automatically on the basis of a single test score."

-- National Research Council


A new study by the National Research Council (NRC) of the National Academy of Sciences sharply criticizes the use of standardized tests as stand-alone requirements for grade promotion, tracking decisions, and high school graduation. These practices are escalating across the nation, even though they contradict the standards of the measurement profession and often even test publishers' guidelines. As the NRC report shows, this misuse of tests is in service to policies that are themselves mostly harmful. Unfortunately, the report allows some loopholes in the area of testing for high school graduation.


High Stakes: Testing for Tracking, Promotion and Graduation also calls for much stronger monitoring of tests and test (mis)use. It does not recommend any particular approach, but calls for investigating the benefits, drawbacks and potential effectiveness of various approaches to monitoring or regulating testing. The report was prepared by the Committee on Appropriate Test Use of the NRC's Board on Testing and Assessment (BOTA).


Graduation Tests

The report devotes one chapter each to tracking, promotion and retention, and graduation, arguing that the measurement issues for each are somewhat different. Nonetheless, the report continually repeats, though with some caveats, that these decisions should not be made solely on the basis of a test score.


On graduation, however, the Committee appears to have retreated from this position in the face of the reality that nearly 20 states employ high-stakes graduation tests. The report seems to argue that while tests should not become sole hurdles for graduation, if they are so used they need to be very reliable, based on clear expectations, and only administered once students have had a real opportunity to learn the material on which they will be tested.


In part, the issue can be seen in the use of the term "single test score." According to one of the report's editors, Jay Heubert, some members of the committee viewed the opportunity to take graduation tests a number of times (as all states with such exams allow) as not using a "single test score." Others on the committee understood that simply repeating the same exam is effectively relying on a "single test score," as it does not provide for alternative but equivalent methods of assessment that might be more accurate and fair for some students.


The report does suggest allowing a student's strong performance on one indicator, such as grades, to offset low scores on a test, but it fails to include this option in its chapter on recommendations. Nor does the report investigate the plans of Minnesota and Delaware, which intend to allow students who do not pass the state's test to demonstrate their achievement by other means. Thus, on the key issue of high school graduation tests, the report endorses the position that tests should not be used as sole hurdles for graduation, then undercuts that position and fails to explore the alternatives adequately.


On the issues of tracking and grade retention, the report is stronger and does not attach caveats to the standard that tests should not be sole hurdles. The report strongly emphasizes that large-scale assessments should not be used to make decisions about children below age eight or grade three. It also clearly stipulates that if a national exam is implemented, it should not be used for making high-stakes decisions.


The report also includes detailed and very valuable chapters on testing students with disabilities and English language learners. It notes the serious limitations of current tests, measurement dilemmas of allowing accommodations or adaptations, and the need for significantly improved assessment practices in these areas.


Regulating Tests

As FairTest has long pointed out, the tests we administer to our children are less regulated than the food we feed our pets. The NRC report explores the use of both professional standards and litigation as means for reducing testing malpractice, and finds both wanting. The professional standards have no means of enforcement; litigation is expensive, cumbersome, too blunt an instrument, and often ineffective in the absence of legal prohibitions against test misuse; and the industry clearly will not police itself.


The report recommends no particular approach to regulation, but rather discusses several "strategies for promoting appropriate test use." The alternatives include "deliberative forums" for public discussion of proper and actual test use; creation of an independent oversight body along the lines of Consumer Reports; labeling, as with content labels on food packaging; and federal regulation. The potential benefits and drawbacks of each approach are explored, with the NRC concluding that "ensuring appropriate test use will require multiple strategies" and that further research on these and other policy options is needed.


One such option, the oversight body, will now be receiving an extended tryout. The Ford Foundation has awarded the National Commission on Testing and Public Policy one million dollars to devise a means to monitor and report on testing. The theory is that monitoring and publicly discussing testing programs will lead to improved testing practices. One of its first activities will be to consider the use of tests in college admissions.


- High Stakes is available from National Academy Press, 2101 Constitution Ave., NW, Washington, DC 20418; (800) 624-6242. The full report, about 300 pages long, is also available online, and the executive summary and recommendations (Ch. 12) can be downloaded separately.