State Assessment Systems Need Major Reforms

Two-thirds of state K-12 student assessment systems do not reach even the middle level of system quality, according to a new FairTest study of testing programs in all 50 states. One-third of the systems need a complete overhaul and another third need major improvements if they are to support high-quality teaching and learning.


The report, Testing Our Children: A Report Card on State Assessment Systems, evaluates how state assessment practices measure up against the standards set forth in Principles and Indicators for Student Assessment Systems, a 1995 publication by the National Forum on Assessment, a coalition of education and civil rights groups.


In the new study, supported by the Joyce and Ford Foundations, each state testing program is ranked based on a five-point scoring guide. Only one state, Vermont, reached the top level and is close to having a model system. Six reached level four, needing "modest improvement." All other state assessment systems require more fundamental changes: seven need "some significant improvements," 17 need "many major improvements," and 16 need "a complete overhaul." Three states did not have enough of a state system for scoring.


The most important factor in the FairTest evaluation is whether the assessment system helps improve student learning, the first principle of the Principles. Testing Our Children shows that although a small number of states have significantly improved their assessment systems, the majority have made only minor improvements over the past decade. While many have changed the label from "test" to "assessment," they continue to rely on traditional standardized exams that differ little from those administered a decade ago. And while most states have adopted content standards, many tests are still not based on those standards and many more fail to adequately assess all areas of the standards.


Worst States

The states with the worst rankings generally rely on multiple-choice tests, use norm-referenced exams, and attach high stakes, such as high school graduation, to test results. According to the Principles, multiple-choice tests measure only a narrow range of achievement. Relying on such tests leads schools to emphasize rote memorization over critical thinking. Such tests also tend to have a disproportionately negative impact on low-income and minority students, who are often subjected to a dull, "drill and kill" curriculum by teachers under pressure to raise test scores quickly. In addition, overly frequent testing and test coaching take valuable time away from more meaningful instruction. These problems are worst when stakes are highest.


The FairTest study also found that states in the south have the most testing and make the least use of performance assessments. Most of the states with high school exit exams are also in the south; many of the rest are in the northeast. Both regions have above-average concentrations of African American students. In its 1988 report, Fallout From the Testing Explosion, FairTest found that districts with large proportions of African American students tested more often. Thus, it appears that African American students are disproportionately likely to be heavily tested with low-quality tests and to be subject to high-stakes exams.


Best States

The states with the top rankings in the FairTest study tend to rely on multiple measures of achievement, with strong use of performance assessments or portfolios, and do not make high-stakes decisions based on the results of any one exam. Some rely on sampling for school and district accountability rather than testing every child.


These assessment systems are more in line with the Principles because they seek to measure and support critical thinking, creativity and the ability to use knowledge in real-life situations. They also provide students from widely different backgrounds with multiple methods for demonstrating what they have learned.


Despite the many problems cited in the report, some improvements in state testing programs have been made over the past decade. These include the use of writing samples in 38 states (though many of these date from the 1980s and are often quite narrow, simply requiring students to write to a prompt), the use of constructed-response items in many states (though their use remains far too limited), and greater attention to bias reduction.


Only a few states rely primarily on extended constructed-response, performance and portfolio assessments. These include Vermont, Maine, and Kentucky. A few other states make substantial use of such methods, including Maryland, Connecticut, Rhode Island and Colorado.


On the positive side, bias reduction efforts are fairly extensive. However, most states do not do a good job of including students with special needs in state assessments by using appropriate accommodations or alternate tests. Professional development in assessment remains weak in most states. Moreover, few states do a very good job of reviewing their state testing programs.


Testing Our Children concludes that the promise and opportunity of school reform in the 1990s is to break with the outmoded factory model of schooling. To help accomplish this, assessment must be fundamentally restructured. The Principles provided a blueprint for how this can be done, but this report finds that most states have a long way to go to meet the Principles' objectives of a truly fair and educationally sound system.

