Florida Test Scoring Error Highlights Exam Flaws

FairTest Examiner - July 2007

"FCAT Fiasco: Scores Wrong" read the headlines across Florida. After more than a year of bragging about soaring student performance on the 2006 Florida Comprehensive Assessment Test third grade reading exam, state officials glumly admitted that scores were inaccurately high. The error was discovered when local superintendents and principals were unable to reconcile a plunge in scores for 2007 with the reported gain from the previous year.

Their complaints led to an internal review, which concluded that an "equating error" occurred when anchor questions, items that were supposed to assure test score consistency from year to year, were administered in a different order. More than 200,000 tests had to be rescored by Harcourt Assessment, the state's contractor.

The public policy consequences of the error were widespread and significant. The FCAT is among the most misused tests in the nation. Its results are used to determine grade retention, graduation, school grades, voucher eligibility, state sanctions and teacher bonuses, as well as serving as the measure for Adequate Yearly Progress under the federal No Child Left Behind Law. Thus, all decisions made on the basis of 2006 FCAT reading results were called into question.

Unlike his predecessor, Jeb Bush, who rejected any criticism of the test, Governor Charlie Crist, a moderate Republican, pledged a full review of the scoring problem. "It doesn't raise my confidence. I can tell you that," he said. The chair of the state Senate Education Committee, a long-time FCAT supporter, went even further, stating, "This is a shot below the water line, and it's self-inflicted."

The Florida Coalition for Assessment Reform (FCAR) applauded the administration's pledge of openness and transparency while endorsing "strict accountability" for the FCAT. In a public letter to Governor Crist, FCAR called for a thorough investigation of the test's design, scoring and use (http://www.fcarweb.org/docs/Crist-Blomberg_letter_5_07.htm). Along with a review of the specific circumstances surrounding the equating error, FCAR sought a more comprehensive examination of the test's validity, reliability and fairness, including hearings across Florida. (FCAR is part of FairTest's Assessment Reform Network.)

State officials responded almost immediately by establishing a review committee and inviting FCAR, as well as the state teachers association, another test critic, to nominate members. Dr. Robert Lange, a retired University of Central Florida education professor who has published several scathing reports about the misuse of the FCAT for grade retention, now sits as FCAR's representative on that panel.

Unfortunately, the state education department now appears to be trying to limit the scope of the review to little more than the technical causes of the equating error. It plans to hire a consulting firm to conduct that investigation rather than hold broad public hearings. While many of the potential bidders are academic institutions, several are in the business of manufacturing or selling standardized tests, creating the potential for conflicts of interest, if not a cover-up.

Though not all of the FCAT's flaws may end up being aired, the incident has exposed the fallacy of relying on any one test to make high-stakes educational decisions, and it has increased the clout of local testing reformers. One result is that the 2008 session of the Florida legislature is expected to take up proposals to scale back use of the FCAT as the state's primary accountability mechanism.