For the last 20 years, the world of public education has loved to debate the value of standardized testing and policies associated with the results. This debate has become an all-consuming distraction for some, as test scores are concise and easily digestible. Kids either pass or fail. Students in different demographic groups either meet standards or don’t. Debates thrive on binary choices, which standardized tests provide. Either they’re good for purposes of accountability, or they’re bad for kids because they’ve narrowed the curriculum. They’re either good for shining a light on the gross inequities in our schools or bad because of all the statistical noise in the results.
However, by arguing over such binary positions, we avoid having to wrestle with much more difficult questions having to do with resource equity, the legacy of institutional racism, the role of poverty, issues of governance, and how to organize systems to make schools more effective. Binary debates don’t allow for such complexities. As we move toward recovery from COVID-19, though, educators find themselves immersed in these sorts of messy questions. And the pandemic-induced pause in standardized testing may be bringing about a welcome change in the terms of the standardized testing debate.
Assumptions undergirding standardized tests
Standardized testing has been part of public education for nearly a century. From the first use of IQ tests to the manufacturing-based redesign of school systems, authorities have craved measurement. Assumptions about the need for standardized tests have been reasonably straightforward. Public dollars fund public education, and standardized tests enable state and local authorities to determine the return on the investment of those dollars. However, it wasn’t until the advent of the standards movement in the late 1980s and early 1990s, and the resulting enactment of the No Child Left Behind Act in 2002, that state standardized testing gained its strong foothold in K-12 public education. Since that time, test results have become the impetus for both increased investments in schools that struggle to meet state standards and public disgrace for those that fall short. While there have been recent efforts under the Every Student Succeeds Act to broaden the public’s conception of success through using multiple sources of data to gauge school effectiveness, standardized testing remains hegemonic in American public education. That hegemony rests on a set of flawed assumptions.
The first flawed assumption is one of high modernism. In a 2020 article in the Harvard Educational Review, Jack Schneider of the University of Massachusetts Lowell and Andrew Saultz of Pacific University describe how “the high modernist state . . . develops quantitative systems designed to measure performance. Such systems, by their nature, tend to ignore the nuances of reality on the ground.” Performance-management systems built on state tests dismiss the human element of change management. Policies based on them are grounded in a technocratic belief about how to improve teaching and learning. Their adherents’ lack of faith in the professionalism of educators and admiration of simple data over all else have not led to appreciably better results, at least as measured by the tests.
Technocracy assumes that through performance-management systems that start with a standardized test, adults in schools will be spurred to action. Students take a test, scores come back, schools analyze the data to determine actions meant to improve results, new initiatives are funded, educators undergo professional development, the new initiatives are implemented, students take the test again in the spring, and results are reviewed to determine what new actions should be taken. This pattern, repeated in school systems throughout the country ad nauseam, doesn’t take into account how adults actually learn new skills that will help them improve their practice.
The standardized-testing hegemony also assumes that test scores are valid measurements of student performance and that English language arts (ELA) and mathematics are the two most important content areas for students to master. Standardized tests can be useful as blunt instruments to help us understand the progress, or lack thereof, that a school or system is making. They can also be the first step in a deeper inquiry into teacher effectiveness and student need. Their use within policies that affect the real life of schools, however, is specious, as they only account for one aspect of student learning. Moreover, some of those policies, such as teacher evaluations that heavily rely on scores and classifications of some schools as failing or succeeding, are simply not grounded in sound practice. State tests tell us something, but it’s more about student demographics and factors external to schools than anything else.
The assumption that ELA and math are the most important content areas for students to master seems obvious. After all, we all need to read, write, and calculate to live in the world and be eligible for good-paying jobs. Yet critical thinking, problem solving, emotional intelligence, scientific literacy, and civics are also essential to becoming a well-educated person. These domains are difficult to measure at scale, and definitions of them are likely to differ among districts within a state and throughout the country. Civic education in liberal Montgomery County, Maryland, where I was superintendent of schools, can lead to the Board of Education debating whether students should have excused absences for attending protests in neighboring D.C. Another board, even in a blue state like Maryland, may not agree. Efforts to address student social-emotional learning and emotional intelligence are increasing exponentially, yet measuring these competencies is a new and tenuous proposition, and there isn’t agreement throughout the country on what the terms even mean. Using them within a state accountability system, let alone a federal one, is fraught with difficulties. Thus, we’re left to succumb to the narrow dominance of ELA and math as indicators of success. Curriculum has been narrowed as a result, and an entire generation of students has come to believe that their value is reflected in their test scores.
A different set of possibilities
We know the mixed results of the last 20 years of reform. We don’t, however, know the counterfactual. What would our schools look like today, and how well prepared would our children be, if we had focused our collective energies on what we know actually works to improve outcomes? Let’s say, for example, that the enormous effort put into convincing state legislators of the value of standardized testing had gone instead toward creating equitable funding formulas. Money matters in public education, especially for serving the most vulnerable students. Yet, funding formulas still perpetuate gross inequities. Moreover, given what we know about the relationship between a family’s economic status and student achievement, what would have happened if the collective effort of reforms had focused on addressing entrenched societal issues? If states and communities had invested in wraparound services, public transportation that enables the working poor to get to jobs, better housing, affordable preventative health care, and food security, we would expect our children to have better academic outcomes.
Within schools, what would have happened if all of the money and energy that’s gone into standardized testing had instead been invested in strengthening the profession and scaling what actually works to improve schools? We know that great schools have a foundation of internal accountability, constant and collaborative professional learning, strong family engagement, distributed leadership, and a rich curriculum (among other things). What if districts had been incentivized to organize change efforts around those elements, rather than focus solely on ELA and math achievement? Simply put, we don’t know what would have happened. We do know, however, that what we have invested in hasn’t brought us the results that we need for our kids.
So, what can we do going forward? How might we ensure excellence, equity, and accountability without annual standardized tests? I want to be clear; I am not arguing for a collective dismissal of the use of data, nor do I believe we should be shirking accountability. Actually, I think the opposite. We need more and better data to make decisions in schools, districts, and states, and we need better accountability metrics to weed out mediocrity. We must also continue to disaggregate data according to student demographics, given their correlation to school success and our moral imperative to address the needs of the most vulnerable. The COVID-19 crisis has revealed to a larger public the gross inequities in our schools and the needs of our children, and it provides an opportunity to create a new baseline.
We can begin by ending the practice of giving annual tests to all kids in most grades. If the unit of change for improvement efforts is (or should be) the school, then a sample of students can fulfill external accountability requirements while educators within schools, with help from the central office, focus on the needs of individual students and staff based on more authentic assessments of student learning. Many other nations use a sampling methodology rather than a census one. The National Assessment of Educational Progress (NAEP) tests a sample of students, as does the Programme for International Student Assessment (PISA), and results from both are used to decry and celebrate performance. Sampling is an effective method to understand patterns and can be used as a launching pad to probe further into deficiencies and strengths.
A sample of 3rd graders in every school in reading every year can be one part of a process to determine whether a school’s approach to literacy is effective. Start this spring, with a quick turnaround time for results, and then use those data at the local level to make decisions about interventions and supports. The standardized test in this scenario can allow a school and district to understand its status relative to a standard set by the district or state. A board and superintendent can then allocate the necessary resources to help the school improve its practice, without having to subject every student to a standardized test. I believe we should also do sample testing for ELA and math in 5th and 8th grades. These would give a district’s leadership a sense of where each school stands in relation to similar schools and whether its leadership and improvement strategies are having the desired effect. There would, of course, need to be a commensurate investment in formative assessments and professional learning to increase teachers’ knowledge and skills about assessment. But the return on that investment would be significant, as it would allow teachers to more quickly address individual students’ needs. This approach would significantly diminish the amount of time and energy spent on preparing each child to pass a standardized test, which could then be put toward actually improving teacher practice.
In high schools, we have enough measures to determine whether students are ready for college and careers. Industry certification tests, Advanced Placement and International Baccalaureate exams, the SAT and ACT, higher-level courses, writing samples, lab reports, community service, extracurricular participation, and acceptance into college without the need for remediation are just some of the indicators available. Rather than spend time and energy on a standardized exam in 10th grade, let’s focus on eliminating low-level courses and revising curriculum to be engaging, problem-based, and culturally relevant. Those steps actually have an impact on student performance. And in the meantime, we must invest in developing teacher capacity to conduct and use formative assessments to adjust instruction.
Federal and state oversight of public schools through the high-modernist regime of standardized testing isn’t having the desired effect. If ever there were a time to reduce tests and help orient schools toward equitable instructional practices that actually increase student achievement, that time is now. Surely, many will see this as a retreat from accountability and a dismissal of equity, as annual census testing is assumed to ensure that we know each child’s status and the gross inequities in our schools. But we also know it hasn’t worked for the last 20 years. While teachers, parents, communities, and schools have come to value the role of public education more than ever during the COVID-19 crisis, we owe it to them to focus our collective efforts on what actually works, not on a theory of action that has been proven false.
Note: This column was also published in Education Next on Dec. 22, 2020.
ABOUT THE AUTHOR

Joshua P. Starr
Joshua P. Starr is the managing partner at the International Center for Leadership in Education, a division of HMH, based in Boston, MA. He is the author of Equity-based Leadership: Leveraging Complexity to Transform School Systems.

