Education and the Analysis of Student Tests: Current Trends and Recommendations for Practice
In both the practical realm of educational provision and the realm of education policy and legislation, the need to measure the efficacy of teaching methods and educational programs effectively and accurately is of paramount concern. There is a legislative mandate that all children in the United States have a right to a free and equal public education, and ethical principles also insist that all students receive the same opportunities for learning and growth. Given the practical constraints of providing public education on such a broad scale, it is also important that most students progress at approximately the same rate, so that instruction can be kept meaningful for all students. While this often translates to teaching towards the lowest knowledge and skill levels represented in the class, it ought to mean identifying struggles and problems and helping students overcome them so that learning can proceed at a higher rate.
One of the methods for addressing these and other issues, and the key method in the United States and many other countries, has been the implementation of standardized testing. These tests are administered to all or most students within a population (students and parents usually have the ability to opt out of such testing) and are meant to measure the relative knowledge, skill, or ability level of each student within their peer group. If the tests and the system were perfect, this would certainly help to ensure that a fair and equal education was being achieved for all students; unfortunately, this is far from the case.
Problems with Standardized Testing: The System
One of the primary issues with the use of standardized tests in the United States educational system is that the system itself and the education it provides are not standard. Though there are legislative and ethical mandates for equality in the quality of education and in access to educational opportunities, the very obvious truth is that different states, different school districts within those states, different schools within those districts, and sometimes even different programs or academic “tracks” within those schools operate with very different levels and availability of resources (Garcia, 2001; Spring, 2004). This means that the education provided in these programs and schools also differs greatly, creating different opportunities and different levels of access and making standardized testing inherently unfair. Alternatively, one could argue that the tests allow the inequalities of the system to become starkly apparent (Spring, 2004; Phelps, 2005).
The problem is that standardized tests are not used as a pure measure, with observation and analysis the only outcomes. Instead, both individual and aggregate scores are used for a variety of practical and influential decisions, from determining which colleges will even consider a student’s application to allocating (or cutting) funding for public schools and even, in some places, determining teacher pay raises or cuts and affecting teachers’ job security (Spring, 2004; Sacks, 1999). Though correcting the unfairness that unquestionably exists in the education system is a more direct and primary goal for education reformers, standardized tests contribute to this unfairness in no small measure.
It is not simply resource allocation and the varying levels of educational opportunity, access, and effectiveness that create problems with the use of standardized tests. The entire point of standardized tests is that they are supposed to provide, inasmuch as is possible, an objective, empirical, and scientific assessment of learning progress in individual students; of trends in schools, states, and the nation; of the impacts of changes to policy, curricula, classroom size, and other features affecting education; and more (Phelps, 2005; Sacks, 1999). Put in the most basic terms, standardized tests purport to carefully measure learning, and if they do not carefully measure learning they are useless or, worse, actually detrimental to educational practice and policy. As with any other scientific measure, then, there are all of the attendant risks that could affect the precision, reliability, and validity of these tests, and if variables cannot be properly controlled then the measure’s results cannot be trusted.
Precision is one issue that scientific tests must handle; a low level of precision at best gives results that are not especially useful, and at worst gives results that are entirely meaningless. Measuring a football field in yards makes sense; asking how long an inchworm is in yards does not, because the unit is too coarse for the thing being measured (yes, decimals can be used, but the point is that there are more precise ways to measure an inchworm). Standardized tests have not been shown to be very precise: they measure broad concepts in a more general way than would be necessary to achieve truly meaningful results (Sacks, 1999). Others contend that the tests are growing more precise and are still useful, but the doubt here is significant (Phelps, 2005).
There are deeper problems than the precision of standardized tests, though. Serious questions have also been raised about these tests’ validity and reliability, two other fundamental concepts that must be examined in the design and performance of any scientific test. In order for a test to be reliable, its results must be consistent over time or amongst different members of a population. In the case of standardized testing, features that would ensure reliability include a clear and consistent interpretation of each question and its possible answers across the population of students taking the test, consistency in the administration of the test (including materials and environment), and an expected consistency in results that would follow a normal statistical distribution. Not all of these consistencies exist with standardized tests, and for cultural, economic, personal, and other reasons the tests are therefore not necessarily as reliable as is often assumed (Garcia, 2001; Sacks, 1999). Validity is also a key problem: whether or not a scientific method actually measures what it sets out to measure is clearly of major importance, and again it is not at all clear that this validity exists for all or even most standardized tests (Phelps, 2005; Sacks, 1999).
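As a rough illustration of how consistency over time and precision can be put into numbers, the sketch below estimates test-retest reliability as the Pearson correlation between two sittings of the same test, then converts it into a standard error of measurement using the classical test theory relation SEM = SD × sqrt(1 − reliability). This is only one common approach and is not drawn from the sources cited here; the scores and the function names are hypothetical and chosen purely for illustration.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)


def standard_error_of_measurement(scores, reliability):
    """Classical test theory: SEM = SD * sqrt(1 - reliability)."""
    return pstdev(scores) * sqrt(1 - reliability)


# Hypothetical scores for the same five students on two sittings of one test.
first_sitting = [52, 61, 70, 78, 90]
second_sitting = [55, 58, 73, 75, 92]

# Test-retest reliability: how consistently the two sittings rank students.
r = pearson_r(first_sitting, second_sitting)

# Precision: the typical size of the error around any one observed score.
sem = standard_error_of_measurement(first_sitting, r)

print(f"test-retest reliability: {r:.2f}")
print(f"standard error of measurement: {sem:.1f} points")
```

A correlation near 1 means the two sittings rank students almost identically, and a small standard error means an individual score is a fairly precise estimate. Neither figure says anything about validity: a test can be highly consistent while still measuring something other than what it sets out to measure.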
The ostensible goal of standardized tests in helping to ensure an equal and high-quality education for all students is laudable. It is far from clear that standardized tests can actually help achieve this result, however. More careful analysis of the testing itself rather than the results of the test is definitely called for.
Garcia, E. (2001). Hispanic Education in the United States. Lanham, MD: Rowman & Littlefield.
Phelps, R. (2005). Defending Standardized Testing. Mahwah, NJ: Lawrence Erlbaum.
Sacks, P. (1999). Standardized Minds. New York: Da Capo Press.
Spring, J. (2001). American Education. New York: McGraw Hill.