7. The consequences of an assessment procedure are the first and most important consideration in establishing the validity of the assessment.
Tests, checklists, observation schedules, and other assessments cannot be evaluated out of the context of their use. If a perfectly reliable and comprehensive literacy test were designed but using it took three weeks away from children’s learning and half the annual budget for instructional materials, we would have to weigh these consequences against any value gained from using the test. If its use resulted in teachers building a productive learning community around the data and making important changes in their instruction, we would also have to weigh these consequences. This standard essentially argues for “environmental impact” projections, along with careful, ongoing analyses of the consequences of assessment practices. Responsibility for this standard lies with the entire school community, to ensure that assessments are not used in ways that have negative consequences for schools and students. Any assessment procedure that does not contribute positively to teaching and learning should not be used.
By asserting that procedures cannot be evaluated out of the context of their use, this standard puts assessment, teaching, and learning back together. It asserts that simply devising a more detailed or more complex test will not by itself result in a more valid assessment. If an assessment procedure has adverse motivational consequences for school communities, segments of school communities, or individuals, then the procedure is invalid.
Adverse consequences from assessment can arise in a variety of ways, such as in these examples:
- Assessment techniques that very publicly value only a narrow range of literacy activity or very controlling forms of reading and writing (as opposed to a more critical literacy) enforce a narrowing of the curriculum for students. This routinely occurs in the United States through high-stakes accountability testing. Classroom assessment practices can have the same effects, sometimes as a consequence of high-stakes testing practices. This occurs, for example, when classroom assessment focuses on worksheets and multiple-choice tests; when evaluative feedback on student writing focuses on spelling and grammar rather than on students’ thinking, substantive content, or organization; or when classroom assessment focuses centrally on reading speed.
- Institutionally enforced commercial assessments reduce available school resources for teachers to conduct more instructionally informative assessments.
- Reporting procedures that focus on ranking or rating rather than on performance draw learners’ attention away from the process of learning, reduce their notions of literacy acquisition to a simple linear continuum, disrupt collaborative learning communities, make students and teachers defensive, and thus inhibit learning.
This standard rejects the unfortunately common argument that a given test is valid in spite of the fact that its use has problematic consequences (e.g., placing a student in a program that does not serve her well). Inquiring into the effects of assessment practices is never simple. It should be ongoing, capitalizing on multiple data sources and multiple perspectives, always recognizing that these efforts are likely to raise value-laden conflicts, such as the tension between the public’s right to know and the preservation of conditions that will foster learning. This standard means that assessment information should not be used for judgmental or political purposes if that would be likely to cause harm to students or to the effectiveness of teachers or schools. Schools have a responsibility to report assessment results to parents in a way that will assist, not hinder, students’ learning and parents’ understanding.
It is commonplace to talk about different purposes for assessment and to invoke the principle that the assessment must match the purpose for which it is intended. In practice, this principle has been largely ignored. Test publishers make claims regarding the validity of their tests regardless of the use to which they are put. In light of what we have learned about the ways tests shape curricular decisions made about students by teachers, administrators, and policymakers, a “user beware” attitude is unacceptable within the framework of this standard. If assessments are to be used for high-stakes purposes such as holding people publicly accountable, then they should be fully consistent with, and not a shorthand for, the assessment procedures used to provide teachers and students with knowledge of progress in the classroom. They must recognize the complexity of literacy in today’s society (see standard 5) and reflect the curriculum.
This standard has implications for our priorities when we choose assessment practices. For example, when a teacher observes and documents a student’s oral reading behaviors and uses that information to inform instruction, the data might not be as reliable, in a technical sense, as a norm-referenced test. However, in the context of the teacher’s professional knowledge, they are more likely to have productive consequences. Often assessments are chosen for technical measurement properties rather than for the likelihood of productive consequences for students and teachers.