Gettysburg Address Flunks Robograder
Do you want your writing graded by this machine?
Machine Scoring Fails the Test
New Position Statement on Machine Scoring from NCTE, April 2013
[A] computer could not measure accuracy, reasoning, adequacy of evidence, good sense, ethical stance, convincing argument, meaningful organization, clarity, and veracity in your essay. If this is true I don't believe a computer would be able to measure my full capabilities and grade me fairly. -- Akash, student
[H]ow can the feedback a computer gives match the carefully considered comments a teacher leaves in the margins or at the end of your paper? -- Pinar, student
(Responses to the New York Times Learning Network blog post "How Would You Feel about a Computer Grading Your Essays?", 5 April 2013)
Writing is a highly complex ability developed over years of practice, across a wide range of tasks and contexts, and with copious, meaningful feedback. Students must have this kind of sustained experience to meet the demands of higher education, the needs of a 21st-century workforce, the challenges of civic participation, and the realization of full, meaningful lives. (Read the entire position statement.)
Computer-graded writing continues to spread nationally, but at what cost? Many colleges use machines to score placement test writing, and the introduction of Common Core standards assessments in 2014 will likely expand the use of robograding. Yet, "while [robograders] may promise consistency, they distort the very nature of writing as a complex and content-rich interaction between people." (CCCC Writing Assessment: A Position Statement)
The Conference on College Composition and Communication, a constituent group of NCTE, supports human assessment because assessment that isolates students and forbids discussion and feedback from others conflicts with what we know about language use. Because we write for social purposes, it follows that only a human can accurately critique written communication.
"If a student's first writing experience at an institution is to a machine, this sends a message: writing at this institution is not valued as human communication -- and this in turn reduces the validity of the assessment." (CCCC Position Statement on Teaching, Learning, and Assessing Writing in Digital Environments)
"E-Rater doesn’t care if you say the War of 1812 started in 1945," notes NCTE member Les Perelman in The New York Times. Despite what the grading companies say, their products are far from perfect. Perelman found that on a scale of 1 to 6, the Gettysburg Address earned a 2. He also found several ways to game the machine. Machines may grasp how to grade great grammar; they cannot enjoy the emotions of evocative essays or see sparks of subtle smarts.
Computers grade student writing through rigid algorithms that do not account for various forms of quality writing. For example, Perelman found the machines overvalue magniloquence. And they downgrade sentences starting with “and” or “but.” But with so much riding on large-scale testing, students would be loath not to satisfy the machines known as e-Raters or robograders.
The movement against robograding must start at the top. If colleges and universities begin using computers to grade student writing, high schools will begin preparing their students for computer-graded writing. As Crispin Sartwell notes, writing to computers could reduce all writing to templates.
Anne Herrington and Charles Moran took an extensive look at computer graders and noted several problems. If a computer can grade thousands of papers at once, why not put all those students in one class? Why not use one teacher to record an online lecture for thousands of students? Gone would be the demand for teachers and the special expertise they offer.
Computers may be cheaper and faster than humans, but they do not possess the requisite skills of comprehension to accurately grade student writing.