What if we were to evaluate the performance of police officers based upon the crime rates of the neighborhoods they patrol? Or if we rated firemen on the number of fires in their station’s jurisdiction? How about if we calculated the effectiveness of doctors and nurses based on the health of their patients? Sound crazy? Yet that is exactly what we are doing when we tie a student’s test score to a teacher’s evaluation. In all of these examples, there are variables over which the practitioner has little, if any, control.
Recently, the Nevada Education Association hosted a Teacher Assessment and Evaluation Town Hall meeting, in which teachers had the opportunity to weigh in on the controversial topic of using student test scores to evaluate teachers. In attendance was Assemblyman Ozzy Fumo, who is sponsoring a bill this legislative session that would prohibit the use of pupil achievement data in teacher evaluations and remove the requirement that pupil achievement data account for at least 40% of a teacher’s evaluation. In this post, I will outline why I agree with Assemblyman Fumo’s AB212 and believe that student test data should never be used in evaluating teachers.
In an attempt to improve American education, reformers have targeted teaching quality as the single most impactful variable affecting student learning. Their argument is that by holding teachers accountable and weeding out the weak and ineffectual ones, American education will improve. Stanford University economist Eric A. Hanushek estimated that top-performing teachers helped students gain more than a year’s worth of learning, while students taught by an underperforming teacher grew by only half a year. We obviously want the best-performing teachers teaching our students, but the question becomes: how do we empirically know who the best teachers are? In an influential 2009 report, TNTP found that 99% of teachers in 12 districts were ranked satisfactory on evaluations and that tenured teachers were almost never fired. The report called into question the validity of traditional administrator observations of teacher performance and raised a valid question: how can 99% of teachers be ranked satisfactory when student achievement data suggests that students are not making significant educational progress? As a result, policymakers sought a more objective way to evaluate teachers, one that would be free from the personal bias that may taint traditional approaches. Using student achievement data to evaluate teachers seems like a reasonable approach: good teachers teach well, their students learn the material, and the students then take a test to measure mastery of a subject. Good teachers can then be separated from bad teachers based on their students’ scores. Except, as all classroom teachers know, there are a number of variables that may influence how a student performs on a standardized test.
Some examples of variables outside the teacher’s control include students with special needs, students whose native language is not English, students from economically disadvantaged households and neighborhoods, parental abuse, gangs, drug and alcohol use, food and shelter insecurity, student lethargy, and many others. In 1966, James Samuel Coleman, a sociologist, theorist, and empirical researcher, published the Equality of Educational Opportunity Report, or Coleman Report. This report concluded that “socioeconomic status, home life, and peer culture had a greater impact on student learning than did curriculum and instruction.” According to Stanford University education professor Edward Haertel, “out-of-school elements account for 60 percent of the variance in student scores while the influence of teachers was responsible for around 9%.” Yet the Nevada legislature wants these test scores to account for 40% of teacher evaluations. Based on the research, this is fundamentally unfair to our teachers.
The National Education Association contends that teacher evaluations today operate on a rewards-and-punishment system that aims to measure the effectiveness of teachers, categorize and rank them, then reward those at the top and fire those at the bottom. As a current classroom teacher, I see other problems with using student test data to assess teacher performance. The most egregious use of test data to evaluate teachers is when the test data comes from a student you never taught. In the upper grades, only about 15-30% of teachers instruct in subjects that are directly evaluated by standardized tests. The rest receive a rating based on how well students did across the campus, regardless of whether those students were ever their pupils. In other words, 40% of a teacher’s evaluation could be based on students they don’t even know. In addition, the data produced by the test is not usually available to the teacher until the next year, and by that time the students have all moved on. Teachers would much rather use a standardized test as a formative assessment, an assessment for learning rather than an assessment of learning. Furthermore, assessing teachers with standardized tests will restrict the curriculum to only those topics that are “tested,” significantly narrowing what teachers teach and children learn. Standardized tests are inherently limited and cannot measure everything that makes education meaningful. I can’t tell you how many times I have heard teachers lament that they would love to do a hands-on project with their students but don’t have the time, because they have to get through all of the testable material first. This should be of great concern as we try to transform the old, service-based Nevada economy into the new Nevada economy.
Students need to master the skills and dispositions that will make Nevada a technological and innovative leader: skills such as communication, collaboration, critical thinking, and creativity. These “soft” skills are more important than ever but are being pushed out of the curriculum due to the emphasis on testable knowledge.
In addition, despite the fact that standardized tests have been in use for quite some time, there has not been a significant increase in student achievement. After the passage of the No Child Left Behind Act in 2002, the US slipped from 18th in the world in math on the Programme for International Student Assessment (PISA) to 31st place in 2009. Similarly, there was a drop in science scores, while reading scores remained the same. In May 2011, a National Research Council report found no evidence that test-based incentive programs are working: “despite using them for several decades, policymakers and educators do not yet know how to use test-based incentives to consistently generate positive effects on achievement and to improve education.” Furthermore, a 2001 study published by the Brookings Institution found that 50-80% of test score improvements over several years were temporary and “caused by fluctuations that had nothing to do with long-term changes in learning.” Never mind that the multi-billion dollar testing industry has made many costly scoring errors, needlessly increasing stress for all stakeholders, including students, teachers, and administrators.
Teacher evaluations based on students’ test scores will result in the unintended consequence of teachers teaching to the test. Drill-and-kill test preparation will replace sound pedagogical approaches in this new high-stakes testing environment, especially if student performance on standardized tests is tied to compensation. Furthermore, teachers and administrators will be placed in an unenviable position where they may feel compelled to cheat to raise student test scores for fear of punitive actions taken against them. A 2011 USA Today investigation revealed numerous cheating scandals across six states and the District of Columbia. In one of the most egregious examples of school cheating, 178 Atlanta public school teachers and administrators in 44 schools across the district were found to be involved in cheating on standardized tests. The carrot-and-stick approach is not appropriate for the education field and should not be tolerated.
These are but a few of the reasons why student test scores should not be used to evaluate teachers. I do believe that standardized tests should be used as formative assessments to guide practice and to help identify teachers who need assistance in the classroom. I also believe in teacher accountability, but we already have that with the Nevada Educators Performance Framework (NEPF), a fourteen-page rubric covering five standards of performance. The standards are as follows: new learning is connected to prior learning and experience, learning tasks have high cognitive demand for diverse learners, students engage in meaning-making through discourse and other strategies, students engage in metacognitive activity to increase understanding of and responsibility for their own learning, and assessment is integrated into instruction. Furthermore, teachers are evaluated on their professional responsibilities. This is accomplished by an additional fourteen-page rubric (28 pages in total, for anyone keeping track) that looks at things like commitment to the school community, reflection on professional growth and practice, professional obligations, family engagement, and student perception.
Clearly, the NEPF holds teachers to the highest levels of accountability and obviates the need to use questionable student achievement data as part of a carrot-and-stick approach to teacher evaluation. Oh, and by the way, remember the police officers I mentioned at the beginning of this post, people who make life-and-death decisions on a daily basis? Guess how long their performance review is: one page.
Do student test scores provide solid basis to evaluate teachers? (n.d.). Retrieved February 27,
evaluate-teachers
Editorial Projects in Education Research Center. (2015, September 3). Issues A-Z: Teacher
Evaluation: An Issue Overview. Education Week. Retrieved Month Day, Year from
overview.html/
Katz, D. d. (2016). Growth Models and Teacher Evaluation: What Teachers Need to Know and
Do. Kappa Delta Pi Record, 52(1), 11-16. doi:10.1080/00228958.2016.1123039
NEPF Rubrics. (2012). Retrieved March 04, 2017, from
cs/
Standardized Tests - ProCon.org. (n.d.). Retrieved February 25, 2017, from
http://standardizedtests.procon.org/
Teacher Assessment and Evaluation: The NEA's Framework. (n.d.). Retrieved February 26,
Teacher Evaluation Should Not Rest on Student Test Scores (Revised 2016). (n.d.). Retrieved
February 27, 2017, from http://www.fairtest.org/teacher-evaluation-fact-sheet