Thursday, September 2, 2010

How should we "grade" teachers? It's a fair question for a profession that has long been conducted by individuals behind closed doors. Each of us, as a student, has no doubt had a teacher we felt belonged in a different job. Perhaps we even dreamed about a defiant moment in which that teacher's shortcomings were publicly revealed, like the climax of a Roald Dahl novel. But even Matilda might balk at publishing in the local newspaper teacher evaluations made with an unreliable system.

Recently, a method called the "Value-Added Model" has been used to evaluate teachers. According to an article in the New York Times, Race to the Top, the national program that allocates funding to states, encourages the use of this method: it rewarded states that used it and disqualified those that did not link student scores to teacher performance. Moreover, a recent series in the L.A. Times published evaluations of elementary teachers made with this model. While I do think teaching should be evaluated, there are several problems with the value-added model of which teachers should be aware.

First, it's based on student test scores. Test data for students at the end of one grade is compared to test data for the same students at the end of the next. If the students have improved, the model attributes that improvement to the teacher as "value added." But are test scores the best measure of what students have learned? Do they capture how well students write a supported research paper, or compose a poignant poem? How well they collaborate and empathize? An even more recent NYT article suggests that tests that measure these skills may be in the works. But the question remains: should we judge teachers based on student test scores?
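To make that logic concrete, here's a minimal sketch in Python of the comparison at the model's heart. Everything in it is hypothetical: the function name, the class, and the scores are invented for illustration, not taken from any state's actual formula.

```python
# A deliberately naive sketch of the value-added idea: compare each
# student's score at the end of one grade with the same student's score
# at the end of the next, and credit the teacher with the average gain.
# All names and numbers are hypothetical.

def naive_value_added(prior_scores, current_scores):
    """Average per-student gain, treating last year's score as the prediction."""
    gains = [curr - prior for prior, curr in zip(prior_scores, current_scores)]
    return sum(gains) / len(gains)

# End-of-7th-grade vs. end-of-8th-grade scores for one (invented) class.
prior = [70, 65, 80, 55]
current = [74, 70, 82, 61]

print(naive_value_added(prior, current))  # 4.25 -- the "value added" by the teacher?
```

Real value-added formulas are far more elaborate than this, but the core move is the same: a number derived from two test scores becomes a verdict on the teacher.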

Second, the model assumes that scores at the end of one grade predict scores at the end of the next. That is, a student who scores 70% on a standardized math test in 7th grade should, if she has a good teacher, score the same or higher on a similar test at the end of 8th grade. But does it matter if she's moving from algebra to geometry? If she's moving from one district to another? If she's moving from her mom's house to her dad's? What about the effects of summer vacation, during which students who don't participate in academic activities often forget much of what they've learned? And what are the implications of setting the bar differently for students based on a statistical bell curve, predicting that some students are bound to score higher or lower than others?

Third, the model does not account for students' starting points. If students arrive in a class scoring 90% and leave scoring 92%, there is less "value added" than if they come in at 30% and leave at 60%. The Times article describes exactly this situation for a physics teacher, who receives a smaller bonus as a result. And when teacher bonuses are made public, she has to explain to her students why she didn't earn a higher one.
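Running the same hypothetical sketch on two invented classes shows how the starting point skews the result: a class already near the ceiling can only post a small gain, so its teacher's "value added" looks tiny next to a class that started low.

```python
# Same naive calculation as in the sketch above, applied to two
# hypothetical classes. The numbers are invented for illustration.

def naive_value_added(prior_scores, current_scores):
    gains = [curr - prior for prior, curr in zip(prior_scores, current_scores)]
    return sum(gains) / len(gains)

near_ceiling = ([90, 91, 89], [92, 93, 91])  # little room left to improve
low_start = ([30, 28, 32], [60, 58, 62])     # lots of room to improve

print(naive_value_added(*near_ceiling))  # 2.0
print(naive_value_added(*low_start))     # 30.0 -- equally good teaching, wildly different scores
```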

Fourth, the model does not account for collaboration. When a student improves with extra help and support from parents, other teachers, other students, or school programs, the classroom teacher alone receives the credit (and guess who shoulders the blame for the student who doesn't have those advantages?).

In short, the Value-Added Model is based on dubious data that not only ignores contextual factors but may even reinforce inequalities and obscure the complex, collaborative work of teaching and learning. What's the alternative? Perhaps not the long-standing once-a-year evaluation by an administrator, but regular, rigorous observation by principals and peers. Or what about the portfolio system the National Board for Professional Teaching Standards has used since the late 1980s? Maybe a combination of these methods with student test-score evaluations? How are other important public professionals, like doctors and lawyers, evaluated?

Teaching should be evaluated. But how teachers should be graded is not as simple as separating the Miss Trunchbulls from the Miss Honeys. And public humiliation of teachers based on a faulty system is worse than a newt in your drinking water.

2 comments:

  1. I love that you used Matilda in this post--one of my all-time favorite books :)

    It's interesting to see how once again there is pressure to standardize education--to make an assembly-line model to achieve the greatest efficiency, in almost a competitive, business-like sense. I agree with you--teachers need to be evaluated and held accountable. In my very short teaching career, I have seen what happens if they are not held accountable to high standards (and are not truly invested in their students). One thing I have reflected on in the past two years is the high level of accountability and reflection that the Teacher Ed program at State required. I know the tools you guys used gave a clear picture of our growing practice as teachers--although I'm not so sure how they reflected our students' learning, since our students were our peers :) I found these measurements--and how they were implemented--very supportive in my development, something that I have missed in the irregular observations by my administrators and mentors. I've proposed some of these measures in my School Improvement Plan team meetings (e.g., videotaping our teaching when we are implementing new teaching strategies as highlighted in our SIP, peer evaluation), and hope to see some of these things come to fruition this year.

    My fear with solely using student test scores to measure teachers is that even those scores are not reliable. As you mention, there are a myriad of factors that might cause a student to score lower than anticipated. One thing I've noticed with my students is the negative feelings they have towards standardized tests. Not only does our district use the state-mandated MEAP and MME, but my kids are also required to take the ELPA, a state-required language assessment similar to the MEAP in format. This is in addition to quarterly Scantron testing (an online assessment used to gauge their success on the MEAP/MME) and Study Island testing (a tool we're expected to use regularly in our teaching that also mimics the MEAP). The kids hate it--and obviously this affects their motivation. This over-testing can lead even the brightest student to score low on the MEAP/MME if he or she doesn't buy into the testing game--obviously making the test an unreliable measure.

  2. A recent development: this review of a new study on teacher evaluation confirms that the push for high-stakes evaluation based on test scores and value-added models rests on "common sense" rather than on research, innovation, and what schools and teachers actually do and need.

    http://nepc.colorado.edu/thinktank/review-teach-eval-TNTP

    The danger is that "common sense" is persuasive, even for those affiliated with teaching. How do we convince people that what seems obvious--that teachers should be judged by student performance on standardized tests--is more complex, without sounding like defensive eggheads?
