Thursday, September 2, 2010

How should we "grade" teachers? It's a fair question for a profession that has long been conducted by individuals behind closed doors. Each of us, as a student, has no doubt had a teacher we felt belonged in a different job. Perhaps we even dreamed about a defiant moment in which that teacher's shortcomings were publicly revealed, like the climax of a Roald Dahl novel. But even Matilda might balk at publishing teacher evaluations in the local newspaper when those evaluations come from an unreliable system.

Recently, a method called the "Value-Added Model" has been used to evaluate teachers. According to an article in the New York Times, Race to the Top, the national program that allocates funding to states, encourages the use of this method, rewarding states that use it and disqualifying those that do not link student scores to teacher performance. Moreover, a recent series in the L.A. Times published evaluations of elementary teachers made through this model. While I do think teaching should be evaluated, there are several problems with the value-added model that teachers should be aware of.

First, it's based on student test scores. Test data for students at the end of one grade is compared to test data for the same students at the end of the next. If they've improved, the model attributes that improvement to the teacher as "value added." But are test scores the best measure of what students have learned? Do they capture how well students write a well-supported research paper, or compose a poignant poem? How well they collaborate and empathize? An even more recent NYT article suggests that tests which measure these things may be in the works. But the question remains: should we judge teachers based on student test scores?
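To make that logic concrete, here's a minimal sketch of the score-to-score comparison as I've described it. The function name, the toy classroom, and the numbers are my own hypothetical illustration, not the actual formula any district uses:

```python
# A toy illustration of the naive "value added" logic: the gain from one
# year's test to the next is attributed entirely to the teacher.
# All names and numbers here are hypothetical, for illustration only.

def naive_value_added(score_last_year: float, score_this_year: float) -> float:
    """Return the year-over-year gain the model credits to the teacher."""
    return score_this_year - score_last_year

# One teacher's class: each tuple is (7th-grade score, 8th-grade score).
classroom = [(70, 78), (85, 84), (62, 71)]

gains = [naive_value_added(before, after) for before, after in classroom]
average_gain = sum(gains) / len(gains)
print(f"Average 'value added': {average_gain:.1f} points")  # 5.3
```

Notice that nothing in this computation asks *why* a score rose or fell; a move, a summer of tutoring, and a great teacher all look identical to the arithmetic.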

Second, the model assumes that scores at the end of one grade predict scores at the end of the next. That is, a student who scores 70% on a standardized math test in 7th grade, if she has a good teacher, should score the same or higher on a similar test at the end of 8th grade. But does it matter if she's moving from algebra to geometry? If she's moving from one district to another? If she's moving from her mom's house to her dad's? What about the effects of summer vacation, during which students who don't participate in academic activities often forget much of what they've learned? And what are the implications of setting the bar differently for students based on a statistical bell curve, predicting that some students are bound to score higher or lower than others?
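As I understand it, value-added systems formalize this prediction step: they fit a trend from last year's scores to this year's across many students, then credit (or blame) the teacher for each student's deviation from the predicted score. Here's a stylized sketch of that idea; the data and the simple one-variable fit are my own invention, far simpler than the covariate-laden models districts actually use:

```python
# A stylized sketch of the prediction step: fit a line from last year's
# scores to this year's across many students, then treat a student's
# deviation from that prediction as the teacher's "value added."
# Hypothetical data; real models include many more variables.

from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - b * x_bar, b

# Hypothetical district-wide pairs of (7th-grade, 8th-grade) scores.
district = [(50, 55), (60, 63), (70, 72), (80, 79), (90, 88)]
a, b = fit_line([x for x, _ in district], [y for _, y in district])

# A student who scored 70 is now *expected* to score about a + b*70;
# only scoring above that line counts as value added -- a bar set by
# the statistical trend, not by the student's circumstances.
expected = a + b * 70
print(f"Expected 8th-grade score for a 70: {expected:.1f}")  # 71.4
```

The bar moves with the trend line, which is exactly the bell-curve worry above: the model decides in advance roughly where each student "should" land.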

Third, the model does not account for where students start. That is, if students arrive at a class scoring 90% and leave it scoring 92%, there is less "value added" than if they come in at 30% and leave at 60%. The Times article describes such a situation for a physics teacher, who receives a smaller bonus as a result. And when teacher bonuses are made public, she has to explain to her students why she didn't earn a higher one.

Fourth, the model does not account for collaboration. A student's teacher receives sole credit for that student's improvement, even when the student gets extra help and support from parents, other teachers, other students, or school programs (and guess who shoulders responsibility for the student who doesn't have those advantages?).

In short, the Value-Added Model is based on dubious data that not only ignores contextual factors but may even reinforce inequalities and obscure the complex, collaborative work of teaching and learning. What's the alternative? Perhaps not the long-standing once-a-year evaluation by an administrator, but regular, rigorous observation by principals and peers. Or what about the portfolio system the National Board for Professional Teaching Standards has used since the late 1980s? Maybe a combination of these approaches with student test-score data? How are other important public professionals, like doctors and lawyers, evaluated?

Teaching should be evaluated. But how teachers should be graded is not as simple as separating the Miss Trunchbulls from the Miss Honeys. And public humiliation of teachers based on a faulty system is worse than a newt in your drinking water.