Saturday, February 26, 2011

Teacher Accountability and Rating

I know I've beaten this horse before, but more and newer data (isn't that what we're always asking for?) have made the question of rating and accountability even more vexing. Let me explain using some testing outcomes from my school: 8th grade reading. Bear in mind that the outcomes I'm summarizing are all from the same tests, the 8th grade reading WASL/MSP, 2006-2009. Also bear in mind that I'm assuming test score outcomes are going to be used as a proxy for teacher performance.

The OSPI Report Card shows that the following percentage of our 8th graders passed the test (meaning they got a score of 400 or higher). The number in parentheses is the 7th grade (previous year) pass rate for the same group. (I joined this school in the fall of 2006, so I include the 2006 score only to match up with the next set of scores.)

OSPI Report Card
2007--74.2 (65.9)
2008--72.1 (74.6)
2009--79.6 (73.2)

The Fraser Institute's score, by contrast, is determined by converting each student's test score to a 1-4 band and averaging those bands across our group. Bands of 1 or 2 are 'not meeting standard,' while 3 or 4 signify 'meeting standard.'

1--MSP score of 375 or less
2--score of 376-399
3--score of 400-418
4--419 and above


This means that the average score for all the 8th graders on the reading test was 3.1 out of 4 in 2006, and so on.

Note that the percentage passing rate (top) only measures the proportion of 3s and 4s out of all tests taken, while the average score adds up the 1-4 bands for every test taken and divides by the number of tests.
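To make the two calculations concrete, here's a minimal sketch in Python. The cutoffs are the ones listed above; the five student scores are made up for illustration, not real data from my school.

```python
def band(score):
    """Convert a raw MSP score to the 1-4 band listed above."""
    if score <= 375:
        return 1
    elif score <= 399:
        return 2
    elif score <= 418:
        return 3
    else:
        return 4

def percent_passing(scores):
    """OSPI-style metric: share of students scoring 400 or above."""
    return 100 * sum(s >= 400 for s in scores) / len(scores)

def band_average(scores):
    """Fraser-style metric: mean of the 1-4 bands."""
    return sum(band(s) for s in scores) / len(scores)

# A hypothetical classroom of five students:
scores = [360, 390, 405, 410, 430]
print(percent_passing(scores))  # 60.0 (3 of 5 passed)
print(band_average(scores))     # 2.6  ((1+2+3+3+4)/5)
```

Notice that the two functions look at exactly the same raw scores but throw away different information: the first keeps only pass/fail, the second keeps only the band.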

A couple of things are interesting here. One is that in the OSPI scoring system, there is no change in the outcome measured when a student moves from 1 to 2, or 3 to 4, or if a student moves the other direction. Moving above or below 400 is all that affects the 'score' (percentage). In the Fraser Institute scoring system, movements from 1 to 2 and 3 to 4 (or reverse) do affect the overall average.

Another is that while the percentage of 8th graders passing in 2007 was greater than in 2006, their average went down. This must mean that the 2007 students achieved more 3s and fewer 4s than the 2006 8th graders, who had a lower pass rate but a higher average score.

So, which metric should we use? The average score instrument is blunt. Since a student earning a 400 and another earning a 418 both get 3s, we cast into the same category what are really quite different outcomes. But the percentage passing calculation is even more blunt. For example, 375 is the cutoff between a 1 and 2. For the percentage passing calculation, this difference means nothing. For the average score calculation, it means a lot.

If we were to be rated based on the whole-class average score, I would take it as a success to get a student from 374 to 376. That's a 1 moving to a 2. If we use the OSPI scoring system, I would be less motivated (incentivized, as they say in economics) to focus on any such movement. Of course, neither scoring system takes account of a student's movement from 376 to 399 (which is a substantial increase), or 400 to 418.
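The incentive difference can be spelled out with the same sketch as before (again with made-up scores, and a tiny three-student "class" so the arithmetic is easy to follow):

```python
def band(s):
    """Raw MSP score to 1-4 band (cutoffs listed earlier)."""
    return 1 if s <= 375 else 2 if s <= 399 else 3 if s <= 418 else 4

def percent_passing(scores):
    return 100 * sum(s >= 400 for s in scores) / len(scores)

def band_average(scores):
    return sum(band(s) for s in scores) / len(scores)

before  = [374, 376, 405]
move_a  = [376, 376, 405]  # one student moves 374 -> 376 (band 1 becomes 2)
move_b  = [374, 399, 405]  # one student moves 376 -> 399 (still a band 2)

# The pass rate ignores the 374 -> 376 move entirely:
print(percent_passing(before), percent_passing(move_a))
# The band average registers it (2.0 rises to about 2.33):
print(band_average(before), band_average(move_a))
# And neither metric sees the 376 -> 399 move at all:
print(band_average(before), band_average(move_b))
```

Under the average-score rating, only the first move counts; under the pass rate, neither does. That is exactly the gap in incentives described above.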

And we haven't even mentioned whether an 'improvement over prior grade' metric is worth considering as a measure of teacher performance. Of course, one effect of that would be to increase a feeling of competitiveness among teachers...probably not great for school climate.

I'm not trying to make any moral, emotional, financial or spiritual judgments with this. I would like to have a more clear-minded conversation about just what we think we can measure when we talk about rating teachers and students.
