Wednesday, April 4, 2012

Tests with High Stakes...but for whom?

Most of the discussion of high stakes testing of public school students tends to overlook some fundamental, and crucial, aspects of the whole notion of such tests, especially their connection to teacher evaluations.

The test outcomes have no impact on the students' academic lives, so the stakes are not high for them.  Rather, teachers feel the pressure of the high stakes, and increasingly so when student performance on those tests is going to be a factor in teacher evaluations, as it will undoubtedly be almost everywhere soon.

But any notion that we can rightly and wisely connect scores to evaluations suffers from several serious logical errors or flaws.

First, and most importantly, we are, to be logical or 'sciencey' about it, affirming the consequent--using test score changes as proof of teacher quality.  The whole issue becomes tautological, as the measure of the independent variable (Teacher Impact) is observed as the result on the dependent variable (Test Score). This is an invalid form of argument.  To make the Teachers --> Test Score claims valid we must define measures of Teacher Impact prior to and separate from Test Scores, then hypothesize that Higher Teacher Impact scores will cause Higher Test Scores.  Sadly, I don't see that happening.  It's far too easy to simply define Teacher quality by Test Score results.

Second, connecting scores and evaluations implicitly looks upon the students as neutral actors in the whole scenario.  By that I mean we must assume students are merely objects being turned or maneuvered by teachers and that whatever teachers do well or badly transmits fairly directly to students and shows up in their test scores.  If student scores go up, that teacher did a good job.  If not, not.

Or think of it this way.  If something intervenes between the causal agent Teacher Impact and the outcome Student Performance which causes the measurement device Test Score to register something more or less than actual Teacher Impact on Student Performance, then Test Score is a less than fully accurate accounting of Teacher Impact.

We're assuming, in still other words, that Student Performance measured as Test Scores is actually an accurate (if logically invalid) measure of Teacher Impact.

Dubious assumption, as stuff intervenes, no doubt, between Teacher and Test Score.   The question is, how much stuff, and how do we tell what effect it has?
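To make the "stuff intervenes" worry a bit more concrete, here is a toy sketch. Everything in it is an assumption for illustration: the numbers are invented, not data, and the names `simulate` and `intervening_sd` are hypothetical. It pretends each teacher has a true impact and that the observed score change is that impact plus whatever else intervenes, then asks how closely the scores track the impact.

```python
import random
import statistics


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(intervening_sd, n_teachers=2000, seed=0):
    """Correlation between true teacher impact and observed score change.

    Assumption (illustrative only): score change = teacher impact + noise,
    where the noise stands in for everything that intervenes between
    teacher and test (student motivation, home life, a bad day).
    """
    rng = random.Random(seed)
    impact = [rng.gauss(0, 1) for _ in range(n_teachers)]
    scores = [t + rng.gauss(0, intervening_sd) for t in impact]
    return pearson(impact, scores)


little_stuff = simulate(intervening_sd=0.1)  # almost nothing intervenes
lots_of_stuff = simulate(intervening_sd=3.0)  # a great deal intervenes
print(little_stuff, lots_of_stuff)
```

In this made-up world the scores follow teacher impact closely when little intervenes and only loosely when a lot does. The sketch says nothing about how much actually intervenes in real classrooms, which is exactly the open question.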

But even if we could figure all that out, we have a third concern, this time about motivation.  The students are the ones taking the tests, and their scores are a measure of accountability for the teacher, not for the student.  The study of economics teaches us nothing if not this:  You have to watch the incentives.  Pay attention to who has incentive to do what things.

As we are constructing the situation, the teacher has a lot of incentive to make sure students do well.  And I assume that's the hope.  Motivate teachers and they'll go motivate and teach students.  But the students don't have much incentive beyond some amorphously constructed internal drive to do well on the test.

To put it in a social sciencey kind of way, the agents (students) whose practical ability we are hoping to see (registered as test score outcomes) don't have appropriate incentives (or not as much as some other actors--teachers--who lack practical ability) to necessarily maximize performance.  Nor is a legal authority available that could compel students to seek maximal outcomes regardless of their own preferences.

The so-called strategic triangle of compliance (thanks to Ron Mitchell, p. 14, for teaching me that one), where compliance in this case means test score maximization, does not get rightly made in this situation.  The actors with incentives (Teachers) have some, but limited, impact on the actors with the practical capacity (Students) to maximize scores, and the actor with 'legal' authority (Parents) to compel the exercise of greater student capacity has been dropped out of the scenario.

Unfortunately, I do not expect these concerns to derail a train with so much steam built up.  Bring on the Test Tricks.
