Wednesday, February 25, 2015

The Hubris of Tautology

Interesting story on NPR's Morning Edition today.  People are using Big Data from neuroscience to identify how well job applicants fit with a company and its work.  There's something naggingly unclear in all this, though.  

The proponents do things like collect information about how a person plays a video game in order to discern personality traits and work patterns.  Presumably, the "way" the person plays lets analysts reasonably and reliably identify and label the patterns and connect them to the personality traits.  No doubt, the play patterns are correlated with personality traits through their statistical connection in the (big) data gathered from everyone else who has taken these tests.  And we can rely on those connections in the data (and the algorithms created therefrom) because, as Wharton's Berkeley Dietvorst points out in another NPR piece,

Algorithms are consistent. If you give algorithms the same information to make a forecast, they'll produce the same forecast every single time, where humans are not reliable. 

Herein begins the hubris.  As devoted to data as we are, we seem to forget that somebody (the analyst, the statistician, etc.) "figures out" what the supposed correlations reflect, what they tell us about the reasons for things.  It also takes a human analyst to see when the correlations don't make sense.  Take the case of the analytics that tried to connect job- and employment-related words appearing on Twitter to the unemployment rate.  The observations were thrown wildly off by a huge spike in the word "jobs," which erupted when Apple's Steve Jobs died.
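
To see how easily this happens, here's a toy sketch in Python.  The tweets and numbers are invented for illustration; this is not the actual study's data or method.

```python
# A toy sketch, not the actual analysis: the share of tweets mentioning
# "job"/"jobs" as a naive proxy for unemployment. All tweets invented.

tweets_by_day = {
    "ordinary day": [
        "lost my job today",
        "job hunting again",
        "great weather out",
        "traffic was awful",
        "trying a new recipe",
    ],
    "day Steve Jobs died": [
        "RIP Steve Jobs",
        "Steve Jobs changed everything",
        "thank you, Steve Jobs",
        "rewatching the Jobs keynote",
        "job interview tomorrow",
    ],
}

for day, tweets in tweets_by_day.items():
    # The naive proxy: what share of tweets mention "job"?
    share = sum("job" in t.lower() for t in tweets) / len(tweets)
    print(f"{day}: {share:.0%} of tweets mention 'job'")

# The proxy jumps from 40% to 100%, but the jump reflects a death in
# the news, not a change in unemployment. It takes a human analyst to
# see that the correlation's meaning has shifted.
```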

In the case of video game playing and personality, discerning that someone is optimistic, introspective, and resilient from how they play an Angry Birds-like game requires assuming that the mere correlation between play patterns and characteristics is reliable.  It's easy enough to imagine that there are only so many patterns in which to play the game, so connecting those patterns (which are likely less numerous than the variety of personality characteristics) to personality might be an exercise in reductionism as much as--or more than--any newsstand magazine personality test, and neither is serious personality evaluation.
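
A quick counting argument shows why this worry isn't just rhetorical.  Here's a minimal sketch in Python; the specific patterns and traits are assumptions for illustration only.

```python
# A back-of-the-envelope sketch of the reductionism worry: if a game
# affords fewer distinguishable play patterns than there are personality
# profiles, the pattern-to-trait mapping cannot be one-to-one.
from itertools import product

play_patterns = ["cautious", "aggressive", "methodical"]  # assumed: 3 patterns

traits = ["optimistic", "resilient", "introspective"]
profiles = list(product([True, False], repeat=len(traits)))  # 2^3 = 8 profiles

# By the pigeonhole principle, at least two distinct personality
# profiles must get the same play pattern, so some distinctions are
# unrecoverable -- no matter how consistent the algorithm is.
print(f"{len(profiles)} possible profiles vs. {len(play_patterns)} observable patterns")
assert len(profiles) > len(play_patterns)
```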

The data might connect a certain play pattern with what we think of as optimistic and resilient, but could more than one play pattern so correlate?  Could an optimistic person--say, one who simply had a different level or quality of interest in the game--play in a different pattern, yet still be reliably identified as optimistic?  Do we presume that optimistic people would only play according to the patterns expected in the data (effectively rendering outliers--sorry, Malcolm Gladwell--non-entities)?

If we answer YES to these questions, we need to consider the prospect that what we're finding in the data has been reduced too far, rendering tautologies, not insights.  As Frederick Morgeson, an organizational psychology expert at Michigan State University, points out,

whether the claims that [proponents of these practices] are making are in fact true and they're measuring what they say they're measuring — that is a question that can really only be answered by research.

Good question.  For instance, do the correlations take account of differences among people in prior video game exposure--previous play, interest, willingness, etc.?  In other words, since everyone has different antecedent video game exposure (including NONE), can we really assume that video game play is a reliable proxy for finding optimistic, resilient people?  If there is any way at all that optimistic people might play differently from what the correlated data expect, then the data do not show a reliable connection.  The unspoken assumption is that the play-optimism connection is the only (or the most strongly) revealed connection.
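
Here's a minimal sketch of how antecedent exposure can break the proxy.  The features, numbers, and nearest-neighbor rule are all invented for illustration; no vendor's actual method is claimed here.

```python
# A toy sketch of the unspoken assumption: the model only knows the play
# patterns that happened to co-occur with "optimistic" in its training
# data. Features and labels below are invented.

# Feature vector: (retries_after_failure, levels_attempted)
training = {
    (9, 20): "optimistic",      # experienced, persistent players
    (8, 18): "optimistic",
    (2, 5): "not optimistic",
    (1, 4): "not optimistic",
}

def nearest_label(pattern):
    # 1-nearest-neighbor by squared Euclidean distance.
    nearest = min(training, key=lambda p: sum((a - b) ** 2 for a, b in zip(p, pattern)))
    return training[nearest]

# An optimistic person who has never played video games: few retries
# and few levels because of exposure, not temperament.
novice_optimist = (2, 6)
print(nearest_label(novice_optimist))  # "not optimistic" -- the proxy misfires
```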

Moreover, how were the characteristics "optimistic," "resilient," and "introspective" identified in the first place?  What piece of data represents each of those characteristics?  Who coded the characteristics that way?  Based on what personality measurements?  And what further research would clarify this?  One suspects the answers to these questions cast doubt on whether all the data are as clean and clear as we presume.

To put Morgeson's observation another way, are we really sure we're measuring what we think we're measuring?

This question is particularly vexing when it comes to standardized testing of students.  We can be sure that we are measuring what a student did on that particular test, but what that outcome stands for or represents is less clear.

Search this blog for "standard" and you can find the various concerns raised here before.  Suffice it to say, the idea that a standardized test score corresponds to something called "learning" or "education" or "skills" is, at least in some measure, an assumption.  But we have made a tautology of the matter, presuming that the data point is, in fact, the evidence.  We have come to trust the data more than people.  A set of test scores is more esteemed than a sentient adult's observations of that student.

And, of course, the scores too easily absolve sentient adults of responsibility.  If the test scores are the measure, then all we have to do is show the right outcomes on the test and we're good to go.  I would be deemed a good teacher if I could get a few more of my students to pass than last year.  The data we care about won't show if some of those students' scores went down (but stayed above passing), and they won't show if I caused even passing students to grow disinterested in reading, writing, or academic work in general.  The data don't lie, that's for sure.  The truth they tell, however, may not be as robust as we wish.
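
To see concretely what the headline number can hide, here's a toy illustration in Python with invented scores.

```python
# A toy illustration (scores invented): the pass rate can rise while
# most individual scores fall, as long as they stay above the cutoff.

PASS_CUTOFF = 60
last_year = [58, 59, 72, 85, 90]   # 3 of 5 pass
this_year = [61, 62, 65, 70, 75]   # 5 of 5 pass, but the top scores dropped

for label, scores in [("last year", last_year), ("this year", this_year)]:
    pass_rate = sum(s >= PASS_CUTOFF for s in scores) / len(scores)
    mean = sum(scores) / len(scores)
    print(f"{label}: pass rate {pass_rate:.0%}, mean score {mean:.1f}")

# The pass rate climbs from 60% to 100% while the mean falls from 72.8
# to 66.6 -- "the right outcomes on the test" without any evidence that
# students actually learned more.
```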
