Saturday, March 6, 2010

Research Marches On

538 just did some reminiscing about an old graph showing polling house effects. Andrew Gelman points to a graph from 1995, which he helped out with, showing those effects in the 1988 Presidential election. Aside from the general ugliness of the graph itself, Gelman notes two things about it:

2. The time lag. This is a graph of polls from 1988, and it's appearing in an article published in 1995. A far cry from the instantaneous reporting in the fivethirtyeight-o-sphere. And, believe me, we spent a huge amount of time cleaning the data in those polls (which we used for our 1993 paper on why are campaigns so variable etc).

3. This article from 1995 represented a lot of effort, a collaboration between a journalist, a statistician, and a political scientist, and was published in a peer-reviewed journal. Nowadays, something similar can be done by a college student and posted on the web. Progress, for sure.

Progress indeed. By the time that graph saw the light of day, the guy who won the corresponding election had served his entire term and been voted out in favor of his successor, and the successor was in the second half of his first term. Now it can be done the same day and added to as mountainous a pile of raw data as you like, sometimes accompanied by calls for more, perhaps even investigations for more.

And as for progress, I can attest to the gigantic leap in data availability myself. I saw its march over the course of my formal schooling.

When I started out in elementary school, compiling data for a report meant constant shuttling back and forth to the library, having a possibly-outdated World Book on hand, a lot of old-school grunt work, often for naught. It was entirely possible that you would have an idea of what you wanted to write, but couldn't make it to a place where you could look it up, so you'd either have to leave it out or try and explain to the teacher what happened to your sources.

We learned computers in middle school, and while we'd still have to do some shuttling back and forth, that was simply going to school and using their Internet access. However, the Internet was still young, and the amount of information available was somewhat limited (not helped by a school-initiated firewall).

Eventually, we got our own computer, and we had some early jitters. (True story: When we got it turned on, we attempted to first go to I don't know how we wound up at a Swedish porn site. I don't want to know either.) But we got better at using the Internet, and over time, so did everyone else. The Internet grew and evolved in turn, and so did the quantity of information and the efficiency in finding it. If you asked me in 1995 to do a report on Mali, just in general, I don't know whether I'd be able to pull it off. But if you asked me now to look up a Malian soccer team? Done. And here's their major rival just for kicks.

In college, I was invited to a luncheon that a sampling of other students and faculty were attending, and one of the school administrators asked me how research has evolved. The answer I gave was largely what I just said, with the caveat that, with all of that information out there, there's also a bunch of crap. There are people who don't do their research, people who speak ignorantly, people who deliberately seek to misinform. You need to be able to cut through all of that.

That's the challenge now. The question used to be, could you find the information. Now the question is, what information are you going to use.
