wired pronounces: google trumps science

Over on Gelman’s blog there’s an interesting discussion of a silly Wired article proclaiming that the “petabyte age” makes theory and hypotheses obsolete:

Peter Norvig, Google’s research director, offered an update to George Box’s maxim: “All models are wrong, and increasingly you can succeed without them.”

This is a world where massive amounts of data and applied mathematics replace every other tool that might be brought to bear. Out with every theory of human behavior, from linguistics to sociology. Forget taxonomy, ontology, and psychology. Who knows why people do what they do? The point is they do it, and we can track and measure it with unprecedented fidelity. With enough data, the numbers speak for themselves.

I assume all Scattertrons are on board with thinking this is an incredibly naive position. At the moment I’m particularly impressed with Pat Sullivan’s “Spurious Genetic Associations” piece, which demonstrates precisely why numbers don’t speak for themselves. More data means… more plausible causal pathways. Moreover, the number of plausible causal pathways increases exponentially (or something like that) with the increase in the amount of data.
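
To make that concrete, here is a minimal sketch in Python. It is not taken from the Wired piece or from Sullivan's article: the function names, the sample size of 50, and the 0.28 cutoff (roughly the two-tailed p < 0.05 threshold for a correlation at that sample size) are all assumptions chosen for illustration. It generates variables that are pure noise and counts how many pairs nonetheless look correlated. Note that the number of pairs alone grows as n(n-1)/2, which is quadratic rather than exponential; counting pathways that run through several variables would grow faster still.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def count_spurious(n_vars, n_obs=50, threshold=0.28, seed=0):
    """Return (number of variable pairs, number of pairs with |r| > threshold).

    Every variable is independent Gaussian noise, so any pair that clears
    the threshold is spurious by construction. The threshold and n_obs are
    illustrative assumptions, not values from any cited source.
    """
    rng = random.Random(seed)
    data = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_vars)]
    pairs = list(combinations(range(n_vars), 2))
    hits = sum(
        1 for i, j in pairs if abs(pearson(data[i], data[j])) > threshold
    )
    return len(pairs), hits


if __name__ == "__main__":
    for n_vars in (10, 20, 40, 80):
        n_pairs, hits = count_spurious(n_vars)
        print(f"{n_vars} noise variables: {n_pairs} pairs, {hits} look 'significant'")
```

Under these assumptions about one pair in twenty should clear the cutoff by chance alone, so the count of "findings" rises with the number of variables even though there is nothing to find.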

Here’s my concern, though: the Google mania (demonstrated by Conley’s piece, which we’ve been bashing, as well as elsewhere in pop culture) does seem like a potent anti-intellectual trend: why think when you can count high? I found this to be the case, too, in the popular book The Wisdom of Crowds, for which Google is an important example. What that book finds exciting about Google is its apparent asystematicity; but this “works” only for particular, web-ish definitions of “works.” So… in the popular mind, how do we make this case?

Author: andrewperrin

University of North Carolina, Chapel Hill

3 thoughts on “wired pronounces: google trumps science”

  1. But is Google mania really anti-intellectual, or merely redefining the space of legitimate expertise?

    The Wisdom of Crowds is a better example, as it tends to create the illusion in readers that somehow a bunch of people who know nothing do better than a bunch of experts, which is (of course) not the finding or even the entire argument (as I understand it). But this is more about one kind of expertise (data manipulation, statistics, and a certain kind of observational skill via the web) replacing other sorts of expertise. I think the right parallel here is Freakonomics – one kind of expert, skilled in a particularly persuasive technique, branching out to wherever there is data, and using common sense + data + data manipulation skills to try and trump existing expertise.

    Perhaps one plausible answer is to master the new sorts of data and show how the old questions and methods are still valuable?


  2. Dan, I think John Timmer at Ars Technica would agree with you about mastering the new sorts of data:

    http://arstechnica.com/news.ars/post/20080625-why-the-cloud-cannot-obscure-the-scientific-method.html

    … as do I. I really think that the increasing interaction of humans with computers is producing a vast quantity of data that we in the social sciences neglect, and that the correlations of the sort they’re talking about can help us find connections that don’t fit in with the way we think about things.


  3. The Freakonomics parallel Dan makes is apt, but it should be noted that “common sense” in that case is a particular brand of orthodox economic theorizing — all phenomena reduce to Chicago-school price theory. That partly answers his question, in that a lot of empirical work on big data sets *is* theoretically conventional.

    Regarding the Mad Slave’s comment, I’m curious as to how much data is deliberately neglected versus unavailable to most researchers (e.g. for proprietary reasons). Maybe the lucky duckies are litigation experts, who get the mountains of data dropped on them via discovery.

