okcupid is the new facebook? more on the politics of algorithmic manipulation

OK Cupid’s excellent blog just posted the results of a set of experiments they conducted on their own users. The post is framed in explicit defense of similar practices at Facebook:

We noticed recently that people didn’t like it when Facebook “experimented” with their news feed. Even the FTC is getting involved. But guess what, everybody: if you use the Internet, you’re the subject of hundreds of experiments at any given time, on every site. That’s how websites work.

In this post, I want to engage with the above argument in the context of OKC’s own manipulation.

junior theorists symposium 2014

The final schedule for the 2014 Junior Theorists Symposium has just been released. If you’re going to be in the Bay Area the day before ASA (Friday, August 15), and have not already committed to one of the other pre-conferences, stop by 60 Evans Hall at the University of California (Berkeley) to see some amazing junior theory in action! If you have any questions, or would like to RSVP, just send an email to Jordanna Matlon and me at juniortheorists@gmail.com.

all persons are fictional

In the wake of the Hobby Lobby decisions, there have been renewed discussions of corporate personhood. The argument is relatively simple: the 19th century Supreme Court made a mistake when it created the legal fiction that corporations are persons. I don’t want to get into that argument here. Instead, I want to make a slightly different argument: all persons are fictions.

experimental vs. statistical replication

In the context of all of the debates about replication going on across the blogs, it might be useful to introduce a distinction: experimental vs. statistical replication.* Experimental replication is the more obvious kind: can we run a new experiment using the same methods and produce a substantially similar result? Statistical replication, on the other hand, asks, can we take the exact same data, run the same or similar statistical models, and reproduce the reported results? In other words, experimental replication is about generalizability, while statistical replication is about data manipulation and model specification.
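To make the distinction concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the `run_study` function stands in for "collecting data" by drawing from a simulated process, and the "model" is a bare bivariate regression slope. It is not drawn from any actual study discussed here.

```python
import random


def ols_slope(xs, ys):
    """Slope from a simple least-squares regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def run_study(seed, n=200, true_slope=0.5):
    """Hypothetical study: draw n observations from a made-up process."""
    rng = random.Random(seed)
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [true_slope * x + rng.gauss(0, 1) for x in xs]
    return xs, ys


# The "original" study and its reported estimate.
original_xs, original_ys = run_study(seed=1)
reported = ols_slope(original_xs, original_ys)

# Statistical replication: same data, same model.
reproduced = ols_slope(original_xs, original_ys)

# Experimental replication: same design, newly collected data.
new_xs, new_ys = run_study(seed=2)
replicated = ols_slope(new_xs, new_ys)

print("reported:  ", reported)
print("reproduced:", reproduced)
print("replicated:", replicated)
```

In this toy setup the statistical replication should match the reported number exactly, since nothing about the data or the model has changed. The experimental replication will generally land near the reported number but not on it, because it rests on a fresh sample.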

On the one hand, sociology, economics, and political science all have ongoing issues with statistical replication. The big Reinhart and Rogoff controversy grew out of an attempt to replicate a statistical finding; that attempt revealed some unreported shenanigans in how cases were weighted, and showed that some cases had simply been dropped through error. Gary King’s work on improving replication in political science aims at making this kind of replication easier, and even turning it into a standard part of the graduate curriculum. Similarly, I believe the UMass paper (failing to) replicate Reinhart and Rogoff emerged out of an econometrics class assignment that required students to statistically replicate a published finding.

On the other hand, psychology seems to have a big problem with experimental replication. Here the concerns are less about model specification (as the models are often simple, bivariate relationships) or data coding than about implausibly large effects and “the file drawer problem,” where published results are biased towards significance (which in turn makes replications much more likely to produce null findings).
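The file drawer logic can be sketched with a small simulation. The numbers here (a true effect of 0.2, samples of 20, a cutoff of 1.96, 5,000 studies) are assumptions picked for illustration, not estimates from any real literature, and the significance check is a rough approximation rather than a proper t-test.

```python
import random
import statistics


def run_study(rng, true_effect=0.2, n=20):
    """Simulate one small study; return (estimate, is_significant)."""
    sample = [rng.gauss(true_effect, 1) for _ in range(n)]
    estimate = statistics.fmean(sample)
    standard_error = statistics.stdev(sample) / n ** 0.5
    # Rough two-sided test at the conventional 1.96 cutoff (an approximation).
    significant = abs(estimate / standard_error) > 1.96
    return estimate, significant


rng = random.Random(0)

# Run many studies of the same small effect.
studies = [run_study(rng) for _ in range(5000)]

# Only the significant ones get "published"; the rest go in the file drawer.
published = [estimate for estimate, significant in studies if significant]
mean_published = statistics.fmean(published)

# Attempt one fresh replication for each published study.
replications = [run_study(rng) for _ in published]
replication_rate = sum(sig for _, sig in replications) / len(replications)

print("studies published:", len(published), "of", len(studies))
print("mean published effect:", mean_published, "(true effect: 0.2)")
print("share of replications reaching significance:", replication_rate)
```

Under these assumptions, the published estimates should average well above the true effect, since only the lucky draws clear the significance bar, and most same-sized replications should come up null even though the effect is real.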

Both of these kinds of replication are clearly important, but they present somewhat different issues. For example, Mitchell’s concern that replication will be incompetently performed and thus produce null findings when real effects exist makes less sense in the context of statistical replication, where the choices made by the replicator can be reported transparently and the data are shared by all researchers. So, as an attempt at an intervention, I propose we try to make clear when we’re talking about experimental replication vs. statistical replication, or if we really mean both. Perhaps we might even call the second kind of replication something else, like “statistical reproduction,”** in order to highlight that the attempt to reproduce the findings is not based on new data.

What do you all think?

* H/T Sasha Killewald for a conversation about different kinds of replication that sparked this post.
** Think “artistic reproduction” – can I repaint the same painting? Can I re-run the same models and data and produce the same results?

two problems with the facebook study debate

Like much of the sociology blogosphere, I’ve been following the debate over the recent Facebook emotion study pretty closely. (For a quick introduction to the controversy, check out Beth Berman’s post over at Orgtheory.) While I agree that the study is an important marker of what’s coming (and what’s already here), and thus worth our time to debate, I think the overall discussion could be improved by refocusing the debate in two major ways.

