beyond the existence proof

In response to Fabio’s defense of nonrepresentative sampling, Sam Lucas sent his paper, “Beyond the Existence Proof,” published last year. Fabio mentions Lucas’s article in his follow-up, but doesn’t really address the claims in the paper. I hadn’t seen it before Sam sent it, but after reading it I think it’s really smart and deserves attention in methods classes and elsewhere.

Lucas draws upon the “lumpiness” of social life:

in the large-dimensioned social space there are concentrations of entities, and sparse locales; some constellations of characteristics are common, others rare; hills and mountains rise from some spots on the social terrain, valleys and ravines mark others.

Because of this lumpiness, sampling strategies that start in a particular spot in social space are likely to stay in that spot:

If an analyst seeks to either count entities or apprehend (or interrogate) social relationships, social world lumpiness will prevent their success unless the study design addresses the threat social world lumpiness poses…. there is no vantage point of unassailably penetrating knowledge; the view from every location is partial, its horizon blocked by some feature(s) emerging out of or into the social plain. Consequently, every location is unable to reveal some aspects or force of the social world that would have been visible from elsewhere.

The only valid inference one can make from a nonrepresentative sample, Lucas argues, is what he calls the “existence proof”: the demonstration that the phenomenon observed exists at all. This position is not about the (IMHO bankrupt) distinction between quantitative and qualitative research. In fact, the article is targeted at qualitative researchers, particularly those doing in-depth interviewing (IDI), who often pay little attention to sampling questions. As a result they end up with very detailed existence proofs when, Lucas implies, they could have described social patterning had they sampled for it appropriately.

I do think Lucas is a bit too dismissive of the importance of existence proofs. In our study of “inferred justification,” it was of theoretical interest simply to demonstrate the existence of a particular mode of political reasoning, even though we couldn’t estimate its prevalence (and in fact we think the prevalence is probably fairly low). I can imagine very important research findings demonstrating that a particular mechanism exists, even if they can’t demonstrate that it is the only mechanism or the most common one. But at a minimum such studies ought to be very clear that they are limited to the existence proof; and if the research questions merit going beyond it, researchers ought to sample representatively so that they can.

Note that “representative” need not mean “random” or “nationally representative.” A sample is representative insofar as it provides sufficient and sufficiently varied data points to overcome the lumpiness of the social world. Random sampling is one technique for approaching representativeness, and the most common; variants such as stratified and clustered sampling are reasonable modifications, and there may well be justifiable approaches that are nonrandom but still approach representativeness. There are also various mechanisms for correcting (“weighting”) a sample that is nonrepresentative in known ways, but these always depend on knowledge of the sampling frame obtained through some other, representative, technique.
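To make the weighting point concrete, here is a minimal sketch of one such correction (post-stratification), not anything taken from Lucas’s paper. The function name `poststratify`, the two strata, and every number in it are made up for illustration. The `population_shares` argument is exactly the outside knowledge the correction depends on: it has to come from some other, representative source.

```python
from collections import Counter

def poststratify(sample, population_shares):
    """Reweight a sample so each stratum counts in proportion to its population share.

    sample: list of (stratum, value) pairs (hypothetical data).
    population_shares: dict mapping stratum -> share of the population,
        assumed known from some other, representative source.
    """
    counts = Counter(stratum for stratum, _ in sample)
    n = len(sample)
    # Weight for each stratum = population share / sample share.
    weights = {s: population_shares[s] / (counts[s] / n) for s in counts}
    weighted_sum = sum(weights[s] * value for s, value in sample)
    total_weight = sum(weights[s] for s, _ in sample)
    return weighted_sum / total_weight

# Invented example: the sample over-represents college graduates (80% of
# respondents) relative to their assumed population share (30%).
sample = (
    [("college", 1)] * 60 + [("college", 0)] * 20
    + [("no_college", 1)] * 5 + [("no_college", 0)] * 15
)
population_shares = {"college": 0.3, "no_college": 0.7}

raw_mean = sum(value for _, value in sample) / len(sample)
weighted_mean = poststratify(sample, population_shares)
print(f"raw mean: {raw_mean:.2f}, weighted mean: {weighted_mean:.2f}")
```

Note what the sketch cannot do: a stratum that never shows up in the sample gets no weight at all, so reweighting corrects known imbalances but cannot reveal parts of the social terrain the sample never reached.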

“Nationally representative” samples are great if you’re seeking to represent the social world of the nation. But representativeness implies only that the sample represents the sampling frame, and the choice of frame is in turn a substantive and theoretical question.

It is common in psychology to use nonrepresentative samples (often of college students taking psych classes) for experimental research. The logic is that the important variation in the experiment is in the random assignment to experimental conditions, not variation “brought in” to the experiment from the subjects’ prior lives. Since sociologists don’t do many true experiments, we have tended to care more about the sample’s representativeness. But I think psychologists ought to be worried about this too (and maybe some are). The experimental condition removes the lumpiness of the exposure itself, but it is reasonable to think that propensities for reacting to a given exposure in a particular way are themselves distributed lumpily, in which case experiments on nonrepresentative samples would reduce to existence proofs for a set of ways of reacting to the exposure. And if you’re interested in processes that can be isolated in homogeneous hamsters, representativeness is achieved not through sampling at all, but by artificially removing all the lumpiness from the social world of interest. But there are few such objects of interest to sociologists.
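The worry about lumpy propensities can be illustrated with a toy simulation. This is a sketch under invented assumptions, not an analysis of any real experiment: the two groups (“responsive” and “unresponsive”), their effect sizes, and their shares in the population and in the convenience sample are all hypothetical. Random assignment works perfectly in both runs; what differs is only who is in the subject pool.

```python
import random

def run_experiment(group_shares, effects, n, rng):
    """Randomly assign n subjects to treatment or control and return the
    difference in mean outcomes (treated minus control).

    group_shares: dict mapping group -> share of the subject pool.
    effects: dict mapping group -> how much the treatment shifts the outcome.
    """
    groups = list(group_shares)
    shares = [group_shares[g] for g in groups]
    treated, control = [], []
    for _ in range(n):
        group = rng.choices(groups, weights=shares)[0]
        noise = rng.gauss(0, 1)  # individual variation unrelated to treatment
        if rng.random() < 0.5:   # random assignment to condition
            treated.append(effects[group] + noise)
        else:
            control.append(noise)
    return sum(treated) / len(treated) - sum(control) / len(control)

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical propensities: only one group reacts to the exposure at all.
effects = {"responsive": 2.0, "unresponsive": 0.0}
population = {"responsive": 0.2, "unresponsive": 0.8}   # true average effect 0.4
convenience = {"responsive": 0.9, "unresponsive": 0.1}  # average effect 1.8

est_population = run_experiment(population, effects, 20000, rng)
est_convenience = run_experiment(convenience, effects, 20000, rng)
print(f"representative pool: {est_population:.2f}")
print(f"convenience pool:    {est_convenience:.2f}")
```

Both estimates are internally valid for their own subject pools. The convenience-pool estimate shows that the reaction exists, but its size says little about the population unless the propensity to react happens to be spread evenly.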

The bottom line: Lucas is right that nonrepresentative samples limit their studies’ conclusions to existence proofs, but he’s too dismissive of the value of such proofs in certain theoretically driven cases.


