Category Archives: methods

coding, language, biernacki redux

Dylan Riley’s Contemporary Sociology review (paywall, sorry) of Biernacki’s Reinventing Evidence is out, and an odd review it is. H/T to Dan for noting it and sending it along. The essence of the review: Biernacki is right even though his evidence and argument are wrong. This controversy, along with a nearly diametrically opposed one on topic modeling (continued here), suggests to me that cultural sociology desperately needs a theory of language if we’re going to keep using texts as windows into culture (which, of course, we are). Topic modeling’s approach to language is intentionally atheoretical; Biernacki’s is disingenuously so.

Continue reading

productivity, sexism, or a less sexy explanation.

I apparently attended the same session at the ASA conference as Scott Jaschik yesterday, one on Gender and Work in the Academy. He must have been the guy with the press badge who couldn’t wait to fact-check his notes during the Q&A.

The first presenter, Kate Weisshaar from Stanford University, started the session off with a bang with her presentation looking at the glass ceiling in academia, asking whether it was productivity that explained women’s under-representation among the ranks of the tenured (or attrition to lower-ranked programs or out of academia altogether). A summary of her findings – and a bit of detail about the session and the session organizer’s response to her presentation – appeared in Inside Higher Ed today. Continue reading

a study in scarlet

[Figure: RedShirtOvulationGraph]
(N=100 on the left, N=24 on the right, one data point per person, observational study)

Andrew Gelman and I exchanged e-mails a while back after I made his lexicon for a second time. That prompted me to check out his article in Slate about a study published in Psychological Science finding women were more likely to wear red/pink when at “high conception risk,” and then I read the original article.

I don’t want to get into Gelman’s critique, although notably it included whether the authors were correct to measure “high conception risk” as 6-14 days after a woman starts menstruating (see Gelman’s response to the authors’ response about this). And I’m not here to offer an additional critique of my own.

I’m just looking at the graph and marveling at the reported effect size, and inviting you to do the same. Of the women wearing red in this study, 3 out of 4 were at high conception risk. Of the women not wearing red, only 2 out of 5 were.*
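To see just how large that reported effect is, here is a quick back-of-the-envelope odds ratio. The counts below are hypothetical, chosen only to match the proportions quoted above; they are not the study’s actual cross-tab.

```python
# Hypothetical counts consistent with the reported proportions --
# NOT the study's actual cross-tab, just an illustration of the effect size.
red_high, red_low = 3, 1          # of red-wearers: 3 of 4 at high conception risk
notred_high, notred_low = 2, 3    # of non-red-wearers: 2 of 5 at high risk

# Odds of high conception risk among red-wearers vs. non-red-wearers
odds_ratio = (red_high / red_low) / (notred_high / notred_low)
print(f"odds ratio: {odds_ratio:.1f}")  # 4.5
```

An odds ratio of 4.5 from an observational study with these sample sizes is the kind of effect size worth marveling at.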

UPDATE: Self-indulgent even by blog standards, but since I could see using this example again somewhere and it took some effort to reconstruct, I’m going to paste in the cross-tab here: Continue reading

two problems with the facebook study debate

Like much of the sociology blogosphere, I’ve been following the debate over the recent Facebook emotion study pretty closely. (For a quick introduction to the controversy, check out Beth Berman’s post over at Orgtheory.) While I agree that the study is an important marker of what’s coming (and what’s already here), and thus worth our time to debate, I think the overall discussion could be improved by refocusing the debate in two major ways.

Continue reading

does a positive result actually support a hypothesis?

First thing: is it in The Goldilocks Spot? For this, do not look at the result itself, but look at the confidence interval around the result.  Ask yourself two questions:

1. Is the lower bound of the interval large enough that if that were the true effect, we wouldn’t think the result is substantively trivial?

2. Is the upper bound of the interval small enough that if that were the point estimate, we wouldn’t think the result was implausible?

(Of course, from this we move to questions about whether the study was well-designed so that the estimated effects are actually estimates of what we mean to be estimating, etc. But this seems like a first hurdle for considering whether what is presented as a positive result should be interpreted as possibly such.)

Caveat: Note that this also assumes the hypothesis is a hypothesis that the effect in question is not trivial, and hypotheses may vary in this respect.
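The two questions above can be sketched as a simple check. The threshold values here (`trivial` and `implausible`) are judgment calls the investigator must supply; the numbers in the example are made up for illustration.

```python
def goldilocks_check(ci_lower, ci_upper, trivial, implausible):
    """Sketch of the two questions above, assuming a positive hypothesized
    effect. `trivial` and `implausible` are substantive judgment calls,
    not statistical quantities."""
    # Q1: if the lower bound were the true effect, would it still be nontrivial?
    big_enough = ci_lower > trivial
    # Q2: if the upper bound were the point estimate, would it still be plausible?
    small_enough = ci_upper < implausible
    return big_enough and small_enough

# Example: a 95% CI of (0.05, 0.40), where effects below 0.02 are deemed
# trivial and effects above 1.0 are deemed implausible
print(goldilocks_check(0.05, 0.40, trivial=0.02, implausible=1.0))  # True
```

A result whose interval fails either question hasn’t cleared the first hurdle, regardless of its p-value.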

ya down with o.t.t.?

My last post raised some comments about one-tailed tests versus two-tailed tests, including a post by LJ Zigerell here.  I’ve returned today to an out-of-control inbox after several days of snorkeling, sea kayaking, and spotting platypuses, so I haven’t given this much new thought.

Whatever philosophical grounding exists for one-tailed vs. two-tailed tests is vitiated by the reality that, in practice, one-tailed tests are largely invoked so that one can talk about results in a way one would be precluded from doing if held to two-tailed p-values. Gabriel notes that this is p-hacking, and he’s right. But it’s p-hacking of the best sort, because it’s right out in the open and doesn’t change the magnitude of the coefficient.* So it’s vastly preferable to any hidden practice that biases the coefficient upward to get it below the two-tailed p < .05.

In general, I’ve largely swung to the view that practices that allow people to talk about results near .05 as providing sort-of evidence for a hypothesis are better than the mischief caused by using .05 as a gatekeeper for whether or not results can get into journals. What keeps me from committing to this position is that I’m not sure it wouldn’t just change the situation so that .10 becomes the gatekeeper. In any event: if we are sticking to a world of p-values and hypothesis testing, I suspect I would be much happier in a world in which investigators were expected to articulate what would constitute a substantively trivial effect with respect to a hypothesis, and then use a directional test against that.

* I make this argument as a side point in a conditionally accepted paper, the main point of which will be saved for another day.
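That last suggestion, a directional test against a pre-specified trivial effect rather than against zero, can be sketched as a one-sided z-test. The numbers in the example are hypothetical; the investigator has to commit to the `trivial_effect` value in advance for this to mean anything.

```python
from statistics import NormalDist

def directional_test(estimate, se, trivial_effect):
    """One-sided test of H0: effect <= trivial_effect
    against H1: effect > trivial_effect.

    A sketch of the suggestion above: instead of testing against zero,
    test against a pre-specified substantively trivial effect size."""
    z = (estimate - trivial_effect) / se
    p = 1 - NormalDist().cdf(z)  # one-sided p-value
    return z, p

# Hypothetical example: estimate 0.30, SE 0.10, effects below 0.05
# declared trivial in advance
z, p = directional_test(estimate=0.30, se=0.10, trivial_effect=0.05)
print(f"z = {z:.2f}, one-sided p = {p:.4f}")
```

Note that the same estimate tested against zero would look even stronger; shifting the null to the triviality threshold is what makes the test answer the substantive question.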

should social scientists allow non-directional hypotheses? any examples?

I recently read a paper in which the author hadn’t offered directional hypotheses. They were just of the form that X was expected to be associated with Y. My reaction was that a non-directional hypothesis was not much of a hypothesis, and I made a comment along the lines of, “you should probably develop your ideas until you have more of a sense of how X and Y might be associated before you try to test the hypothesis with data.”

This has led me to think about whether I have a more general position on the specificity required for something to be a substantively meaningful social science hypothesis. Does anyone have an example of something in social science where the hypothesis is non-directional (or is just a hypothesis that something “matters”), and where that hypothesis is not trivial? If so, please let me know.

scientific misconduct as choose-your-own-adventure video game!

Kelly pointed to this in the comments on my last post: a CYOA game in which you get to take on different roles and try to prevent a science fraud scandal from happening.

As a connoisseur of CYOA, I don’t think it works that well as a game — too much stuff before and between choices — but the videos have surprisingly high production values and script quality for something like this.

the dilemma of the subordinate methodologist

I’m not sure if the author of this post is a graduate student or undergraduate, but I found it an intriguing statement about the problem younger people interested in methodology can find themselves in while working with established people who are very steeped in conventional practices and productivity. Quote:

One thing that never really comes up when people talk about “Questionable Research Practices,” is what to do when you’re a junior in the field and someone your senior suggests that you partake. [...] I don’t want to engage in what I know is a bad research practice, but I don’t have a choice. I can’t afford to burn bridges when those same bridges are the only things that get me over the water and into a job. (HT: @lakens)

Mostly this is just a statement about power.* But it’s also maybe a statement about what can happen when developments allow the possibility of radical doubt to settle upon a field. Normally a junior person can have methodological doubts, but still think, “Well, these people must know what they are doing, because it’s been successful for them and so ultimately in practice it works, right?” But what happens when you have developments that lead to a lot of people starting to whisper and murmur and talk about how maybe it doesn’t work?

* I mean power in the ordinary sociological sense, not my ongoing obsession with statistical power.

why log?

This is intended as a friendly didactic post, not an addition to my various criticisms of the hurricane name study. But I do use that data and model. Frankly, I suspect I’ll be thinking about the lessons from that study for a while and using it as a teaching example for years.

I’ve said that substantively it makes more sense to log the measure of hurricane damage, and that the model fits better when you do, even though the key result of their paper is no longer statistically significant. I worry the point may seem arcane or persnickety. So below the jump are a couple of graphs that show the substantive difference that this actually makes over the range of damage observed in their data. (Note the scales of the y-axis.)

Continue reading

talk may be cheap, but meaning is pricey

For those who haven’t yet seen it, there’s a very interesting article by Colin Jerolmack and our own Shamus Khan, along with critiques and rejoinder. The article, “Talk is Cheap,” examines the fact that what people say is not the same as what they do (the problem of “Attitude-Behavior Consistency,” or ABC). They argue that ethnography is therefore the better way to ascertain behavior because ethnographers actually observe behavior itself instead of actors’ often-inaccurate accounts of behavior.  And since sociologists are held to be concerned primarily with social action — an assumption I’ll address below — ethnography (along with, by the way, audit studies such as Quillian and Pager’s) is the better approach.

Continue reading

more on text analysis

Laura Nelson has an excellent discussion of topic modeling on badhessian, which in part takes me to task for my comments on the Poetics issue on topic modeling. Unfortunately the Disqus system that handles comments there doesn’t like me, and so has eaten my comments twice. So I’m posting them here, and perhaps someone smarter than I am can make them into a bona fide comment on the site.

Continue reading

topic modeling and a theory of language

The much-anticipated special issue of Poetics devoted to topic modeling in cultural sociology is now available, and it’s a beaut! Props to John Mohr and Petko Bogdanov for editing the special issue, and to all the authors for an exciting group of articles.

There is, quite appropriately, a lot of buzz about the potential of “big data” and quantitative analysis of text, in particular for cultural analysis since so much of culture seems to make its way into text in one form or another.  The articles in the special issue combine into a grand showcase of the possibilities of quantitative analysis of text.  I’ll comment on most of them below. But I think most of them–like much quantitative analysis of text in general–suffer from some theoretical shortcomings. Specifically:

  • with the partial exception of the Mohr, Wagner-Pacifici, Breiger, and Bogdanov article, the studies lack a well-conceptualized theory of language, which leads to some conceptual slippage.
  • there is little attention to the conditions of production of text: whose words, and which words, are written down, archived, and digitized.

Continue reading

that faculty impact “study”

I got a call this morning from the Daily Tar Heel because, while UNC was dead last among the 94 universities covered in the study Kieran has been mocking for its invention of an MIT sociology department, I am apparently the third-most-impactful faculty member in that dubious list. Talk about damning with faint praise.

Continue reading

beyond the existence proof

In response to Fabio’s defense of nonrepresentative sampling, Sam Lucas sent his paper, “Beyond the Existence Proof,” published last year. Fabio mentions Lucas’s article in his follow-up, but doesn’t really address the claims in the paper. I hadn’t seen it before Sam sent it, but after reading it I think it’s really smart and deserves attention in methods classes and elsewhere.

Continue reading
