Category Archives: methods

what does ‘why’ mean?

A couple of weeks ago I got into a friendly back-and-forth on Twitter with my friend and colleague Daniel Kreiss. Daniel was annoyed by this article, which deploys median-voter theory to purportedly reveal why Mitt Romney chose Paul Ryan as his running mate. Daniel’s frustration was this:


Here’s the record of our conversation. More thoughts below the break.

Continue reading

ask a scatterbrain: managing workflow.

As I have admitted before, I am a terrible electronic file-keeper. If I were to count up the minutes I have wasted in the last 15 years searching for files that should have been easy to find, or typing and retyping Stata code that would have (and should have) been a simple do-file, or doing web searches for things I had read and wanted to include in lectures or PowerPoints or articles but couldn’t place, I fear I would discover many months of my life wasted as a result of my organizational ineptitude.

For a long while, these bad habits only affected me (and the occasional collaborator). It was my wasted time and effort. Now, though, expectations are changing and this type of disorganization can make or break a career. I think about my dissertation data and related files, strewn about floppy disks and disparate folders, and I feel both shame and fear. Continue reading

big data hubris

A not-very-important, yet instructive, series of events on Friday offers a cautionary tale about the allure of big data and the fashionable mistrust of local knowledge.

Continue reading

coding, language, biernacki redux

Dylan Riley’s Contemporary Sociology review (paywall, sorry) of Biernacki’s Reinventing Evidence is out, and an odd review it is. H/T to Dan for noting it and sending it along. The essence of the review: Biernacki is right even though his evidence and argument are wrong. This controversy, along with a nearly diametrically opposed one on topic modeling (continued here), suggests to me that cultural sociology desperately needs a theory of language if we’re going to keep using texts as windows into culture (which, of course, we are). Topic modeling’s approach to language is intentionally atheoretical; Biernacki’s is disingenuously so.

Continue reading

productivity, sexism, or a less sexy explanation.

I apparently attended the same session at the ASA conference as Scott Jaschik yesterday, one on Gender and Work in the Academy. He must have been the guy with the press badge who couldn’t wait to fact-check his notes during the Q&A.

The first presenter, Kate Weisshaar from Stanford University, started the session off with a bang with her presentation on the glass ceiling in academia, asking whether productivity explains women’s under-representation among the ranks of the tenured (or their attrition to lower-ranked programs or out of academia altogether). A summary of her findings – and a bit of detail about the session and the session organizer’s response to her presentation – appeared in Inside Higher Ed today. Continue reading

a study in scarlet

(N=100 on the left, N=24 on the right, one data point per person, observational study)

Andrew Gelman and I exchanged e-mails a while back after I made his lexicon for a second time. That prompted me to check out his article in Slate about a study published in Psychological Science finding women were more likely to wear red/pink when at “high conception risk,” and then I read the original article.

I don’t want to get into Gelman’s critique, although notably it included whether the authors were correct to measure “high conception risk” as 6-14 days after a woman starts menstruating (see Gelman’s response to the authors’ response about this). And I’m not here to offer an additional critique of my own.

I’m just looking at the graph and marveling at the reported effect size, and inviting you to do the same. Of the women wearing red in this study, 3 out of 4 were at high conception risk. Of the women not wearing red, only 2 out of 5 were.*
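To put a number on that effect size, here is a minimal back-of-the-envelope sketch in Python. It uses only the rounded proportions quoted above, not the study’s actual cell counts, so treat the output as approximate:

```python
# Back-of-the-envelope calculation from the reported proportions only,
# not from the original study data.

# Among women wearing red, roughly 3 of 4 were at high conception risk;
# among women not wearing red, roughly 2 of 5 were.
p_red = 3 / 4
p_not_red = 2 / 5

# Risk ratio: how much more common high conception risk is among red-wearers.
risk_ratio = p_red / p_not_red

# Odds ratio: the scale a logistic regression would report on.
odds_ratio = (p_red / (1 - p_red)) / (p_not_red / (1 - p_not_red))

print(f"risk ratio: {risk_ratio:.2f}")   # about 1.88
print(f"odds ratio: {odds_ratio:.2f}")   # about 4.50
```

On those rounded figures the implied odds ratio is around 4.5, which is the sort of magnitude worth marveling at.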

UPDATE: Self-indulgent even by blog standards, but since I could see using this example again somewhere and it took some effort to reconstruct, I’m going to paste in the cross-tab here: Continue reading

two problems with the facebook study debate

Like much of the sociology blogosphere, I’ve been following the debate over the recent Facebook emotion study pretty closely. (For a quick introduction to the controversy, check out Beth Berman’s post over at Orgtheory.) While I agree that the study is an important marker of what’s coming (and what’s already here), and thus worth our time to debate, I think the overall discussion could be improved by refocusing the debate in two major ways.

Continue reading

does a positive result actually support a hypothesis?

First thing: is it in The Goldilocks Spot? For this, do not look at the result itself, but look at the confidence interval around the result.  Ask yourself two questions:

1. Is the lower bound of the interval large enough that if that were the true effect, we wouldn’t think the result is substantively trivial?

2. Is the upper bound of the interval small enough that if that were the point estimate, we wouldn’t think the result was implausible?

(Of course, from this we move to questions about whether the study was well-designed so that the estimated effects are actually estimates of what we mean to be estimating, etc. But this seems like a first hurdle for considering whether what is presented as a positive result should be interpreted as possibly such.)

Caveat: Note that this also assumes the hypothesis is a hypothesis that the effect in question is not trivial, and hypotheses may vary in this respect.
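Here is a minimal sketch of that two-question check in Python, assuming a hypothesis about a positive effect; the triviality and plausibility cutoffs are made-up numbers that the investigator would have to defend substantively:

```python
def goldilocks_check(ci_lower, ci_upper, trivial_effect, implausible_effect):
    """Apply the two 'Goldilocks Spot' questions to a confidence interval.

    trivial_effect:     largest effect we would still call substantively trivial
    implausible_effect: smallest effect we would consider implausibly large
    (Both cutoffs are judgment calls, set before looking at the result.)
    """
    # Q1: if the lower bound were the true effect, would it still be non-trivial?
    passes_q1 = ci_lower > trivial_effect
    # Q2: if the upper bound were the point estimate, would it still be plausible?
    passes_q2 = ci_upper < implausible_effect
    return passes_q1 and passes_q2

# Hypothetical example: a 95% CI of (0.03, 0.45) for a standardized effect,
# treating anything below 0.05 as trivial and anything above 1.0 as implausible.
print(goldilocks_check(0.03, 0.45, trivial_effect=0.05, implausible_effect=1.0))  # False (fails Q1)
```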

ya down with o.t.t.?

My last post raised some comments about one-tailed tests versus two-tailed tests, including a post by LJ Zigerell here.  I’ve returned today to an out-of-control inbox after several days of snorkeling, sea kayaking, and spotting platypuses, so I haven’t given this much new thought.

Whatever philosophical grounding there is for one-tailed vs. two-tailed tests is vitiated by the reality that in practice one-tailed tests are largely invoked so that one can talk about results in a way that one is precluded from doing if held to two-tailed p-values. Gabriel notes that this is p-hacking, and he’s right. But it’s p-hacking of the best sort, because it’s right out in the open and doesn’t change the magnitude of the coefficient.* So it’s vastly preferable to any hidden practice that biases the coefficient upward to get it below the two-tailed p < .05.

In general, I’ve largely swung to the view that practices that allow people to talk about results near .05 as providing sort-of evidence for a hypothesis are better than the mischief caused by using .05 as a gatekeeper for whether or not results can get into journals. What keeps me from committing to this position is that I’m not sure it doesn’t just change the situation so that .10 becomes the gatekeeper. In any event: if we are sticking to a world of p-values and hypothesis testing, I suspect I would be much happier in a world in which investigators were expected to articulate what would constitute a substantively trivial effect with respect to a hypothesis, and then use a directional test against that.

* I make this argument as a side point in a conditionally accepted paper, the main point of which will be saved for another day.
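As a sketch of the suggestion in the previous paragraph (testing directionally against a pre-specified substantively-trivial effect rather than against zero), here is a minimal Python version, with made-up numbers and assuming an approximately normal estimator:

```python
from scipy import stats

def directional_test_against_trivial(estimate, std_error, trivial_effect):
    """One-sided test of H0: true effect <= trivial_effect
    against H1: true effect > trivial_effect."""
    z = (estimate - trivial_effect) / std_error
    p_one_sided = stats.norm.sf(z)  # upper-tail probability
    return z, p_one_sided

# Hypothetical numbers: estimate 0.30, standard error 0.12,
# with effects below 0.10 declared substantively trivial in advance.
z, p = directional_test_against_trivial(0.30, 0.12, trivial_effect=0.10)
print(f"z = {z:.2f}, one-sided p = {p:.3f}")
```

The familiar one-tailed test against zero is just the special case where trivial_effect is set to zero; the point of the threshold is to make the directionality buy something substantive rather than merely halving the p-value.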

should social scientists allow non-directional hypotheses? any examples?

I recently read a paper in which the author hadn’t stated directional hypotheses. They were just of the form that X was expected to be associated with Y. My reaction was just that a non-directional hypothesis is not much of a hypothesis, and I made a comment along the lines of, “you should probably clarify your ideas until you have more of a sense of how X and Y might be associated before you try to test the hypothesis with data.”

This leads me to wonder whether I have a more general position about the specificity required for something to be a substantively meaningful social science hypothesis. Does anyone have an example of something in social science where the hypothesis is non-directional (or is just a hypothesis that something “matters”), and yet the hypothesis is not trivial? If so, please let me know.

scientific misconduct as choose-your-own-adventure video game!

Kelly pointed to this in the comments on my last post: a CYOA game in which you get to take on different roles and try to prevent a science fraud scandal from happening.

As a connoisseur of CYOA, I don’t think it works that well as a game — too much stuff before and between choices — but the videos have surprisingly high production values and script quality for something like this.

the dilemma of the subordinate methodologist

I’m not sure if the author of this post is a graduate student or an undergraduate, but I found it an intriguing statement about the bind younger people interested in methodology can find themselves in while working with established, productive people who are deeply steeped in conventional practices. Quote:

One thing that never really comes up when people talk about “Questionable Research Practices,” is what to do when you’re a junior in the field and someone your senior suggests that you partake. […] I don’t want to engage in what I know is a bad research practice, but I don’t have a choice. I can’t afford to burn bridges when those same bridges are the only things that get me over the water and into a job. (HT: @lakens)

Mostly this is just a statement about power.* But it’s also maybe a statement about what can happen when developments allow the possibility of radical doubt to settle upon a field. Normally a junior person can have methodological doubts, but still think, “Well, these people must know what they are doing, because it’s been successful for them and so ultimately in practice it works, right?” But what happens when you have developments that lead to a lot of people starting to whisper and murmur and talk about how maybe it doesn’t work?

* I mean power in the ordinary sociological sense, not my ongoing obsession with statistical power.

why log?

This is intended as a friendly didactic post, not an addition to my various criticisms of the hurricane name study. But I do use that data and model. Frankly, I suspect I’ll be thinking about the lessons from that study for a while and using it as a teaching example for years.

I’ve said that substantively it makes more sense to log the measure of hurricane damage, and that the model fits better when you do, even though the key result of their paper is no longer statistically significant. I worry the point may seem arcane or persnickety. So below the jump are a couple of graphs that show the substantive difference that this actually makes over the range of damage observed in their data. (Note the scales of the y-axis.)

Continue reading
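For anyone who wants to see the kind of comparison I mean, here is a minimal sketch in Python with statsmodels. The file name and column names (deaths, damage) are placeholders rather than the study’s actual variable names, and this is not their full model specification, just the raw-versus-logged contrast:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Placeholder: assume one row per hurricane, with a death count and a
# (highly skewed) dollar-damage measure.
df = pd.read_csv("hurricanes.csv")

# Same outcome and same family; the only difference is how damage enters.
raw_fit = smf.glm("deaths ~ damage", data=df,
                  family=sm.families.NegativeBinomial()).fit()
log_fit = smf.glm("deaths ~ np.log(damage)", data=df,
                  family=sm.families.NegativeBinomial()).fit()

# Lower AIC indicates better fit; the logged specification is the one
# I am arguing fits these data better.
print("raw damage AIC:", raw_fit.aic)
print("log damage AIC:", log_fit.aic)
```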

talk may be cheap, but meaning is pricey

For those who haven’t yet seen it, there’s a very interesting article by Colin Jerolmack and our own Shamus Khan, along with critiques and rejoinder. The article, “Talk is Cheap,” examines the fact that what people say is not the same as what they do (the problem of “Attitude-Behavior Consistency,” or ABC). They argue that ethnography is therefore the better way to ascertain behavior, because ethnographers actually observe behavior itself rather than relying on actors’ often-inaccurate accounts of it. And since sociologists are held to be concerned primarily with social action — an assumption I’ll address below — ethnography (along with, by the way, audit studies such as Quillian and Pager’s) is the better approach.

Continue reading

more on text analysis

Laura Nelson has an excellent discussion of topic modeling on badhessian, which in part takes me to task for my comments on the Poetics issue on topic modeling. Unfortunately, the Disqus system that handles comments there doesn’t like me and has eaten my comments twice. So I’m posting them here, and perhaps someone smarter than I am can make them into a bona fide comment on the site.

Continue reading

