stata 12!

Stata 12 has been announced. The three biggest additions for sociologists are probably:

1. Structural equation modeling, including (sigh) a path-diagram drawing module for folks who cannot figure out what their diagram implies in terms of a set of linear equations. Includes FIML estimation for missing data, which is the SEM counterpart to multiple imputation in terms of missing-data techniques that some folks wishfully believe have mystical powers to surmount fundamental data limitations on inference.

2. The -margins- command added in Stata 11 will be accompanied now by a graphing program, so between the two of those commands one should be able to do just about anything one wants post-estimation with predicted values. -Margins- already is a mighty command, with the caveat that it is extremely easy to use it to generate results that are not quite what you thought you were getting.

3. A -contrast- command that allows you to easily generate the significance tests for various kinds of implied contrasts in a model, so you don’t have to re-estimate models with different dummy variable or interaction specifications in order to get all the significance tests of interest.
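
For the curious, here is a minimal sketch of the idea behind item 3, written in Python with NumPy rather than in Stata. It is not how -contrast- is implemented; it only illustrates that a contrast between two non-reference categories can be tested from a single fit, using the coefficient covariance matrix, and that the answer matches what you would get by re-estimating with a different reference category. The data, the three-category factor, and the helper name `ols` are all made up for the illustration.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and their estimated covariance matrix."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    return beta, sigma2 * XtX_inv

# Hypothetical data: a 3-category factor (0=A, 1=B, 2=C) and an outcome.
rng = np.random.default_rng(0)
group = rng.integers(0, 3, size=300)
y = np.array([1.0, 1.5, 2.5])[group] + rng.normal(size=300)

# Fit once with A as the reference category: columns are [const, B, C].
X_a = np.column_stack([np.ones(300), group == 1, group == 2]).astype(float)
beta_a, V_a = ols(X_a, y)

# The C-versus-B contrast is a linear combination c'beta with variance c'Vc.
c = np.array([0.0, -1.0, 1.0])
contrast_est = c @ beta_a
contrast_se = np.sqrt(c @ V_a @ c)

# The long way around: re-estimate with B as the reference ([const, A, C])
# and read off the coefficient on C.
X_b = np.column_stack([np.ones(300), group == 0, group == 2]).astype(float)
beta_b, V_b = ols(X_b, y)
refit_est = beta_b[2]
refit_se = np.sqrt(V_b[2, 2])

print(f"from one fit: {contrast_est:.4f} (SE {contrast_se:.4f})")
print(f"from a refit: {refit_est:.4f} (SE {refit_se:.4f})")
```

The two lines of output should agree up to floating-point error, since the second model is just a reparameterization of the first.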

The analogy in my mind is that Stata is to the iPhone as R is to Android, as far as social science data analysis goes. I guess SAS would be BlackBerry, insofar as it’s dated and propped up by a strong lock-in among government employees. And SPSS is a Nokia phone that has a slick interface for dialing your friends but requires you to push dozens of extra buttons in a non-intuitive sequence if you want to call anyone new.

[Update: Gabriel beat me to posting about this. He’s also enthusiastic about the addition of contour plots and the ability to export graphs as PDF.]

Author: jeremy

I am the Ethel and John Lindgren Professor of Sociology and a Faculty Fellow in the Institute for Policy Research at Northwestern University.

14 thoughts on “stata 12!”

  1. “Includes FIML estimation for missing data, which is the SEM counterpart to multiple imputation in terms of missing-data techniques that some folks wishfully believe have mystical powers to surmount fundamental data limitations on inference.”

    I remember talking to a colleague at Arizona about this a few years ago. My lasting impression is that this method seemed to do something that I’d previously been led to understand was straightforwardly impossible—as I recall the idea was to use all the casewise data even if cases were missing observations on some variables, but without actually imputing any missing values prior to estimation of the model. My colleague tried to explain the guts of it to me but failed because they had only had a finite amount of time available whereas I had an infinite supply of blank looks.

    1. Actually, a FIML estimator is pretty straightforward and requires the same assumptions as MI. Allison (2002) covers it in his Sage book and Enders’ (2010) recent text on missing data provides a nice numerical walkthrough of how the estimator works if you’re interested. It’s very nice to see it implemented in Stata.

      1. The premise with multiple imputation is you take a series of random draws for each missing data point from a probability distribution of predicted values. As long as you are making random draws from a distribution, hey, why not use the distribution itself if it’s computationally tractable? If your imputation method is ML, it’d be just the same as what you’d get from MI if you drew an infinite number of imputations.

        At least that’s the three sentence version of how I’ve understood it, but anybody should feel free to correct me if I’m wrong.
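
A toy simulation can make the comment above concrete. This is only a sketch in Python with NumPy, under assumptions that are mine rather than the commenter's: a normal linear model for y given x, y missing at random given x, and imputation draws that hold the regression parameters fixed at their complete-case estimates (a full MI procedure would also draw the parameters). It says nothing about how Stata's FIML estimator is implemented. The point is just that the mean of y estimated by "using the distribution itself" and the mean from a large number of random imputations land in the same place.

```python
import numpy as np

rng = np.random.default_rng(12)
n = 500
x = rng.normal(size=n)
y = 1.0 + 0.8 * x + rng.normal(size=n)

# Make y missing more often when x is high (missing at random given x).
missing = rng.random(n) < 1 / (1 + np.exp(-x))
obs = ~missing

# Regression of y on x among complete cases.
b, a = np.polyfit(x[obs], y[obs], 1)  # polyfit returns slope, then intercept
resid_sd = np.std(y[obs] - (a + b * x[obs]))

# "Use the distribution itself": estimate E[y] using every case's x.
ml_mean = a + b * x.mean()

# Multiple imputation: fill in each missing y with a random draw, many times.
M = 2000
mi_means = np.empty(M)
for m in range(M):
    y_imp = y.copy()
    draws = rng.normal(scale=resid_sd, size=missing.sum())
    y_imp[missing] = a + b * x[missing] + draws
    mi_means[m] = y_imp.mean()
mi_mean = mi_means.mean()

print(f"complete-case mean: {y[obs].mean():.3f}")
print(f"direct estimate:    {ml_mean:.3f}")
print(f"MI mean (M={M}):  {mi_mean:.3f}")
```

The complete-case mean is pulled down because high-x cases are more likely to be missing, while the other two estimates differ from each other only by simulation noise that shrinks as M grows.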

  2. “My colleague tried to explain the guts of it to me but failed because they had only had a finite amount of time available whereas I had an infinite supply of blank looks.”

    Stealing that for the next time I have to talk with an econometrician.

  3. While I’m happy Stata has SEM, I wouldn’t use the SEM component for teaching SEM to beginners, that’s for sure. The path drawing module and the syntax are a bit too simplistic to teach with and could allow for some big errors to occur when students are trying relentlessly to “get a picture to come up” from their data. But, it seems better than AMOS…I guess I’ll have to play with things when I update.

    All in all though, some good additions to a great stats program.

  4. “The premise with multiple imputation is you take a series of random draws for each missing data point from a probability distribution of predicted values. As long as you are making random draws from a distribution, hey, why not use the distribution itself if it’s computationally tractable? If your imputation method is ML, it’d be just the same as what you’d get from MI if you drew an infinite number of imputations.”

    Clever. I look forward to a generalization of this approach that dispenses with the need for data altogether. In the meantime, I can imagine frustrated graduate students making extensive use of the related FML estimator.
