stata: roll your own palettes

I realize all the cool kids have switched to R, but if you still work with Stata, you may be interested in some routines I worked up to generate color and line pattern palettes and customize graphs fairly easily with macros and loops. This is useful to me because I am generating line graphs showing the trends for 17 different offense groups. Some preliminary tricks, then the code.
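For readers who have not built graph commands with macros and loops before, here is a minimal sketch of the general idea. It is not the code from the full post. It assumes Stata's bundled uslifeexp example dataset (three series, not my 17 offense groups), and the palette contents and the macro names colors, patterns, vars and plots are made up for illustration.

```stata
* Illustrative sketch only: dataset, palettes and macro names are assumptions.
sysuse uslifeexp, clear

* One color and one line pattern per series, stored as word lists.
local colors   "navy maroon forest_green"
local patterns "solid dash shortdash"
local vars     "le_male le_female le"

* Build one (line ...) plot per variable, picking the i-th palette entry.
local plots ""
local i = 0
foreach v of local vars {
    local ++i
    local c : word `i' of `colors'
    local p : word `i' of `patterns'
    local plots `plots' (line `v' year, lcolor(`c') lpattern(`p'))
}

* Draw all the series in a single graph.
twoway `plots'
```

Run it as a do-file in one go, since local macros disappear between separate interactive runs. Adding a series then means adding one word to each list rather than editing a long twoway command by hand.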

cumulative frequency in stata

This would have saved me HOURS of work several months ago. Where I grew up, the word “sum” means “add things up.” In Stata, it turns out, as a generate function it means what Stata-speakers call a “running sum” (a term I have never heard or used, although I do speak about the running total in my bank account) and what I was always taught to call a “cumulative frequency.” The egen sum function I have been using is REALLY named “total”: if you use “sum” in egen, it works as an undocumented alias for total. I can only guess that this generate function got named “sum” because the obvious abbreviation for a cumulative frequency is an obscene word, but if you look up sum in the Statalist threads, you’ll see that many people assume the function does what the word means, and this misnaming is a source of endless confusion. In fact, the regular posters (with no apparent sense of irony) call this the most under-utilized function in Stata. Do you suppose misnaming it might have something to do with this? Cumsum or, if you are squeamish, runsum, would have been a better name.

Even just better cross-referencing in the help files would improve the documentation. If you search “cumulative” inside Stata, the function does not come up (probably because the word cumulative is never used in its description). The closest you can get is on the second page of the hits, where you’ll get this FAQ: “How do I tabulate cumulative frequencies?” and a link to: http://www.stata.com/support/faqs/data/tabdisp.html
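To make the difference between the two functions concrete, here is a small sketch on made-up data. The variable names g, t, x, runsum, grand_total and grp_runsum are invented for the example.

```stata
* Toy data: two groups (g), an ordering variable (t), and a count (x).
clear
input g t x
1 1 1
1 2 2
2 1 3
2 2 4
end

* generate's sum() is a running sum: values are 1, 3, 6, 10.
generate runsum = sum(x)

* egen's total() is the grand total, repeated on every row: 10.
egen grand_total = total(x)

* Running sum restarted within each group, in t order: 1, 3, 3, 7.
bysort g (t): generate grp_runsum = sum(x)

list
```

The bysort version is the one I would have needed for cumulative frequencies within categories. Putting t in parentheses fixes the sort order within each group, so the running sum does not depend on how the rows happened to be ordered.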

lots of tables?

Let me be very clear about what I’m asking. This is about tables of means, lots of them, 65 to 100 pages of them, integrated with explanatory text. Most social scientists don’t do this kind of stuff, but I do in my “public sociology” work. I need to generate reports that integrate text with LOTS of tables of means cross-classified by race, offense, geographic location, and type of statistic. I need to be able to control what headers get put on columns, and I need every table to be output with enough information to be absolutely sure what it is. I have a lot of better things to do with my time than spend hours merging files and reformatting tables. I really need to automate this stuff. I use Stata, I love Stata, but I’m about ready to kill either myself or Stata.