Stata resource and question

I’ve been googling around trying to solve an annoying problem using Stata graph bar. In the process, I came across this totally awesome page of user-written graph types. I realize this is appealing only to a certain subset of us.

What I’m trying to do is control Stata’s graph bar with three over variables. It is an extremely cumbersome little product: it will basically do the job with three “over” variables, but it has no obvious way to let me fine-tune details like exactly which color to assign to each series of bars, or to make sure that the order of labels in the legend matches the order of the bars in the graph if I’ve used the sort option to change the bar order. You can reset the color of each bar individually, but that isn’t what I want to do: I want every bar in a series to have the same color, and doing this bar by bar is both tedious and error-prone. Stata knows enough to give each series a distinct color, but does not seem to want to let the user choose the color on a series-wise basis. If you happen to be a Stata graph wonk and want to provide some free consulting, do let me know. I’ve already read through the Statalist archives enough to know that other people also find it difficult to do exactly what they want with graph bar. In case you are wondering, the independent variables are categorical and twoway bar does not help at all.

Edit: I decided to jazz this up with a sample. Note that I put number prefixes on the race labels to get them in the same order as the bars. Otherwise Stata would alphabetize them! I could use the legend(order()) option, but there’s a risk that the order I type would not coincide with the actual categories associated with each bar. I’d like to shade the bars so White is light, Black is dark, and the disparity is medium.

Author: olderwoman

I'm a sociology professor but not only a sociology professor. It isn't hard to figure out my real name if you want to, but I keep it out of this blog because I don't want my name associated with it in a Google search. Although I never write anything in a public forum like a blog that I'd be ashamed to have associated with my name (and you shouldn't either!), it is illegal for me to use my position as a public employee to advance my religious or political views, and the pseudonym helps to preserve the distinction between my public and private identities. The pseudonym also helps to protect the people I may write about in describing public or semi-public events I've been involved with.

13 thoughts on “Stata resource and question”

  1. The command that produced the sample graph is:
    graph bar (asis) r2 if gp1 & area=="" & weight=="" & (rinit=="b" | rinit=="w" | rinit=="d") , ///
    over(race, sort(rsort)) over(vlab, label(angle(forty_five) labsize(vsmall))) over(modlab, sort(mnum)) ///
    legend( span row(1) size(small) region(lwidth(none))) ytitle(R2) ///
    title("R2 for regression of prison admission rates on dummy variables for year and state", size(medsmall))
    graph export "PrisAdmits_year_state_alt.wmf", replace

  2. OW,

    I had a comparable problem in generating the graphs for my book. I solved it with the “program” syntax to write what is essentially a wrapper command to do the argument passing.

    Here’s an example (if WordPress eats the formatting I’ll e-mail you). It’s not exactly the same issue (for one thing, I’m making line and area graphs) but it may give you ideas about argument passing, especially since you’ve already figured out the syntax for your sample graph.

    capture program drop songgraph_all
    program define songgraph_all
    
    	local artist      `1'
    	local song        `2'
    	local suffix      `3' /*file suffix to use */ 
    	local subsample2  `4' /*variable to restrict sample to, eg format*/
    	local subsample2x `5' /*subsample2 value (eg, Top 40)*/
    
    	disp "`artist' `song' `suffix' `subsample2' `subsample2x'"
    
    	if "`suffix'"=="" {
    		local suffix="all" 	
    	}
    
    	set more off
    	
    	disp "`suffix'"
    	use final_`suffix'.dta, clear
    	quietly keep if artist=="`artist'" & song=="`song'" 
    
    	if "`subsample2x'"~="" {
    		quietly keep if `subsample2'=="`subsample2x'"
    	}
    	quietly drop if Nt_inc_p>98
    	quietly sum date
    	local mindate=`r(min)'
    	local maxdate=`r(max)'
    	local interval=(`maxdate'-`mindate')/10
    	local interval=round(`interval',7)
    
    	twoway (area Nt_inc date, lwidth(medthick) fcolor(ltbluishgray)) , xtitle("") xmtick(`mindate'(7)`maxdate') xlabel(`mindate'(`interval')`maxdate', labsize(vsmall) angle(forty_five) format(%tdMon_dd,_CCYY)) ytitle(Number of Stations Playing) ytitle(, size(small)) ylabel(, labsize(small) format(%9.2g))
    
    	local artist= lower(subinstr("`artist'"," ","",.))
    	local song= lower(subinstr("`song'"," ","",.))
    
    	if "`subsample2x'"=="" {
    		graph export $images/`artist'_`song'.pdf , replace
    	}
    	else {
    		local subsample2x= subinstr("`subsample2x'"," ","",.)
    		graph export $images/`artist'_`song'_`subsample2x'.pdf , replace
    	}
    end

  3. Thanks, Gabriel. If I do say so myself, I’m a whiz at passing arguments to graphs and generating zillions of self-labeling graphs, including passing color parameters to them and adjusting graph scales to the range of each variable at a time. But I still can’t figure out how to overcome the graph bar command’s seeming inability to let you change the color of a bar series defined by the categories in an over(var) group, rather than one bar at a time by bar number.

    1. I guess my point is that if you write a wrapper to do the argument passing then the command itself can be really long and repetitive. For instance, you could do it one bar at a time by bar number since the wrapper would handle this for you.
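
      A minimal sketch of that idea, using the auto dataset that ships with Stata rather than the prison data: a loop writes the repetitive bar(#, color()) options so nobody has to type them. The pricegrp variable and the three gray shades are made up for illustration. The sketch also assumes asyvars is in effect, in which case each bar(#) applies to one category of the first over() group; it has not been checked against the three-over() graph in the post.

      ```stata
      * Sketch only: let a loop write the per-bar options.
      sysuse auto, clear
      xtile pricegrp = price, nq(3)      // hypothetical 3-category grouping variable

      local shades gs12 gs4 gs8          // placeholder shades, one per series
      local baropts
      local i = 0
      foreach c of local shades {
          local ++i
          local baropts `baropts' bar(`i', color(`c'))
      }

      * Show the generated options, then pass them to graph bar.
      display "`baropts'"
      graph bar (mean) mpg, over(pricegrp) over(foreign) asyvars `baropts'
      ```

      Recoloring the graph then means editing only the shades local.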

      Unfortunately I don’t have much experience with bar graphs so I can’t give any more specific advice than that.

      good luck

  4. I’ll be no help on how to make the graphs. But now that you’ve piqued our curiosity with the data, could you tell us how all those variables are defined?

  5. MB: Well, the title is supposed to tell all. Each bar is an R2 from a regression equation. The dependent variables are the rates of entering prison. These rates are categorized as total (all admits), new sentences, and revocations, and by race (White, Black, and the White/Black disparity). The independent variables are dummies for each year 1985-2002, for each state, and the combination of state and year dummies. The point of the graph is that there are huge fixed effects of place (state) and additional effects of time (year), and that there really isn’t much variation left over to be explained as a within-state effect that isn’t part of the national time trend (combined R2s are over .7). Also the graph indicates that Black prison sentences had an especially strong year effect and a correspondingly lower place effect.

    The idea is that this graphic should be easier to absorb visually than a table of numbers. But making it easy to “read” visually involves using order and color to make the story pop out to the eye.

  6. I would bet someone on Statalist would know if this is possible. My best guess is that it is possible with a custom scheme, assuming you wanted the colors in the same order. Unfortunately, schemes are not well documented, presumably because the Stata folk feel that their time is better spent doing things of interest to more people.

  7. Neal is correct, schemes are the best way to do this I believe.

    Schemes determine the defaults of how graphs are displayed. There are a whole bunch of options in there that you probably don’t need to mess with, especially since you only want to change the color defaults. If I understand what you are trying to do (big if), I think that it is relatively easy to do.

    It looks like you are using the s2mono scheme. If so, open up a new file in the Do-file Editor or a text editor. On the first line, write:

    #include s2mono

    Then, after that, determine the colors that you want for each of your series (in this case it looks like three). Once you do so, write on the next three lines:

    color p1     green
    color p2     red
    color p3     blue
    

    You can, of course, change the colors to any color using the colorstyle syntax including defining RGB colors or colors by intensity. You can define up to 15 colors if you might have different colors in the future.

    Now, we need to save that file somewhere on your ado path. Probably the best place is your PERSONAL directory (if you don’t know where that is, type display c(sysdir_personal) in the command window, or type sysdir to list all of the system directories). Save the file as something following the format:

    scheme-schemename.scheme

    (So you could save it as scheme-s2monobars.scheme). Now, add the option scheme(s2monobars) to your graph bar command (or substitute what you decide to call the scheme for the option) and it should use your new defaults.
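
    Put together, the steps above might look like the sketch below. The gray shades (light, dark, medium, in series order) and the auto-dataset example are assumptions chosen for illustration, not taken from the original graph. If the bars keep their old shades, the parent scheme may define more specific bar color entries that take precedence over p1 to p3; reading the parent scheme file is the way to check.

    ```stata
    * Contents of scheme-s2monobars.scheme (a plain-text file, not a do-file),
    * saved in the PERSONAL directory:
    *
    *   #include s2mono
    *   color p1     gs12
    *   color p2     gs4
    *   color p3     gs8

    * In a do-file, once that file is on the ado path
    * (graph bar will stop with an error if the scheme cannot be found):
    sysuse auto, clear
    xtile pricegrp = price, nq(3)      // hypothetical 3-category grouping variable
    graph bar (mean) mpg, over(pricegrp) over(foreign) asyvars scheme(s2monobars)
    ```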

    Hope that helps and actually responds to what you are looking for.

  8. Thanks, Mike, it does. Sort of. I knew it would be schemes, but my last attempt to create a custom scheme failed to work properly, so it seemed like a waste of time to invest in that. The manual says you can create schemes but not how to do it. But at least now I have some sample code to play with.

    1. A lot of Stata is itself written in Stata as plain text scripts (as compared to compiled code) so if all else fails you can get example code by reading these scripts. On a Mac the schemes are a bunch of files with the “.scheme” extension in the directory “/Applications/Stata/ado/base/s/”. Not sure where they’d be in Windows, but you can ask Stata with the “sysdir list” command.
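
      A short sketch of how to locate and read one of those files from inside Stata on any platform. It assumes the s2mono scheme file is installed under its usual name, scheme-s2mono.scheme.

      ```stata
      * List the system directories (BASE, SITE, PLUS, PERSONAL, ...).
      sysdir list

      * Search the ado path for the scheme file; the full path is left in r(fn).
      findfile scheme-s2mono.scheme
      display "`r(fn)'"

      * Print the file in the Results window to see which entries it defines.
      type "`r(fn)'"
      ```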

    2. Olderwoman, I think that StataCorp improved scheme support in the past couple of versions because I also remember looking at schemes previously and having a great deal of difficulty. I’m not sure when they included the #include command, but it seems like that makes a huge difference because you don’t need to recreate the whole file. In addition to my code above, there is a decent example code for a scheme here.
