It is happening again. Lawyers who don’t know anything about data analysis make a request (often an official open records request) for information from agencies. Because I have gotten a reputation for analyzing public data and making it reveal previously unseen patterns, and because I’m part of the commission or board or task force, they ask me to analyze it. This time the open records request vaguely asked for information on the racial breakouts of arrests and traffic stops and was sent to the two dozen law enforcement agencies in the county. Some agencies use the same software and gave the job to the same IT guy, who responded with multi-page spreadsheets that are all in the same format, which includes the breakouts by race for 64 offense groups and five citation/stop categories. That’s not so bad, except of course that there are several hours of work needed to merge the spreadsheets into a common dataset. And there aren’t any municipal-level census estimates, so there is the minor problem of how to turn this stuff into rates. But this isn’t the only thing that came in. The two biggest agencies responded with dumps of all charges onto CDs. To turn them into the appropriate counts, you first need to collapse the charges down to incidents (as the same person can have multiple charges in an arrest or traffic stop) using the date and time of the contact and the date of birth of the offender, following some sort of protocol for which offense to treat as the “most serious” offense, and then collapse the zillions of specific offenses into a smaller number of categories. Of course there is no crosswalk file for linking the specific offenses to the 64 standard categories used by the agencies that used the same format, to the Uniform Crime Reports categories, or to the severity codes, nor did the agencies include these fields in their data dump, even though they must have that information for their own needs.
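To give a sense of what that collapsing step involves, here is a minimal sketch in Python. Everything specific in it is an assumption made for illustration: the field names (`dob`, `contact_time`, `statute`, `race`), the statute codes, and above all the crosswalk itself, which is exactly the file the agencies did not provide. It treats all charges with the same date of birth and contact time as one incident and keeps the charge with the lowest severity rank as the “most serious” one.

```python
from collections import Counter

# HYPOTHETICAL crosswalk: statute code -> (offense category, severity rank).
# Lower rank means more serious. The codes and categories are invented;
# a real crosswalk would have to come from the agencies or be built by hand.
CROSSWALK = {
    "A100": ("Assault", 1),
    "B200": ("Drug", 2),
    "C300": ("Disorderly conduct", 3),
    "D400": ("Speeding", 4),
}


def collapse_to_incidents(charges, crosswalk=CROSSWALK):
    """Collapse charge rows into one row per incident.

    An incident is approximated as all charges sharing the same date of
    birth and contact date/time. Two different people with the same
    birthday contacted at the same minute would be merged, so this is
    only an approximation.
    """
    incidents = {}
    for charge in charges:
        key = (charge["dob"], charge["contact_time"])
        # Statutes missing from the crosswalk are kept but ranked last.
        category, severity = crosswalk.get(charge["statute"], ("Unmapped", 99))
        best = incidents.get(key)
        if best is None or severity < best["severity"]:
            incidents[key] = {
                "race": charge["race"],
                "category": category,
                "severity": severity,
            }
    return list(incidents.values())


def counts_by_category_and_race(incidents):
    """Count incidents by (most serious offense category, race)."""
    return Counter((i["category"], i["race"]) for i in incidents)
```

Even with this in hand, the result is only as good as the crosswalk and the severity ordering, and neither of those exists for these data dumps.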
The most passive-aggressive agency did not even include the offense description field, just the statute number. The rest of the agencies responded in a wide variety of ways, including PDFs of pie charts of the racial breakouts of traffic stops, counts by race that summed across arrests and traffic stops, or emails that said “everyone we arrested was white” (no counts given). But the most amazing was an agency that actually sent a file that is a VIDEO with slides of their report that rotate in 3-D space and a voice-over describing what was on each page. I am not kidding! A freaking video.
The lawyers don’t understand why I’m saying it isn’t worth my time to try to “analyze” this mess. They say, “We asked for it, it will look bad if we don’t include the results in the report.” And they complain because the information I do have for them is “out of date.” Arrgh.