This tutorial goes over how to conduct a simple outcome evaluation using my work, the Weighted Displacement Difference (WDD) test (Wheeler and Ratcliffe, 2018. We are going to be doing some simplified analysis of the effect of community prosecution hot spots areas in Dallas (see Worrall and Wheeler, 2019 for the full analysis).
To follow along, download the data (PFA_Data.csv
and SpreadSheet_WDD.xlsx
) from ELearning, or from this dropbox link.
In this part, we are going to create several variables that help us later on when we construct our pivot table. First we are going to make a variable (using the incident level data), that identifies what incidents are associated with our control areas and treatment areas. Second, we are going to make a variable that identifies whether the incident took place before the intervention or after the intervention.
Our treatment area is going to be the Forest/Audelia PFA area, and our control area is going to be the TAAG area Forest/Dennis. I choose Forest/Dennis as a control area because it had similar crime counts to Audelia, and the nature of the area was similar (commercial), but it is on the other side of the city.
So open up the PFA_Data.csv
data file, and turn the data into a table. The scroll all the way to the right at the end of the variable list.
Double click the right side of the column AE and AF (TargetAreaActionGrids, the long name for TAAG, and Community) to make sure the fields do not hide any of the text. In the TAAG field, filter to the area named Forest Dennis
.
This TAAG area is going to be our control area. Select the top cell that is viewable in AJ, and we are going to type an IF statement to identify this area as our control. Type in the formula =IF([@TargetAreaActionGrids]="Forest Dennis","Control","????")
. And you should see the column auto-update. (Make sure to type Forest Dennis
exactly as it should be, otherwise the equal condition will not be satisfied.) What this IF statement does is cases that meet the equality condition, it populates the text field with the word “Control”. For those not meeting the condition, it populates the field with the text of question marks.
You should now see that the field now states “Control” for all of the areas that Forest Dennis is listed as a TAAG area. Now clear that filter in the TargetAreaActionGrids field.
Our PFA treated areas are located in the Community field. In the dropdown select ForestAudelia_PFA.
Now instead of the ????
, we are going to nest an additional if statement in column AJ. So inplace of the question marks, type in a second if statement identifying those Community PFA Forest/Audelia areas as treated locations. In the end my formula looks like =IF([@TargetAreaActionGrids]="Forest Dennis","Control",IF([@Community]="ForestAudelia_PFA","Treatment","Other"))
.
Go ahead and rename that field column as AreaType. Then create a new column that says InterventionDate in column AK. In that field type the formula =DATEVALUE("1/1/2016")
. This is when the intervention started in the treatment area, and will just look like a number currently (42370), not a date. To fix this, right click on AK and select Format Cells.
Then select date, and pick a date format, then click OK. You should see the cells are the correct date, the January 1st 2016.
Now we are going to make another variable that identifies whether the incident occurred after the intervention took place, or before it took place. Name this field “IntTime”, and then write an IF statement to see if the date the incident was reported (column V) was before 1/1/2016, label it Pre. If not, label it Post. (You can do this multiple ways, but the screenshot below shows how I did it in this instance.)
Now we have all of our ingredients to get the statistics we want to calculate the WDD. Go ahead and insert a pivot table into your spreadsheet (saving your work now is a good idea as well). In your pivot table, drag AreaType and IntTime into the rows area. In the Row Labels of the pivot area, select the dropdown and click off the Other category.
Now for the columns, select ucr_redString. Then drag this field into the values as well (which will default to counting the items). Then select the dropdown for the Column Labels, and deselect (ARSON, MANSLAUGHTER, MURDER, and PART 2). Arsons and murders are too rare to do analysis on, and we are not interested in Part 2.
Now we have all the data we need to fill in the WDD calculator. Go ahead and open the SpreadSheet_WDD.xlsx
file. We now need to just fill in our pre/post stats in the treated and control areas. Since we are not doing a displacement analysis, we can just fill those cells in as 0. Lets go ahead and do the analysis for thefts of motor vehicles. It should look like this:
So we get an estimate that relative to the control area, during the treatment period thefts from motor vehicles decreased by an additional 14, but that change is not statistically significant, and we have a 95% confidence interval of [-55 to 27]
.
That is it! Easy peasy this week compared to prior. And that is intentional, I made the tool so all you needed was pre and post crime counts, and it works even if the periods are not the same (in this data there is more pre-data than there is post data).
For your homework, conduct the WDD analysis for the additional crimes of Assault, Burglary, Robbery, and Theft. Place the results in a simplified table, e.g. something like:
Crime-Type WDD-Estimate WDD-Low-CI WDD-High-CI
Theft of MV -14 -55 27
.
.
.
For 5 points extra credit, I want you to check the parallel trends assumption. Using just the 14-15 data, make a chart of the quarterly number of burglaries in both the treated and control area. Interpret the whether the graph shows the parallel trends assumption is reasonable for this data. (This will require creating your own quarterly variable, and doing the aggregations.)