Overview

This tutorial covers how to interpolate a set of statistical measurements from one set of geographic areas to a different set of geographic areas. The procedure we will be conducting here is proportional areal allocation, which can be considered the simplest type of dasymetric mapping procedure. Here we are going to take census tract data and interpolate it to Albany patrol zones (the zones officers are generally assigned to patrol by the Albany Police Department). The GIS data can be downloaded via Blackboard or from Dropbox at this link: https://www.dropbox.com/s/2eygf5n8wb7kl12/Week4_DasymData.zip?dl=0.

As a word of caution - this is the tutorial that students have had the hardest time replicating in prior versions of the course I have taught (although this is not uniform - there are always some students who do fine and others who have problems from week to week). This is partly because there are many steps and you need to perform each one exactly as I have specified. It can also be confusing what the point of several of the intermediate steps is. In a nutshell, we will be doing the same operation I talked about in the lecture notes with the overlapping rectangle and the circle. So if you are having trouble, take your time, redo the steps, and try to understand what the point of each intermediate output is.

Although this tutorial is harder than many others in the course, it really demonstrates the power of using GIS - to take one set of measures and transfer them to different overlapping areas. Such an analysis would not be possible without GIS. Being able to do complicated geographic manipulations like this is really what learning to be a professional GIS analyst is all about.

1 - Calculating Area Statistics

Last week we made a nice layout of Albany in the tutorial. Instead of having to redo that work, we are going to open the map you made last week. Here is what my map from last week looks like.

We are going to get rid of the contents of the map, but the layout guidelines and text will still remain. In the table of contents, make sure the source view is selected (circled in red). Then right click on a layer and select Remove.

Do this same process for any layers and tables that are in the map. Here is what my empty map now looks like. You can see in the layout view that I still have all the guidelines. (I also changed the text here to signify Week 4 in class and the date.)

Now we are going to add in the data files for this week’s tutorial. Navigate to wherever you have saved the files, and add the two polygon layers and the csv file. Make sure to save this as a new map document (File -> Save As) - don’t overwrite last week’s homework.

This is what my map now looks like in the map view (circled in red). Note - if the map looks stretched out compared to mine, you probably did not project your map in the last tutorial. You will need to have it projected to do some of the subsequent geographic data transformations. (So consult last week’s homework to project the map.) Click the layers on and off in the table of contents so you can see how census tracts and police zones do not align in any consistent manner. While they share some edges based on major thoroughfares (like Western Ave.), they do not generally line up.

Open the attribute table of the csv file CT_Veh_and_Vac. This contains census tract level data on the number of vehicles commuters use (5-year 2012 American Community Survey) and the number of vacant residences (quarterly HUD dataset for Q4 of 2015, compiled using USPS data).

Right click on the TotVeh column and then select Statistics.

This brings up a histogram of the distribution, along with some summary statistics. Take note of the sum, 30,385. When doing interpolation, we typically want the values of our resultant layer to add up to the same total as the source layer. (That is, every car estimated in the source gets assigned somewhere in the target.)
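The same check can be scripted outside ArcMap. Below is a minimal sketch using only Python's standard library. The rows are made-up stand-ins rather than the real CT_Veh_and_Vac values, and the TractID column name is an assumption (only TotVeh comes from the tutorial data).

```python
import csv
import io

# Hypothetical stand-in for CT_Veh_and_Vac.csv (made-up numbers).
# Only the TotVeh column name comes from the tutorial; TractID is assumed.
SAMPLE_CSV = """TractID,TotVeh
1,1200
2,800
3,1500
"""

def column_sum(csv_text, column):
    """Sum one numeric column of a CSV given as text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return sum(float(row[column]) for row in reader)

source_total = column_sum(SAMPLE_CSV, "TotVeh")
print(source_total)
```

Writing the source total down now gives you a number to compare against once the estimates have been moved to the target areas.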

Close the CT_Veh_and_Vac table, then join it to the Alb_CenTrac polygon layer. (If you do not remember how to do this, refer back to last week’s tutorial.) Again, JOIN THE CSV TABLE TO THE CENSUS TRACT POLYGON LAYER before proceeding.
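If it helps to think of the join in plain terms, it is a lookup by a shared key. The sketch below is only an illustration of that idea in Python, not what ArcMap runs internally. The key name TractID and all of the values are hypothetical.

```python
# Hypothetical polygon attribute rows and table rows; the key name
# "TractID" is an assumption for illustration.
tract_polygons = [
    {"TractID": "1", "Name": "Tract A"},
    {"TractID": "2", "Name": "Tract B"},
    {"TractID": "3", "Name": "Tract C"},
]
table_rows = [
    {"TractID": "1", "TotVeh": 1200},
    {"TractID": "2", "TotVeh": 800},
]

def join_by_key(polygons, table, key):
    """Attach table columns to each polygon row that shares the key value.

    Polygons with no match keep only their own columns, which mimics
    a join that keeps all polygon records.
    """
    lookup = {row[key]: row for row in table}
    joined = []
    for poly in polygons:
        match = lookup.get(poly[key], {})
        joined.append({**poly, **match})
    return joined

joined = join_by_key(tract_polygons, table_rows, "TractID")
unmatched = [row["TractID"] for row in joined if "TotVeh" not in row]
print(unmatched)  # tracts that did not pick up a vehicle count
```

A non-empty list of unmatched tracts is worth investigating before moving on, since those tracts would contribute nothing to the later estimates.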

Then open the Alb_CenTrac attribute table, and add a field named AreaCT (all fields we add in this tutorial need to be of the type Double). Last week I had you use the Field Calculator to make a transformation between separate variables. This week you are going to add in geographic information by right clicking on the field and selecting Calculate Geometry…

This will bring up a dialog where you can add geographic data into the attribute table. What you can add depends on the type of geometry. For polygons, you can add in the area or the centroid coordinates of the polygon. For polylines you can add in the length or direction, and for points you can add in the X and Y coordinates.

Here we are going to add in the area of the census tracts as square miles. Note I am using the coordinate system of the data frame - NAD 1983 StatePlane New York East FIPS 3101.
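To see why the projection matters for this step, here is a rough sketch of how a planar area can be computed from projected coordinates with the shoelace formula and then converted to square miles. This is an illustration, not necessarily how Calculate Geometry is implemented. It assumes the coordinates are in feet (check the units of your own data frame), and the rectangle is a made-up tract.

```python
FEET_PER_MILE = 5280.0

def polygon_area(vertices):
    """Planar area of a simple polygon via the shoelace formula.

    vertices: list of (x, y) pairs in order around the ring.
    The result is in squared coordinate units.
    """
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def square_feet_to_square_miles(area_sq_ft):
    """Convert an area in square feet to square miles."""
    return area_sq_ft / (FEET_PER_MILE ** 2)

# Hypothetical tract: a 2 mile by 1 mile rectangle, coordinates in feet
# (an assumption; your data frame may use different units).
tract = [(0, 0), (10560, 0), (10560, 5280), (0, 5280)]
area_ct = square_feet_to_square_miles(polygon_area(tract))
print(area_ct)
```

If the coordinates were unprojected longitude and latitude, the same arithmetic would return an area in "square degrees", which is not a meaningful unit.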

Click OK and then check the field to make sure the calculations seem reasonable.

2 - Taking the intersection of two polygon layers

Now we are going to take the intersection of the two polygon layers - census tracts and police zones. In the lecture notes this is equivalent to the part where I take the rectangles and the circles and split them up into all of the individual overlapping areas.

In the menu bar, click Geoprocessing. This brings up a selection of the most common types of geographic data manipulations people conduct. Here we are going to take the intersection of the two polygon layers, so select Intersect.

Now we just need to fill in several parts of this dialog. In the input features dropdown, select the census tracts first and then the patrol zones second. (The order does not change the resulting polygons, but the attribute table may end up with different variable names or a different order of variables if you specify the layers in a different order. So follow along as closely as possible to replicate my results.)

Next, choose to save the output feature class as a shapefile. Here I chose a location on my local computer in a Dropbox folder, but you will likely save it to your flash drive. The shapefile name I will be saving to is Inter_CT_PZ.shp - to signify the intersection of census tracts and patrol zones. Make sure to save this as a shapefile. (These save dialogs often default to file geodatabases. We won’t be working with those in this class, so always save to a shapefile.)

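Once you have selected the same options as above, click OK. This will add the new layer to the map. You can see that its polygons are all the small intersections of the census tract layer and the patrol zone layer.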
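To make the idea of an intersection concrete, here is a toy sketch in Python using axis-aligned rectangles in place of real tract and zone polygons. Everything in it (the IDs, the coordinates, the rectangle simplification) is hypothetical. Real polygon overlay is more involved, but the bookkeeping is the same: every overlapping pair produces one piece that remembers both of its parents.

```python
def rect_intersection(a, b):
    """Overlap of two axis-aligned rectangles (xmin, ymin, xmax, ymax).

    Returns the overlapping rectangle, or None if they do not overlap
    with positive area.
    """
    xmin, ymin = max(a[0], b[0]), max(a[1], b[1])
    xmax, ymax = min(a[2], b[2]), min(a[3], b[3])
    if xmin >= xmax or ymin >= ymax:
        return None
    return (xmin, ymin, xmax, ymax)

# Hypothetical layers, coordinates in miles: two tracts and two zones.
tracts = {"CT1": (0, 0, 2, 1), "CT2": (2, 0, 4, 1)}
zones = {"PZ1": (0, 0, 1, 1), "PZ2": (1, 0, 4, 1)}

# Every (tract, zone) pair that overlaps becomes one output piece,
# carrying the IDs of both parents.
pieces = []
for ct_id, ct in tracts.items():
    for pz_id, pz in zones.items():
        overlap = rect_intersection(ct, pz)
        if overlap is not None:
            pieces.append({"CT": ct_id, "PZ": pz_id, "rect": overlap})

for piece in pieces:
    print(piece)
```

Here tract CT1 is split between the two zones, while CT2 falls entirely inside PZ2, so the two tracts produce three pieces.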

3 - Proportional Allocation to the Intersection Area

Now open up the attribute table of the intersection layer, and add two new fields, AreaInter and PropInter (again both as double types).

For the first field, calculate the area in square miles of the new intersection polygons using Calculate Geometry. (Same process as earlier in the tutorial.) For the second field, I want you to calculate what proportion of the original census tract’s area each intersection polygon covers.

In my shapefile, when doing the intersection ArcGIS renamed my census tract fields as Alb_CenT_1, Alb_CenT_2, and so on. The original area of the census tract ends up being renamed Alb_CenT_1. The calculation of the proportion with these variables is shown in the screenshot below. (You can also see my other variable names.) Note, if in the intersection dialog you inputted the layers in a different order than me, your variables may be named differently (and be in a different order). You can figure out what the fields refer to by examining the attribute tables of the original polygon layers.

In my example you should then calculate PropInter as [AreaInter]/[Alb_CenT_1]. This is the proportion of the larger census tract’s area that the smaller intersected piece covers, so these values should never be greater than 1. (A value of exactly 1 means the whole tract falls inside a single patrol zone.)
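Here is a small sketch of the same field calculation in Python, with two sanity checks you can also eyeball in the attribute table. The field names AreaInter, Alb_CenT_1 and PropInter follow my table. The CT and PZ keys and all of the numbers are made up for illustration.

```python
# Hypothetical intersection rows. AreaInter and Alb_CenT_1 follow the
# field names in this tutorial; the numbers and the CT/PZ keys are made up.
rows = [
    {"CT": "CT1", "PZ": "PZ1", "AreaInter": 1.0, "Alb_CenT_1": 2.0},
    {"CT": "CT1", "PZ": "PZ2", "AreaInter": 1.0, "Alb_CenT_1": 2.0},
    {"CT": "CT2", "PZ": "PZ2", "AreaInter": 2.0, "Alb_CenT_1": 2.0},
]

for row in rows:
    row["PropInter"] = row["AreaInter"] / row["Alb_CenT_1"]

# Sanity checks: no proportion above 1, and the pieces of each tract
# should not add up to more than the whole tract.
per_tract = {}
for row in rows:
    assert 0.0 < row["PropInter"] <= 1.0
    per_tract[row["CT"]] = per_tract.get(row["CT"], 0.0) + row["PropInter"]

print(per_tract)
```

If the proportions for a tract sum to noticeably less than 1, part of that tract lies outside every patrol zone, and that share of its count will not be allocated anywhere.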

Next we will calculate our estimate of the total vehicles in each intersection polygon by multiplying the proportion of the area by the original count for the larger census tract. So if the census tract has an area of 2 square miles and 1,000 cars, and the intersection area is 1 square mile (half the area), the intersection will be estimated to have 500 cars (half the number of cars).

So add another field to the intersection attribute table (here I name it PropVeh). Then calculate this field as the number of vehicles (in the original census tract) multiplied by the proportion of the intersection area (what we just calculated). In the renaming process ArcGIS ended up renaming the vehicle count field (from the original csv table) to CT_Veh_a_1. So we will be calculating [PropInter]*[CT_Veh_a_1] in the Field Calculator.
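The arithmetic is simple enough to check by hand. As a minimal sketch, here is the worked example from above written as a Python function. The function name allocate is my own, and the calculation rests on the assumption behind proportional areal allocation, namely that the count is spread evenly over the source area.

```python
def allocate(count, area_piece, area_source):
    """Share of a source count assigned to one intersection piece,
    assuming the count is spread evenly over the source area."""
    return count * (area_piece / area_source)

# The worked example from the text: a 2 square mile tract with 1,000 cars
# and a 1 square mile intersection piece.
prop_veh = allocate(1000, 1.0, 2.0)
print(prop_veh)  # 500.0
```

Most real pieces will not divide this neatly, so expect non-integer estimates in the PropVeh field.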

Here is a final screenshot of my attribute table. Make sure your results match mine. If they do not - retrace your steps to see if you missed any particular part along the way.

4 - Dissolving the estimates back to Patrol Zones

Now we have estimates of vehicles at the intersection areas - but that is not what we wanted. We wanted estimates at the patrol zone level. To get these, we will be aggregating the results back up to the patrol zone level using a geographic operation called Dissolve. This is the part in the lecture notes where I take the individual estimates from the circle and sum them to get the estimate for the entire circle.

In the menu bar select Geoprocessing and then Dissolve.

The help on the right hand side has a diagram showing the geographic operation (ditto for the other operations in the Geoprocessing menu). First, in the Input Features dropdown, select the intersection layer. Then in the Dissolve_Field(s) list select the ID field. This corresponds to the initial patrol zone ID in the patrol zone polygon layer. It is the field that tells ArcGIS which polygons to dissolve into one: all polygons that share the same value for this field are merged.

Next in the Statistics Field(s) dropdown select the PropVeh variable that you made.

This will produce a red X signifying an error message. Here it is just because we need to specify how exactly to aggregate this variable in the resulting dissolve process. We want to sum up the vehicle estimates, so choose SUM as the statistic type.

Now make sure before clicking OK that your options (highlighted in red) are the same as mine in the screenshot below. Also make sure the output feature class is set to save a shapefile on your flash drive (here I name it PZ_wVeh.shp).

Once you click OK, ArcMap will add your new layer into the map. Open up the attribute table for PZ_wVeh, right click on the SUM_PropVe field and select Statistics. You can see that the sum of the vehicle estimates equals 30,385 - the same as in the original census tract layer. While individual patrol zones have non-integer estimates, the citywide total is unchanged.
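On the attribute side, Dissolve with a SUM statistic amounts to a group-by-and-sum. The sketch below shows that idea in Python with made-up numbers. It only handles the attributes and does not merge any geometry. ID and PropVeh are the field names from this tutorial, while dissolve_sum is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical intersection rows; ID is the patrol zone ID and PropVeh
# the allocated vehicle estimate, as in the tutorial. Numbers are made up.
pieces = [
    {"ID": 1, "PropVeh": 600.0},
    {"ID": 2, "PropVeh": 600.0},
    {"ID": 2, "PropVeh": 800.5},
    {"ID": 3, "PropVeh": 1499.5},
]

def dissolve_sum(rows, key, field):
    """Group rows by key and sum field, like Dissolve with a SUM statistic
    (attributes only; the geometry merge is not modelled here)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[field]
    return dict(totals)

zone_totals = dissolve_sum(pieces, "ID", "PropVeh")
print(zone_totals)

# The citywide total should survive the aggregation.
before = sum(row["PropVeh"] for row in pieces)
after = sum(zone_totals.values())
print(before, after)
```

Comparing the total before and after is the same check as looking at the Statistics window: regrouping the pieces changes where the vehicles are counted, not how many there are.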

5 - Some more binning and styling for choropleth maps

Last week I introduced choropleth maps. I am going to go over more specifics about making bins, styling the polygons, and then adding labels into the map in this section.

In your new PZ_wVeh layer, right click on the layer in the table of contents and select Properties. Navigate to the Symbology tab, and in the Value dropdown select SUM_PropVe. For the color ramp select the greyscale ramp, and for classification select 4 classes. I have these options highlighted in the screenshot below.

Next click the Classify button (to the right of where you set the number of classes). This brings up a dialog that has a histogram as well as other statistics for the field. In the top dropdown in the Classification section, select the equal interval option. Your bins should recalculate as below.

Since these equal intervals are so close to 1,000 each, we are going to adjust the break points slightly. In the Break Values section on the right hand side, select the lowest bin and change it to 1,000. (You can also drag the blue lines in the histogram, but it is very difficult to get an exact value.) Then change the 2nd and 3rd bin to 2,000 and 3,000. Leave the 4th bin at 4,059. (In general, when changing the bins arbitrarily like this, it is easier to start from the low numbers and work your way up.)

We do this to make the bins simpler to read and remember, but leave the bottom and top numbers so people know the range of the estimates. In the end your classification screen should look like below.
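For those curious how the bins work, the sketch below computes equal interval breaks and then assigns values to the rounded breaks. It assumes each break value is the upper bound of its class. The estimates are hypothetical (only the maximum of 4,059 echoes the value above), and the function names are my own.

```python
import bisect

def equal_interval_breaks(values, n_classes):
    """Upper bound of each class when the range is split into equal widths."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    breaks = [lo + width * i for i in range(1, n_classes)]
    return breaks + [hi]

def classify(value, breaks):
    """Index of the first class whose upper bound is >= value."""
    return bisect.bisect_left(breaks, value)

# Hypothetical zone estimates; only the maximum of 4,059 echoes the text.
estimates = [150.0, 900.0, 1800.0, 2500.0, 3100.0, 4059.0]

auto_breaks = equal_interval_breaks(estimates, 4)
manual_breaks = [1000, 2000, 3000, max(estimates)]

print(auto_breaks)
print([classify(v, manual_breaks) for v in estimates])
```

Note that nudging a break can move a zone into a different class, so it is worth glancing at which zones sit close to a break before settling on rounded values.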

Now click OK, and you will be back in the Symbology tab. When using the greyscale color ramp, the default grey border ends up blending into the areas. That is, some areas look like they don’t have an outline at all, because the fill is very nearly the same grey shade as the outline. To fix this we will change the outlines for all the areas to be white. Left click on the Symbol header (circled in red below) and select Properties for all Symbols…

Now for the outline color select white, and change the width to 1.

Click OK, and then in the layer properties dialog navigate to the Labels tab. Turn on the labels for the layer, and then in the label field select the ID field. Note this is also where you can change the size, font, bold/italics, etc. of the labels. By default Arc places labels around the centroid of the polygon.

Now update the title and legend in the map to match the new items. Here is a screenshot of my final map in the layout view. You are now done. Make sure to save your file before you go on to the homework.

Homework

For your homework, go back to the intersection layer and calculate the proportional estimate of vacant houses. Then aggregate those estimates up to the patrol zones using a dissolve operation, exactly as you did in the tutorial for vehicles.

Then make a choropleth map of the vacant addresses per patrol zone. Choose your own number of bins and the bin cutpoints. Export the map to a PDF file and turn it in. Make sure to have your name along with all the other map essentials (legend, title, scale bar, etc.).