This week you do not have a lesson, you have a test. But you still have an ArcGIS tutorial. In this tutorial I will give an example of downloading data and creating a geocoding address locator using Dallas data. For your homework I will then have you geocode some address points in Dallas. This will be helpful in preparing you for your final projects as well: you will likely need to download some GIS data for your project. You also may need to geocode address data for your project, and you will learn how to do that in this week's tutorial as well. You will be downloading all of the data for this tutorial yourself.
Downloading data does not seem too hard, but there is an extra layer of complexity with GIS data that often trips people up: projections. Many police departments now disseminate crime data. Sometimes it is projected into some local reference system, and sometimes latitude and longitude coordinates are given. When the data are in projected XY coordinates, you need to figure out that projection before you can properly map the data.
For this example, I want you to go to the Dallas open data site, and download two files: the 2015 and the 2016 response to resistance (use of force) data.
To download the data to your local computer, at the top of the screen hit the Download button.
Then in the dropdown select the option CSV for Excel, and then save the file somewhere on your local computer.
Do the same for both the 2015 and the 2016 data. These two files contain instances in which officers reported using force, such as using a taser during an arrest. Now open the 2016 data in Excel by double clicking the csv file, and scroll to the rightmost columns, labeled X and Y.
These two columns contain the XY coordinates we will be using to map out the use of force incidents. Note that these XY values are around 2.5 million and 7 million.
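A quick way to tell whether a pair of coordinates could be latitude and longitude at all is to check their range. The following is a minimal sketch in Python, not part of the ArcMap workflow; the function name and the sample values are made up for illustration, with the first sample only mimicking the magnitudes seen in the X and Y columns.

```python
def looks_like_lat_lon(x, y):
    """Return True if (x, y) could be a longitude/latitude pair in degrees."""
    return -180.0 <= x <= 180.0 and -90.0 <= y <= 90.0


# Hypothetical sample values: the first pair mimics the magnitudes in the
# X and Y columns, the second is a longitude/latitude pair near Dallas.
samples = [
    (2_500_000.0, 7_000_000.0),
    (-96.8, 32.8),
]

for x, y in samples:
    kind = "latitude/longitude" if looks_like_lat_lon(x, y) else "projected"
    print(f"X={x}, Y={y}: probably {kind}")
```

Values in the millions cannot be degrees, so they must be in some projected coordinate system (here measured in feet or meters), and the projection still has to be identified separately.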
Next we will download some data from the Dallas GIS site. Pro-tip: simply googling “[Your city of interest] GIS” or “[Your city] shapefile” is the simplest way to find these websites. Many cities have a special GIS department that you can download data from. From that page download the Streets Shapefile and the City of Dallas City Limits.
After you download those files, unzip them, and then open up a new ArcMap document. At first only add in the streets shapefile.
Double click on the layer and rename it Street Centerlines.
For this next part, we will be adding in a table of XY coordinates and turning it into a set of points on the map. So first add in the CSV file for the 2016 response to resistance data. Then right click the table and select Display XY Data.
ArcMap guesses the correct fields for the X and Y coordinates, and autofills the X field and Y field. If you use more exotic names for those fields though, you will need to use the dropdowns to select the correct field.
Hit OK, and then select OK for the warning about the “Table does not have Object-ID field”. It is not important for now. Now we see that our points are imported, but they are nowhere to be found on the map!
Right click on the point layer in the table of contents, and then select “Zoom to Layer”. You should see a point cloud that approximately shows the shape of Dallas.
So why do these points not line up with the streets we previously imported? It is because the use of force points are projected, but the street centerline file is not projected; it is just in latitude and longitude coordinates. Go ahead and right click on the point layer and remove it.
Then use “Zoom to” to go back to where the street centerlines are. Next, import the Dallas City outline layer that we downloaded from the GIS site.
Now we are going to display the XY use of force data again, but this time we are going to assign the points the right projection. Right click again on the 2016 resistance table, and select Display XY Data. Instead of hitting OK, we are going to hit the Edit button in the Coordinate System of Input Coordinates section.
Now from this dialog, we are going to select the little globe that has the grid lines (graticules) on it, and then hit the Import option.
From here, navigate to the CityLimit.shp file, select it, and hit Add.
You will now see in the favorites section an option for NAD_1983_StatePlane_Texas_North_Central_FIPS_4202_Feet. Select that option and then hit OK.
The Display XY Data dialog should now look like this, and you can go ahead and hit OK.
Hit OK again for the same warning as before, and now the points should be lined up on the map.
In the future, if you have projected data and you don’t know the projection, a simple way to guess it is to find other GIS data for the city and then see if it lines up. If you have problems with this for your final projects also feel free to send me an email, and I will see if I can help.
As a note, if your data are in latitude and longitude, for this part you will likely want to use WGS 1984. The tutorial for lesson 10 has an example of where to find that selection, and you can skip ahead and look at it. In the folder hierarchy for the projections it will be Geographic Coordinate Systems -> World -> WGS 1984. Note X would be longitude and Y would be latitude.
If you have not done so, make sure to save your map and your progress so far for the tutorial.
Many times the data we have are not already assigned XY coordinates, but have some other spatial information, such as an address of a crime location. We then need to assign those addresses an XY coordinate to be able to conduct spatial analysis and make maps. Here I am going to show you how to make an address locator from a street centerline file. Note I already discussed geocoding and street centerline files in your very first lesson, so review that if you are confused by what the street centerline file represents.
First in ArcMap, right click on the street centerlines file and select open attribute table. We will be using this information to create our address locator.
Then in ArcMap open the Catalog tab on the right hand side of the map (highlighted in red below). Then navigate to where you are saving the files for this week's lesson. (You can see in mine I have to go down the tree quite a ways.) Note if you do not have the Catalog tab on the right hand side of your map, in the menu bar at the top go to Windows and then select Catalog to make it reappear.
Next right click on the folder that you were saving your Dallas data in, go down to New, and then select Address Locator.
You will get a box that pops up and looks like below.
In the first line, for the address locator style, hit the folder icon towards the right, and in that new pop up select US Address Dual Ranges.
Now for the next blank line, Reference Data, click the dropdown arrow and select the “Street Centerlines” file. Once that is done, you will see a bit of data is autopopulated in the fields below it. Take a bit of time to go back to the attributes table for the street centerline file. Can you figure out what the l_f_add and the r_f_add fields represent? How about the prefix, type and suffix fields? All of these components make up the parts of the street centerline file.
In this case ArcGIS happens to select all of the fields for the address locator correctly, but oftentimes you need to use the dropdowns to place different fields in different areas. Also note that if your data do not have areas (like cities or ZIP codes), there is a known bug in ArcGIS that prevents geocoding places without an area field. (So you need to update the default as shown in that tech report.)
The last thing you need to do is to select where to save the address locator and what to name it on the Output Address Locator line. Here I save it in the same folder with the rest of the Dallas data, and name it “DallasCenterlineAddressLocator”.
Once that is done, click OK, and then give ArcGIS a minute to finish making the address locator.
Next we are going to geocode the 2015 use of force data. Double click the 2015 use of force CSV file, and go to the rightmost columns. The spreadsheet has a Latitude and a Longitude column, but they are empty. (This is an artifact of the open data site that the data were uploaded to. Under the hood it geocodes the data the same way we are doing.) Some of the values end up being geocoded in the final GeoLocation column, but for instructional purposes we are going to geocode the data ourselves.
So first, after the last column of data, add a new field named City. In the first cell of this column, type “DALLAS”, and then click the little button on the lower right hand side of the cell (highlighted in red below).
This will fill in DALLAS in the City field for the rest of the data. Then hit Ctrl + S to save the CSV file and this new field. Now close the CSV file and import this dataset into ArcMap. Next, right click the 2015 table and select “Geocode Addresses”.
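As an aside, the City column from the spreadsheet step above could also be added with a short script instead of by hand in Excel. The sketch below uses only Python's standard `csv` module. The column names `ARC_Street`, `Latitude` and `Longitude` come from the 2015 file, but the two sample rows are a made-up stand-in, and the function name is hypothetical.

```python
import csv
import io


def add_city_column(src, dst, city="DALLAS"):
    """Copy CSV rows from src to dst, appending a City column to each row."""
    reader = csv.DictReader(src)
    fieldnames = list(reader.fieldnames) + ["City"]
    writer = csv.DictWriter(dst, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row["City"] = city
        writer.writerow(row)


# Hypothetical two-row stand-in for the 2015 use of force CSV.
raw = "ARC_Street,Latitude,Longitude\n3201 OLD MILL RD,,\n1700 N LAMAR ST,,\n"
out = io.StringIO()
add_city_column(io.StringIO(raw), out)
print(out.getvalue())
```

For a real file you would pass file objects opened with `open(path, newline="")` in place of the `io.StringIO` objects, and write to a new file rather than overwriting the download.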
The address locator you just made should be among the potential selections (if not, hit Add and then navigate to where you saved it). Select it, and then hit OK.
The dialog should then look like below.
For the “Street or Intersection” field, click the dropdown and select the ARC_Street field. Then in the output section choose where to save the file and what to name it. Again I choose the same location we have been downloading data to for this tutorial, and I name the file “Geocoding_2015_UOF.shp”. Then click OK.
You will get a little progress bar that pops up, and it tells you how many addresses were matched and a few that were not assigned. Here we only have a few unassigned, 23, that we will review.
Hit the Rematch button, and you get a handy-dandy little interface that allows you to rematch addresses. In the Show Results bar at the top, select Unmatched Addresses. You will then see the unmatched addresses, the first being in my example “201 Merribrook Trl.”.
This example ends up being a pretty good cross section of the reasons addresses do not geocode. It happens that the first two in my screen shot, 210 Merribrook and 3851 Midway are outside the city of Dallas. The next examples, S STEMMONS FWY, are not specific enough to geocode at the address level (it is a long highway). Finally on row 8 we have an address that we can potentially correct, 3200 Old Mill Rd.
You see in the Candidate box there is a suggested match at 3201 Old Mill Rd. Select that option and then hit the “Zoom to Candidates” button. You will then see a yellow dot on the map that is a potential match. Go ahead and click the Match button then, and you will see the Unmatched counter in the right hand corner go from 23 to 22.
That address did not geocode because the dataset does not have a 3200 Old Mill Rd, but does have a 3201. If you are OK for your application with potentially having the address on the wrong side of the street, then I think assigning it to 3201 is perfectly acceptable. One should keep in mind that with crime data these addresses are only approximate locations anyway, e.g. when a crime report is at an intersection, it often occurred somewhere near that intersection.
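To see why a house number on one side of the street can match while its neighbor across the street does not, here is a simplified sketch of matching against dual address ranges. This is an illustration only, not ArcGIS's actual algorithm. The field names `l_f_add` and `r_f_add` appear in the street centerline file; `l_t_add` and `r_t_add` are assumed names for the matching "to" fields, and the segment values are invented, not the real ranges for Old Mill Rd.

```python
def match_side(house_number, segment):
    """Return (side, fraction along the segment), or None if no side matches.

    A side matches when the house number lies within that side's from/to
    range and has the same odd/even parity as the start of the range.
    """
    for side in ("l", "r"):
        start = segment[f"{side}_f_add"]
        end = segment[f"{side}_t_add"]
        low, high = min(start, end), max(start, end)
        same_parity = house_number % 2 == start % 2
        if low <= house_number <= high and same_parity:
            # Interpolate the position between the two ends of the range.
            fraction = 0.5 if start == end else (house_number - start) / (end - start)
            return side, fraction
    return None


# Hypothetical segment: odd numbers on the left, no addresses on the right.
segment = {"l_f_add": 3201, "l_t_add": 3299, "r_f_add": 0, "r_t_add": 0}

print(match_side(3201, segment))  # matches the left side
print(match_side(3200, segment))  # no side has this number in range
```

With ranges like these, 3200 has nowhere to land because the even side of the street has no address range at all, which is the kind of situation where accepting the nearest candidate across the street is a reasonable choice.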
The next address, 1700' N LAMAR ST., is another good example of when an address will fail to geocode in ArcMap. ArcMap uses a fuzzy matching algorithm to match addresses, so say “100 Mian St.” will likely be matched to “100 Main St.”, but sometimes this algorithm does not spot obvious errors like the one here. If you go ahead and delete the grave accent (circled in red), and then hit enter, you will get a 1700 N Lamar St pop up in the potential candidates. So go ahead and match that.
Other typos that will sometimes cause ArcGIS to fail are apartment numbers at the end of the string and repeated street types. For example, the next record, 9301 Forest Lane Ln., has the street type listed twice.
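If you have many addresses with these kinds of problems, it can be worth cleaning the strings before geocoding. The following is a rough sketch using Python's `re` module, under the assumption that the problems look like the ones above (a stray accent or apostrophe, a trailing apartment number, a doubled street type). The function name and the patterns are illustrative and only handle these few cases, not address cleaning in general.

```python
import re


def clean_address(address):
    """Apply a few simple fixes to a raw street address string."""
    # Remove stray apostrophes and grave accents.
    cleaned = address.replace("'", "").replace("`", "")
    # Drop a trailing apartment or unit designator such as "APT 12" or "#3B".
    cleaned = re.sub(r"\s+(?:(?:APT|UNIT)\s+|#\s*)\w+$", "", cleaned,
                     flags=re.IGNORECASE)
    # Collapse a spelled-out street type followed by its own abbreviation.
    cleaned = re.sub(r"\b(LANE)\s+LN\b\.?", r"\1", cleaned,
                     flags=re.IGNORECASE)
    # Normalize any repeated whitespace.
    return re.sub(r"\s+", " ", cleaned).strip()


for raw in ["1700' N LAMAR ST.", "9301 Forest Lane Ln.", "100 MAIN ST APT 12"]:
    print(repr(raw), "->", repr(clean_address(raw)))
```

The third address is a made-up example of a trailing apartment number. Automated fixes like these can also introduce errors, so it is still worth reviewing whatever remains unmatched by hand in the Rematch dialog.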
If you can’t get an address candidate to pop up, say because the address is an on-ramp to a highway, you can still manually assign such a point. You know where this is, but it likely does not have an easy address in the reference data used by the address locator. To show this in an example, go down to the address 10843 N Central SB Expwy, select the potential match, and click Zoom to Candidates. That candidate is shown in the red circle below.
Pretend we have additional information, such as an incident narrative, that lets us know this event occurred farther north, at the intersection of all of the highways. Go ahead and click the Pick Address from Map button at the bottom of the Rematch dialog. Then with your cursor on the map, right click in the middle of all those intersections (see red arrow), and select Pick Address.
You should now see a dot where there was none previously, and your Unmatched counter should go down again.
For your homework, I want you to go through the 10 remaining addresses that did not geocode (see the list below). Make a table in your homework with three columns: the original address, the matched address (or the coordinates if you manually placed it), and notes. If you were unable to match the address, put a textual description in the notes column of why you were unable to match it, such as “outside the city” or “cannot find any such address”. As a pro-tip, I often use Google to help with this process, since it has more liberal search algorithms to pull up an address.
For reference, those remaining addresses are:
Create this table using a word processor, and make sure to have your name and date at the top. Export this to PDF, and turn in for your homework this week. Each address will be given one point in the homework. So if you say in the notes it cannot be assigned, but it clearly can by fixing a typo, you will get a point off. Or if you do not assign a matched address, but do not place anything in the notes column you will get a point off.
As an extra, I have a quick tutorial on how to set up a geocoding tool that uses an online service. Here I will show how to make it with Dallas’s public geocoding server. (Note for those working with New York data, New York has a public one for the entire state here.) In general, these will be exposed via a city's GIS webpage, but may take a little sleuthing to find.
To add an online GIS server, click the Add Data button (the Yellow diamond with a black cross). In the “Look In” dropdown, select GIS Servers.
Then select the Add ArcGIS Server option (circled below).
In the dialog that pops up, select the “Use GIS services” button, and click Next.
Then you will need to put in a Server URL. The link ends up being http://gis.dallascityhall.com/wwwgis/rest/services/, so copy and paste that URL into the box. You do not need to worry about putting in a user name and password.
Then click finish. On the GIS Servers tab you will then see a server for “wwwgis on gis.dallascityhall.com (user)”. (I also have another for New York, and one for Dallas in this screenshot since I did it twice for the tutorial, but those should not be on your screen.)
Go ahead and double click that Dallas CityHall server, and then select the “Lib_public” folder.
In that folder, you then see a little computer icon named Libraries. Go ahead and click that, and Add it into the map. Then in the table of contents drag the Libraries layer to the top (it imports below the outline of the city layer for me).
And now you know where the public libraries are in Dallas! This data is actually not saved locally, but is served up over the internet. There is all sorts of online data you can import this way, and later on in the course I will show how to import nice basemaps (like Bing maps) to style roads. Another nice use of this is for satellite imagery as well.
Now go back to the 2015 use of force table, and then right click the table to geocode addresses. Note to get to the table you may need to select the “List by Source” icon at the top of the Table of Contents section (circled in red below).
In the “Choose an Address Locator to use” selector box, this time click the Add button.
Now in the GIS server go to the “ToolServices” folder, instead of the “Lib_public” folder, by hitting the Up One Level icon and then selecting the ToolServices folder.
In this folder then select the DallasStreetsLocator, and then click Add.
Now select this geocoding service to use and hit OK.
Then enter in the data to geocode the use of force incidents, the same as for the local address locator you used.
For me, this ends up producing results very close to those from the local address locator we made earlier in this tutorial.
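Geocoding services published on an ArcGIS server can generally also be queried over the web through the ArcGIS REST API's `findAddressCandidates` operation. The sketch below only builds the request URL and does not contact the server. It assumes the locator is exposed at the usual `GeocodeServer` path under the ToolServices folder, and that its input fields are named `Street` and `City`. Both are assumptions, since the field names depend on how the locator was built and the server may no longer be online.

```python
from urllib.parse import urlencode

# Server URL from the tutorial; the locator path below it is an assumption.
BASE = "http://gis.dallascityhall.com/wwwgis/rest/services"
LOCATOR = "ToolServices/DallasStreetsLocator/GeocodeServer"


def geocode_url(street, city="DALLAS"):
    """Build a findAddressCandidates request URL for one address."""
    # "Street" and "City" are assumed input field names for this locator.
    params = {"Street": street, "City": city, "f": "json"}
    return f"{BASE}/{LOCATOR}/findAddressCandidates?{urlencode(params)}"


print(geocode_url("1700 N LAMAR ST"))
```

Opening the locator's own page on the server in a browser should list the input fields it actually expects, which is the place to check before relying on a URL like this.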