Lab: Linking Non-Spatial Attribute Data to Spatial Datasets

Summary: In this lab you will acquire voter turnout data for the State of Indiana at the county level. You will also acquire county boundaries for the state. You will then link the voter turnout data to the county boundaries in the GIS to explore the spatial pattern of voter turnout across the state.

Objective: The purpose of this lab is to 1) investigate the process of linking non-spatial data to spatial data through a common field and 2) consider how using different methods to categorize data in a choropleth map leads to different patterns in the resulting map (this is important because it shows the power of the cartographer to manipulate the messages in maps). This lab includes data processing/reformatting to enable data to be imported into the GIS. Viewing data in a spatial representation often leads to insights into patterns that aren’t apparent by exploring data in a non-spatial representation.

The basic steps to this operation are:

  1. Download spatial data (from ESRI Data and Maps repository in STCs)
  2. Download attribute data
  3. Reformat attribute data (and/or spatial data if necessary)
  4. Import attribute data into GIS
  5. Link/Join attribute data to spatial data

Operationally, this involves a series of steps along these lines:

1) Develop a county boundary data file for Indiana. You can find county boundaries through the ESRI Data and Maps product; subset/select the Indiana counties to make a single dataset containing only Indiana counties.

2) Download the corresponding attribute data from this website:

(State of Indiana, Office of the Secretary of State, Election Data)

a. 2004 General Election Registration & Turnout Data
b. 2008 General Election Registration & Turnout Data

3) Examine the format of the county spatial data and the voter data. Make decisions on how you can reformat data to get the two tables to join.

4) Open the voter data in a web browser, copy the text, and paste it into Excel or WordPad so you can fix formatting issues in Excel or a text editor.

To do this step you can either paste directly into Excel or paste into a text editor and THEN import into Excel. The overall goal is to format the attribute data so it can ultimately be joined to the spatial data. ArcGIS can load Microsoft Excel files, which can then be joined as attribute data to spatial data files (e.g. shapefiles or geodatabases).

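Correct any formatting issues in the file. This step is necessary so that the data file you import into the GIS can be matched record-by-record to the spatial data file. Typical fixes include:

  1. Remove “%” signs from numeric fields (so the values import as numeric rather than text)
  2. Resolve delimiter confusion (fields split or merged at the wrong character)
  3. Remove extra records/lines
  4. Check field names
  5. Fix case mismatches (upper case vs. mixed case in the values used to match records)
  6. Modify field headings to give suitable variable names: short, with no spaces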
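
If you prefer to script the cleanup rather than edit by hand, the fixes listed above can be sketched in Python using only the standard library. This is a minimal sketch, not the required method for the lab. The sample text, its column layout, the "Total" row and the short field names (NAME, REG, VOTED, TURNOUT) are all assumptions made for illustration; the real pasted data may be laid out differently.

```python
import csv
import io

# Hypothetical sample of pasted turnout text (tab-delimited).
# The values and the column layout are made up for illustration.
RAW = (
    "County\tRegistered Voters\tVoters Voting\tTurnout %\n"
    "ADAMS\t20,000\t12,000\t60%\n"
    "\n"
    "ALLEN\t200,000\t110,000\t55%\n"
    "Statewide Total\t220,000\t122,000\t55%\n"
)

# Short field names with no spaces (hypothetical choices).
FIELDS = ["NAME", "REG", "VOTED", "TURNOUT"]


def clean_rows(raw_text):
    """Return one dict per county with numeric fields stored as numbers."""
    reader = csv.reader(io.StringIO(raw_text), delimiter="\t")
    next(reader)  # discard the original header; FIELDS replaces it
    rows = []
    for rec in reader:
        if len(rec) != len(FIELDS):
            continue  # blank or malformed line
        if "total" in rec[0].lower():
            continue  # summary record, not a county
        name = rec[0].strip().title()              # ADAMS -> Adams
        reg = int(rec[1].replace(",", ""))         # drop thousands separators
        voted = int(rec[2].replace(",", ""))
        turnout = float(rec[3].replace("%", ""))   # drop the percent sign
        rows.append(dict(zip(FIELDS, [name, reg, voted, turnout])))
    return rows


cleaned = clean_rows(RAW)

# Write the cleaned table as comma-separated text ready for import.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(cleaned)
print(out.getvalue())
```

Note that simple title-casing will not reproduce mixed-case names such as DeKalb or LaPorte, so compare the cleaned names against the county names in the boundary data before joining.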

5) Join the two attribute data files (2004, 2008) to the county boundary data to produce two separate spatial data files (one for each year of data).

Note: Joins made in ArcMap are temporary. After joining a database (or Excel) file to a spatial data file, export the spatial data file; the exported dataset will retain the joined attribute data.
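
Because a join matches records through the values in the common field, it can help to check for mismatches before joining. The following Python sketch runs outside ArcGIS and only illustrates the idea. The county names, the turnout values, and the helper names join_key and join_report are hypothetical.

```python
def join_key(name):
    """Normalize a county name for matching: upper case, letters only."""
    return "".join(ch for ch in name.upper() if ch.isalpha())


# Hypothetical names as they might appear in the boundary attribute table.
boundary_names = ["Adams", "De Kalb", "St. Joseph"]

# Hypothetical turnout table keyed by county name (values are made up).
turnout_2004 = {"ADAMS": 60.0, "DEKALB": 58.0, "ST JOSEPH": 55.0, "TOTAL": 57.0}


def join_report(spatial_names, attr_table):
    """Match attribute records to spatial records and list what is left over."""
    attr_by_key = {join_key(k): v for k, v in attr_table.items()}
    joined = {}
    unmatched_spatial = []
    for name in spatial_names:
        key = join_key(name)
        if key in attr_by_key:
            joined[name] = attr_by_key.pop(key)
        else:
            unmatched_spatial.append(name)
    # Anything still in attr_by_key has no matching county polygon.
    unmatched_attr = sorted(attr_by_key)
    return joined, unmatched_spatial, unmatched_attr


joined, no_attr, no_polygon = join_report(boundary_names, turnout_2004)
print(joined)      # counties that received a turnout value
print(no_attr)     # counties with no attribute record
print(no_polygon)  # attribute records with no county
```

An empty list of unmatched counties suggests every polygon will receive a record. Leftover attribute records usually point to summary lines or spelling differences that still need fixing in the attribute file.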

Questions/Maps:

1) (10 points) Define the following (give examples):

Nominal data
Ratio data
Ordinal data
Integer data
Floating point data

2) (10 points) What data types can you use to define variables/attributes in ArcGIS?

3) (10 points) Describe the spatial pattern of the two voter datasets (2004 and 2008 data). What clustering do you see? What associations between the data and other factors do you see (e.g. rural vs. urban counties, economic factors, location factors, distance to major urban areas, ethnicity…). Feel free to use other datasets (suggestion: ESRI Data and Maps county level data) to support your answers.

4) (10 points) The county boundary dataset you used was provided to you in a specific spatial reference system (Geographic Coordinates). If you changed the spatial reference system to UTM Zone 16 N, NAD83 and recomputed the area of the county polygons, would the values change? If so, why? If not, why not?

5) (10 points) Make two equal-interval choropleth maps of the percent voter turnout by county, one for the 2004 dataset and one for the 2008 dataset (two maps total).

6) (10 points) Make an additional single map of the 2004 dataset where you change the voter turnout class intervals such that the spatial pattern of the map differs from the equal-interval map.

7) (10 points) Make a map showing the difference in voter turnout (as a percentage) between 2004 and 2008. For this map, make a higher percentage turnout in 2008 a positive value, and a lower percentage turnout in 2008 a negative value.

8) (30 points) Create metadata for your new dataset containing the election data, export it as an XML file, and paste it at the end of your lab document.
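
For questions 5 through 7, the arithmetic behind class intervals and the turnout difference can be sketched in Python. This is an illustrative sketch under stated assumptions: the turnout values are made up, the function names are hypothetical, and the quantile rule shown is one simple variant, so the classification tools in ArcMap may place breaks differently.

```python
def equal_interval_breaks(values, n_classes):
    """Upper class limits when the data range is split into equal widths."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * i for i in range(1, n_classes + 1)]


def quantile_breaks(values, n_classes):
    """Upper class limits when each class holds about the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * i) // n_classes - 1] for i in range(1, n_classes + 1)]


# Made-up turnout percentages with one high outlier.
turnout = [50, 52, 54, 56, 58, 80]

print(equal_interval_breaks(turnout, 3))  # [60.0, 70.0, 80.0]
print(quantile_breaks(turnout, 3))        # [52, 56, 80]

# Difference in turnout: 2008 minus 2004, so an increase in 2008 is positive.
t2004 = {"Adams": 60.0, "Allen": 55.0}
t2008 = {"Adams": 63.5, "Allen": 52.0}
diff = {name: round(t2008[name] - t2004[name], 1) for name in t2004}
print(diff)  # {'Adams': 3.5, 'Allen': -3.0}
```

With these sample values, equal intervals place five of the six values in the lowest class, while the quantile rule spreads them two per class. That contrast is how the same data can produce different map patterns.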