Mapping Geographically Named Data
Not all geographic data is in a GIS-ready format. Commonly it is in the form of a table of values assigned to named geographic features such as states or street addresses.
Since this tutorial will be using specific maps and data, the first step is to make your own copy of the tutorial data.
- In the Windows Explorer, navigate to the network drive K:
(aka \\Software\Winsoft), open the folder Maps,
and then open the folder Introduction to GIS.
- Drag the folder mappingnames and its contents to either:
- your network drive U:, e.g. into the folder My Documents; or
- the local hard drive C:, e.g. onto your Desktop.
The folder mappingnames contains the following files:
states.shp
stateinfo.xls
masstracts2000.shp
Holyoke Schools.xls
Since some but not all of the ArcGIS components have trouble handling names with spaces or special symbols, do not rename the folders or files.
- Start up the ArcMap software (see Constructing and Sharing Maps for details).
- In the toolbar Standard,
click on the button Add
Data.
- In the dialog Add Data, navigate into the folder mappingnames; if necessary, make a new connection to it first (see Constructing and Sharing Maps for details).
- In the folder mappingnames, click on the file states.shp.
- Click on the button Add.
ArcMap will now display the map of the United States that we saw before:
When data is associated with geographically named regions, we can join it to an existing layer.
Geographic data ideally comes in the form of a layer, which is a set of geographic features with attributes assigned to them. We've already seen examples of state polygons and city points, whose attribute tables associate these features with information such as area, population, etc.
Layers also include information about the geography of the features, so they can be immediately displayed as a map, as above. Recall that the details of the geography are hidden inside the Shape field:
Very often, however, you may obtain or create a table of data whose only geographic connections are a set of names, e.g. "Massachusetts", "Alabama", etc. In other words, it doesn't have a Shape field:
Fortunately, if you have access to a map layer that defines the same geographic regions with the same names, you can join your table to that layer. Joining essentially extends its attribute table with new fields (columns), which you can then use to symbolize it, etc.
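Conceptually, a join looks up each feature's name in the table and appends the matching fields to that feature's record. The following is a minimal Python sketch of that idea only, not ArcGIS's implementation; the function name `join_by_name`, the field `Value`, and the sample rows are illustrative assumptions.

```python
def join_by_name(layer_rows, table_rows, layer_key, table_key):
    """Return copies of layer_rows extended with the fields of the matching table row."""
    # Index the table by its join field so each lookup is a single step.
    lookup = {row[table_key]: row for row in table_rows}
    table_fields = list(table_rows[0].keys()) if table_rows else []

    joined = []
    for feature in layer_rows:
        match = lookup.get(feature[layer_key])
        extended = dict(feature)  # leave the original layer record untouched
        for field in table_fields:
            # Features without a matching table row get empty values.
            extended[field] = match[field] if match else None
        joined.append(extended)
    return joined


# Hypothetical sample data: two features, only one of which appears in the table.
layer = [
    {"STATE_NAME": "Massachusetts", "Shape": "Polygon"},
    {"STATE_NAME": "Alabama", "Shape": "Polygon"},
]
table = [{"State": "Massachusetts", "Value": 1}]

joined = join_by_name(layer, table, "STATE_NAME", "State")
for record in joined:
    print(record)
```

Every feature in the layer is kept, and the new fields are simply empty where the table has no row with the same name.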
- In the Windows Explorer, navigate into the folder mappingnames.
- Locate the file stateinfo.xls, which is a Microsoft Excel workbook, and double-click on it to open it in Excel.
- Examine the workbook's structure:
- The simple rectangular arrangement of the data;
- The initial row containing the attribute names;
- Subsequent rows each containing the data of one feature (called a record);
- Each column containing a single attribute as it varies from feature to feature (called a field).
- The worksheet names, e.g.
Dates , in the tabs at the bottom of the window. You can have many worksheets in the same workbook.
- Close the workbook.
Besides Microsoft Excel XLS files, ArcGIS can
also read tables in several other formats:
- CSV: Comma-separated values files are simple text
files with each field in the table separated by commas (if
the data includes commas, the field value
must be enclosed in double quotes).
- TAB: Tab-separated values files are also simple
text files, with each field
in the table separated by tabs.
- DBF: dBASE files use an old database format that
is still used for Shapefile
attribute tables.
- MDB: Microsoft Access files use a newer database
format.
All of these formats can also be read by Microsoft
Excel (but it will only open TAB files directly
if the extension is changed to TXT).
If you have data in these formats and are going
to be making any changes with Excel,
it's generally better to first save it as XLS;
otherwise Excel will repeatedly complain about
potential data loss when you save.
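The quoting rule for CSV files can be seen with Python's standard `csv` module, which adds the double quotes automatically when a value contains a comma. This is only a sketch; the school name and address are made-up sample values.

```python
import csv
import io

rows = [
    ["Name", "Address"],
    ["Example School", "1 Main Street, Holyoke"],  # hypothetical value containing a comma
]

# Write the rows as CSV text; the writer quotes only the fields that need it.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)

# Reading the text back recovers the original fields, comma included.
parsed = list(csv.reader(io.StringIO(csv_text)))
```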
Warning: To successfully
make use of tables, their file names should start
with letters and afterwards include only letters,
numbers, and underscores.
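If you have many tables, this naming rule can be checked in bulk. The Python sketch below assumes the rule applies to the part of the file name before the extension and that "letters" means unaccented A to Z; the function name and sample file names are hypothetical.

```python
import re

# A letter first, then any number of letters, digits, or underscores.
SAFE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_safe_table_name(filename):
    """Check the name (without its extension) against the rule in the warning above."""
    stem = filename.rsplit(".", 1)[0]
    return SAFE_NAME.fullmatch(stem) is not None


for name in ["stateinfo.xls", "my table.xls", "2000tracts.csv"]:
    print(name, is_safe_table_name(name))
```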
As a simple example of joining, we'll use the data shown above:
- the map layer states.shp and its field STATE_NAME, and
- the table stateinfo and its field State.
Any map layer can have a table joined to it, as long as they have a matching attribute:
- In ArcMap,
in the toolbar Standard, click on the
button Add Data.
- In the dialog Add Data, navigate into the folder with the table to be joined, e.g. mappingnames.
- Double-click on the table to be joined, e.g. the Excel workbook stateinfo.xls.
- In
ArcGIS dialogs, Excel workbooks are workspaces, meaning
you can open them
like folders and see an overview of
their contents.
In particular, they
display their worksheets and named
regions (discussed below) as separate "files".
Worksheets all have a $ at the end
of their names, as in the image at the
right.
If the join table is inside a workbook, double-click on it to see its contents, and then add the desired table to your map document by double-clicking on it, e.g. Dates$.
- After adding a table, the Table of Contents will switch to the Source view, because a table by itself cannot be displayed on the map and therefore won't show up in the Display view. You will probably want to switch back to the simpler Display view by clicking on the tab Display.
- In the Table of Contents, right-click on the layer to be joined, e.g. states.shp.
- The layer's contextual menu will now appear; select the menu item Joins and Relates, followed by the menu item Join....
- In the dialog Join Data, in the menu What do you want to join to this layer?, make sure that Join attributes from a table is selected.
- In the menu 1. Choose the field in this layer that the join will be based on:, select the field STATE_NAME.
- In the menu 2. Choose the table to join to this layer, or load the table from disk:,
select the table to be joined, e.g. Dates$.
- In the menu 3. Choose the field in the table to base the join on:, select the field State.
- Click on the button OK.
Alternative: Rather than use the
"internal join" described above in Steps
1-5 and 10, you can directly join an external
table in Step 10 by clicking on the button Browse;
this is a slightly easier one-step process.
However,
this "external join" has the disadvantage
that you won't
get error messages describing incompatibilities
in your table. It also obscures the location
of the table from the ArcMap user, and if
the map document loses track of it you can't
see its path to help you find it again to
fix the join.
To see the results of the joining tables procedure, try out the new
attributes:
- In the Table of Contents,
right-click on the layer states.shp and
select the menu item Open Attribute Table.
- Examine the attribute
table's structure: the fields from
the shapefile
are now immediately followed by the
fields from the joined table.
- Right-click on one of the field
headers, e.g. State,
to bring up its
contextual menu,
and select the menu item Properties… .
- In
the dialog Field
Properties, note that
the visible field name is just
an alias, and the actual field
name begins with the table name,
e.g. Dates$, followed
by a period.
This field-name prefix ensures that
there are no conflicts between fields
from the joined tables that
might have the same name.
Note that Alias: is
editable, so you
can use it to change a field name
if you think it could be more descriptive;
it has fewer restrictions on
length and characters than the actual
name.
- Close the dialog Field
Properties and then the
attribute table.
- Symbolize
the layer with one of the
quantitative fields from the
joined table.
- Save the document by clicking on
the button Save.
Important Note: The additional fields
from the joined table have not been permanently added to the map layer,
just temporarily linked to it.
This means that if you move the table file or the map document to
some other location, the map may no longer work because ArcMap could
be looking in the wrong place for the table file.
When ArcMap can't find a file, it places a red exclamation point ! next
to its name in the Table of Contents, and you can click on ! to
start a dialog to relink it.
Once you've joined a layer and a table, sometimes you may
want to save the result as a new shapefile with
a merged attribute table:
- In ArcMap,
in the Table of Contents,
right-click on the name of the layer,
e.g. states.
- In the layer's
contextual menu, point at the menu
item Data,
then in the submenu that appears
click on the menu item Export Data….
- In the dialog Export Data,
in the menu Export,
make sure that All features is
selected.
- Near the text
field Output shapefile or feature class:,
click on the button Browse to
navigate to an appropriate location
for the new data set, e.g. the folder mappingnames,
and give it a descriptive name, e.g. states_and_dates.shp.
Remember that file names should start
with letters and afterwards include only letters,
numbers, and underscores.
- Click on the button OK.
- The dialog ArcMap will
now appear, asking if you want to
add the exported layer to the map;
click Yes or No.
One advantage of using joins is that you can more easily change the
contents of the joined table.
However, many applications such as Excel won't allow you to edit the
table if ArcGIS opened it first.
If you aren't changing the join attribute name, the simplest
thing to do is quit ArcMap, edit the table, and
reopen the map; the table will be rejoined as
previously defined, with the new data.
Otherwise, you'll need to unjoin the table, remove it from ArcMap (if
an internal join), edit it, re-add it to ArcMap,
and rejoin it.
An important example of data
that is commonly joined to map layers comes
from the U.S. Census Bureau.
As another example of commonly available data
that's geographically named, we'll work with
some more census data. As you probably already
know, every ten years there is a census that
tries to obtain basic information from 100%
of the U.S. population; this is the primary
source of the data we looked at previously
in the states layer. In addition, the U.S.
Census Bureau provides many variations
of this data, as well as the results of the
annual American Community Survey that
tracks detailed information from a small
subset of the population. This data is in
tables that you must join to existing layers.
We'll also use a type of geographic region
that you might not be familiar with, census
tracts. According to the Bureau, "Census
tracts are small, relatively permanent statistical
subdivisions of a county....Designed to be
relatively homogeneous units with respect
to population characteristics, economic status,
and living conditions, census tracts average
about 4,000 inhabitants."
Two smaller subdivisions of census tracts
are also available, though we won't use them
here. Census blocks are "the
smallest geographic unit for which the Census
Bureau tabulates 100-percent data....Many
blocks correspond to individual city blocks
bounded by streets, but blocks especially
in rural areas may include many square
miles." Census block groups are
just that, and are the smallest region available
for some sensitive attributes such as income.
- In ArcMap,
click on the button New Map File.
- Click on the
button Add Data.
- In the dialog Add Data,
navigate into the folder mappingnames.
- In the folder mappingnames,
click on the file masstracts2000.shp.
- Click on the button Add.
ArcMap will now display a map of the Year
2000 census tracts in Massachusetts:
If
you right-click on the layer masstracts2000 and
select the menu item Open Attribute Table,
you should see the following:
The census tracts are uniquely named by their FIPS (Federal
Information Processing Standards)
code, which begins with the two-digit
state code, followed by the three-digit
county code, and finally the tract code.
We can use this identifier to join other
census data to this layer.
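The structure of the identifier can be illustrated with a short Python sketch. The function name and the sample identifier are hypothetical, chosen only to show the slicing; the code is kept as text throughout so that any leading zeros survive.

```python
def split_tract_fips(fips):
    """Split a tract identifier into its state, county, and tract parts."""
    if not fips.isdigit() or len(fips) <= 5:
        raise ValueError("expected a string of more than five digits: %r" % fips)
    return {
        "state": fips[:2],    # two-digit state code
        "county": fips[2:5],  # three-digit county code
        "tract": fips[5:],    # remaining digits are the tract code
    }


parts = split_tract_fips("25013800100")  # hypothetical identifier
print(parts)
```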
Be aware that census tracts change from decade
to decade. The Census Bureau provides cartographic
boundary shapefiles for its online data (including
other regions such as congressional districts
and metropolitan statistical areas) at http://www.census.gov/geo/www/cob/.
The Census Bureau also lets you download much
of their data from http://www.census.gov/. As an example, we'll map some
sampled poverty data from the latest census.
- If necessary, start up a web
browser:
- Click on
the menu Start.
- Point at the menu item All Programs.
- Locate your
preferred web browser, Firefox or Internet Explorer,
and click on it.
- In your web browser, visit the web
address www.census.gov.
- On the left side
of the web page Census Bureau Home Page,
click on the link American Factfinder.
The web page American Factfinder has
several useful links that let you
explore census data on-line. In particular,
the link About the Data provides
general information about the Census
Bureau's data and how it's collected
and structured. Two links, Data Sets and Download Center,
let you download data immediately,
but the former is simpler to use.
- On the left side
of the web page American Factfinder,
click on the link Data Sets.
By default the subpage Decennial Census should
be selected, as well as the tab for
the most recent year available (in
this case Census 2000):
- Further down the web page Decennial Census,
in the button group 2000,
click on the button Census 2000 Summary File 3 (SF 3) - Sample
Data:
- To the right
under the heading Select from the following:,
click on the link Detailed Tables.
- In the web page Select Geography,
in the list Choose a selection method,
the list item list should
be selected. In the menu Select a geographic type,
choose the menu item Census Tract.
- In the menu Select a state,
choose the menu item Massachusetts.
- In the menu Select a county,
select the menu item Hampden County.
- In the menu Select one or more geographic areas,
select the menu item All Census Tracts.
- Click on the button Add ▼.
- Click on the
button Next ▶.
- In the web page Select Tables,
in the list Choose a table
selection method, click
on the list item by subject.
- In the menu Select a subject,
under the heading ==== Economic Characteristics ====,
choose the menu item .... Poverty (Families and Individuals).
- Click on the button Search.
- In the list Select one or more tables,
click on the list item P87. Poverty
Status in 1999 by Age.
- Click on the button Add ▼.
- Click on the
button Show Result ▶.
- On the web page Detailed Tables,
point at the menu Print/Download,
and then select the menu item Download.
- On the web page Detailed Tables ▶ Download,
in the button group Select download options,
under the heading Database compatible (data rows only),
click on the button Microsoft Excel (.xls).
The Microsoft Excel file format is
used here because this program makes
it relatively easy to view and modify
data tables. It can also be read
directly by ArcGIS (as of version
9.2). Comma-separated values (CSV)
is probably the next best format
if you don't have Excel.
- Verify that the checkbox Include descriptive data element names is
selected, and click on the button OK.
- The data will
be downloaded as a single compressed
file named output.zip;
locate this file, open it, and drag
its contents to the folder mappingnames.
The ZIP file contains four files;
the actual data is in the file dt_dec_2000_sf3_u_data1.xls.
You may have noticed in this last procedure
that the Census Bureau only has data online
starting in 1990. If you want data and boundary
files from earlier years, visit the National
Historical Geographic Information System
at
http://www.nhgis.org/,
which provides a similar system.
Once you've downloaded this data, double-click
on the file dt_dec_2000_sf3_u_data1.xls to
open it automatically in Excel and
inspect its contents:
You might have noticed that the column labeled GEO_ID2 ,
the second "Geography Identifier",
is the same attribute as the
FIPS number in the attribute table of
the layer masstracts2000.
We may therefore use these two fields to
join the tables together. However, we must
first make two changes to the file to make
it compatible with ArcGIS:
- The descriptive labels in the second
row of a Census Excel file are important
for understanding the meaning of
the attributes. However, they are
too complex to be used as column
headers by ArcGIS, so we will use
the first row instead. Because column
headers must be immediately above the
data, we will swap the first two
rows:
- In the application Excel,
click on the row header 1,
just to the left of the column
header
GEO_ID .
- Select the menu Edit,
and then click on the menu item Cut.
- Click on
the row header 3.
- Select the menu Insert,
and then click on the menu item Cut Cells.
This should put the column header
GEO_ID into
the cell A2.
- Because we are
retaining the descriptive labels
in the worksheet, there will be non-table
data present. We must therefore name the
region of cells covered by the table,
and use that name when joining, instead
of the worksheet name (here called Sheet0).
- Click and hold on the cell in
the first row and first column
of the table, the cell A2 containing
the column header
GEO_ID .
- Select the
entire table by dragging down
and to the right, until you reach
the cell in the last row and
column of the table.
- Click in the Name
Box that's located
above column A,
type a descriptive name,
e.g.
Poverty ,
and press the key Enter.
Note that spaces and most special characters
aren't allowed in these names, and a name
cannot begin with a number.
Named cell references
are known everywhere inside a workbook,
including within other worksheets.
Names can be added and deleted by
selecting the menu Insert,
followed by the submenu Name,
and then clicking on the menu item Define....
- Select the menu File,
and then click on the menu item Save.
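The same separation of field names from descriptive labels can be sketched in Python for a table that has already been read into a list of rows. This is an illustration of the idea rather than a replacement for the Excel steps; the function name, the abbreviated labels, and the data values are assumptions.

```python
def split_header_rows(rows):
    """Separate the short field names, the descriptive labels, and the records."""
    field_names, labels, *records = rows
    # Keep the labels for reference, keyed by field name.
    descriptions = dict(zip(field_names, labels))
    # The usable table has the field names immediately above the data.
    table = [field_names] + records
    return table, descriptions


rows = [
    ["GEO_ID2", "P087001", "P087002"],
    ["Geography Identifier", "Total", "Income in 1999 below poverty level"],
    ["25013800100", "4000", "500"],  # hypothetical record
]

table, descriptions = split_header_rows(rows)
print(table)
print(descriptions["P087002"])
```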
To see the results of the Making Census File
Compatible procedure, try out the new attributes:
- Follow Procedure
1 and join the named region Poverty in the census
workbook dt_dec_2000_sf3_u_data1.xls to
the layer masstracts2000 using
the fields
GEO_ID2 and FIPS ,
respectively.
- Symbolize
the layer with one of
the quantitative fields from
the joined table, e.g.
P087002 (Population
for whom poverty status is
determined: Income in 1999
below poverty level), normalized
by P087001 (Population
for whom poverty status is
determined: Total).
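Normalizing one field by another simply divides the two values for each feature, which here gives the fraction of each tract's population below the poverty level. A minimal Python sketch, with hypothetical tract identifiers and counts:

```python
def poverty_rate(below_poverty, total):
    """Return P087002 / P087001, or None when the total is zero."""
    if total == 0:
        return None  # avoid dividing by zero for unpopulated tracts
    return below_poverty / total


# Hypothetical records using the field names from the census table.
tracts = [
    {"GEO_ID2": "tract_a", "P087001": 4000, "P087002": 500},
    {"GEO_ID2": "tract_b", "P087001": 0, "P087002": 0},
]

rates = {t["GEO_ID2"]: poverty_rate(t["P087002"], t["P087001"]) for t in tracts}
print(rates)
```

Mapping the ratio rather than the raw count keeps large tracts from looking poorer merely because more people live in them.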
While joining tables may appear
straightforward, they need to be in a certain
format to ensure success.
Before joining two tables using a particular
attribute, it's generally a good idea to
ascertain the data type of that
attribute in the layer's table. The reason
is that not only the join attribute's values
but also its type must be compatible
in the two tables, and appearances can be
deceiving.
For example, the POP2000 number
in the attribute table for masstracts2000 may
appear to be an integer but it could actually
be text or a real number.
The following table describes the most common
data types.
Some Common ArcGIS Attribute Data Types
Data Type | Value Represented | Minimum Value | Maximum Value | Maximum Significant Digits/Characters
Short | Integer number | -32,768 | 32,767 | 5
Long | Integer number | -2,147,483,648 | 2,147,483,647 | 10
Float | Real number | -3.4 x 10^38 | 1.2 x 10^38 | 6
Double | Real number | -2.2 x 10^308 | 1.8 x 10^308 | 15
Text | String of characters | | | 254
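The integer limits in the table are those of 16-bit and 32-bit signed integers, and the limited precision of Float matches a 32-bit real number. The Python sketch below works under that assumption (it is inferred from the ranges, not stated in the table) and shows a whole number that a 32-bit real number cannot store exactly.

```python
import struct

# Ranges of 16-bit and 32-bit signed integers (assumed to be Short and Long).
short_range = (-2**15, 2**15 - 1)
long_range = (-2**31, 2**31 - 1)


def as_float32(value):
    """Round a Python number to the nearest 32-bit real number."""
    return struct.unpack("f", struct.pack("f", value))[0]


# An eight-digit whole number exceeds the roughly six reliable digits of Float.
original = 16777217.0
stored = as_float32(original)

print(short_range, long_range)
print(original, stored)
```

This is one reason to store identifiers such as FIPS codes as Text or Long rather than Float.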
We can see in the image above that the data
type of the FIPS attribute in masstracts2000 is Text.
Question: Could
it be another type?
To use this attribute to join an Excel table,
the corresponding attribute data type must
also be Text. If you look
at this Excel
table and the join field we used, GEO_ID2 ,
you'll note the little green flag in the
upper left corner of each cell; it indicates
that the numbers are actually formatted as
text. (Hint: this can also be ascertained
from their alignment on the left; Excel aligns
numbers on the right.)
Another formatting incompatibility to be aware
of is that the content of a cell in the Excel
join column cannot wrap, i.e. it
must all be on one line, and cells in the
table shouldn't be merged cells,
either.
More generally, the data in the two join columns
must match exactly. In particular,
watch out for extra spaces between words
and at the beginning and end of data.
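These mismatches are easy to reproduce in Python, where a number never equals text and stray spaces make two strings different. The helper name and sample values below are hypothetical.

```python
def normalize_key(value):
    """Convert a join value to text and collapse any extra spaces."""
    # split() drops leading and trailing spaces and runs of spaces between words.
    return " ".join(str(value).split())


layer_key = "25013800100"        # text, as in the layer's FIPS field
table_key_number = 25013800100   # the same digits stored as a number
table_key_padded = " 25013800100 "

print(layer_key == table_key_number)                 # False: text vs. number
print(layer_key == table_key_padded)                 # False: extra spaces
print(layer_key == normalize_key(table_key_padded))  # True after cleaning
```

Note that converting a number back to text cannot restore leading zeros that were lost when the value was stored as a number, so such codes are best kept as text from the start.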
Summary: Making an Excel File Compatible
with ArcGIS
While Microsoft Excel can make it easy to
manipulate data tables, it also has its own
way of doing things with which you'll need
to be familiar to make extensive use of it.
Such use is outside of the scope of this
class, but if you follow the recommendations
below, your data tables should be compatible
with ArcGIS.
Generally speaking there are five things
you need to do to make Excel data compatible
with ArcGIS: name it properly, create
a row of field names, below that arrange
your data in a plain rectangular array,
name the table, and make sure the join
fields match. Below are general descriptions
of how to do this.
- Name the file:
it should start
with a letter and afterwards include only letters,
numbers, and underscores.
- Create a row of field names:
- The very
first row in a table must contain
unique names for each column.
Usually they will in some way
describe the attribute that will
be in that field, e.g. Population,
ZipCode, etc.
- Field names must begin with letters,
and after that can contain letters,
numbers, or the underscore _. They cannot contain other special
characters such as the period . or hyphen -,
or spaces (and be careful that
you don't have any spaces at
the beginning or end of the names,
too!). Note that
field names are also case
sensitive (upper
and lower case letters are distinguished).
If your joined data appears
as <null>, then check the
field names for an illegal
character.
- Field names
cannot be one of a long list
of reserved
words, e.g. All. If
your joined data appears
as <null>, then check the field
names against this list.
- For some types of joined tables,
field names must be ten characters
or less, though in other types you
can use up to 64 characters.
It's a good idea to use short
names in any case.
- Arrange your
data in a plain rectangular array:
- Every map feature, such as a
state or city, must have its
data in a single row.
- Every column/field
should contain the same kind
of information for each feature,
e.g. all population values should
be in a single column. Blanks
are allowed if particular data
is missing. Also make sure the
values have a consistent data
type, e.g. all text, all integers,
or all real numbers.
- All record and feature data must
be contiguous, i.e. there must
be no other data or blank rows
or columns separating the data,
and it must begin immediately below
the field name row.
- Name the table:
- If you have non-table data in
the cells around your table,
e.g. explanatory notes, you'll
want to select just the range
of cells covered by the table
and give it a unique name.
- If you don't
have any other data in the worksheet
besides your table, and it begins
in cell A1, then the table can
be referenced by its worksheet
name. It's highly recommended
that you change the worksheet
name to something more illuminating
than
Sheet0 . Whatever
name it has, ArcGIS will see
it with a $ at the end, e.g. Sheet0$ ,
indicating that it's using the
entire contents of the worksheet.
- Unlike field names, worksheet
and cell range names have few
restrictions like those described
for field names above. However, names
with spaces and special characters
in them will appear with single
quotes around them, e.g. 'My
Sheet$'.
- Make sure the
join fields match:
- Make sure the data types in the
join fields are compatible: both
text, both integers, or both
real numbers.
- Make sure
the Excel join field doesn't
wrap its text and doesn't have
any merged cells.
- Make sure the values in the two
join fields match exactly, e.g.
there are no extra spaces, variations
in case, etc.
- The two tables
don't need to have the same number
of records, e.g. some features
could be missing in one table
or the other. If the map layer
is missing a record that appears
in the join table, the latter
will be ignored, and if the join
table is missing a record that
appears in the map layer, its
values will appear as
<Null> .
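Some of these checks can be automated before attempting a join. The Python sketch below tests field names against the rules above and reports which records will be left without a partner; the function names are hypothetical, the reserved-word set contains only the single example mentioned here, and treating reserved words as case-insensitive is an assumption.

```python
import re

FIELD_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
RESERVED_WORDS = {"ALL"}  # only the example from the text; the real list is much longer


def field_name_problems(name, max_length=10):
    """List the rules from the summary that a field name breaks."""
    problems = []
    if not FIELD_NAME.fullmatch(name):
        problems.append("illegal characters")
    if name.upper() in RESERVED_WORDS:  # assumption: case does not matter here
        problems.append("reserved word")
    if len(name) > max_length:
        problems.append("too long")
    return problems


def unmatched_keys(layer_keys, table_keys):
    """Report join values that appear in only one of the two tables."""
    layer_set, table_set = set(layer_keys), set(table_keys)
    return {
        "null_in_layer": sorted(layer_set - table_set),     # will show as <Null>
        "ignored_in_table": sorted(table_set - layer_set),  # will be dropped
    }


print(field_name_problems("Zip Code"))
print(unmatched_keys(["Alabama", "Alaska"], ["Alabama", "Puerto Rico"]))
```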
Street addresses can be geographically
located when you have a special street layer
and a geocoder.
Geographic data often comes in the form of
a table of street addresses, for example
of schools or businesses, along with attributes
describing them such as their name, purpose,
etc.:
A
full street address includes a street number,
street name, city, state, and zip code. Like
other geographically named data, street names
can be associated with a street layer to
get a very rough location. To get a more
accurate position, we need to know where
the street numbers fall along the street.
If a street layer contains details about which
addresses lie within which blocks, it can
be used with a program called a geocoder to
place the addresses on the map as points
along the streets. Address locations will
be approximate, because only a block's beginning
and ending addresses are usually provided,
and others are linearly interpolated along
the street (see the block at the right, which
places the address 317). In addition, street
layers have varying degrees of accuracy.
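The linear interpolation described above can be sketched in a few lines of Python. The block's address range and end-point coordinates are hypothetical, and a real geocoder also accounts for details such as which side of the street an address is on, which this sketch ignores.

```python
def interpolate_address(number, start_number, end_number, start_point, end_point):
    """Estimate a point for a street number by its position within the block's range."""
    low, high = sorted((start_number, end_number))
    if not low <= number <= high:
        raise ValueError("address is outside this block's range")
    if start_number == end_number:
        fraction = 0.0  # single-address block: use its starting point
    else:
        fraction = (number - start_number) / (end_number - start_number)
    x = start_point[0] + fraction * (end_point[0] - start_point[0])
    y = start_point[1] + fraction * (end_point[1] - start_point[1])
    return (x, y)


# Hypothetical block running from number 300 at (0, 0) to number 400 at (100, 0).
location = interpolate_address(317, 300, 400, (0.0, 0.0), (100.0, 0.0))
print(location)
```

The address 317 lands 17 percent of the way along the block, whether or not the building is actually there.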
Street layers are commonly available from
cities and towns as well as commercial entities.
The ArcGIS software suite comes with the
collection ESRI Data and Maps,
which includes the commercial package called Street
Atlas North America, and a geocoder that works
with it.
- In ArcMap,
in the toolbar Standard,
click on the button Add
Data.
- In the dialog Add
Data, make a new connection
to the folder K:\Maps (see Constructing
and Sharing Maps for details).
This network folder is where
Amherst College stores a large
amount of data for use in maps.
- Navigate into the folder ArcGIS Books-n-Data\ESRI Data & Maps 2008\streetmap_na.
- Click on the
file StreetMap North
America.lyr.
This is a layer file,
which references data in one or more
additional files, along with information
about how to symbolize them.
- Click on the button Add.
- You will now see the dialog Geographic
Coordinate System Warning;
you don't have to worry about
this for now, so click on the
button Close.
- Click on the
button Zoom In,
and click-and-drag a rectangle around
the City of Holyoke (north of Springfield
and I-90, east of I-91).
- In the Table of Contents,
turn off and back on the layer StreetMap North
America by
clicking on the checkbox to
the left of its name, and notice
the relationship of the two data
sets.
Let's now geocode a set of addresses using
Street Atlas' street information:
- In ArcMap,
click on the menu Tools,
then select the submenu Geocoding,
followed by the menu item Geocode Addresses….
- The dialog Choose an Address Locator to use… will
appear with a list of address
locator databases. If this is the
first time you've geocoded, this
list will be empty, and you'll
need to choose the correct encoder
for the street address database
format:
- Click on the
button Add….
- In
the dialog Add Address Locator,
navigate into the folder K:\Maps\ArcGIS Books-n-Data\ESRI Data & Maps 2008\streetmap_na\data.
- Double-click
on the file Street_Addresses_US.
- Continuing
in the
dialog Choose an Address Locator to use…,
click
on the button OK.
- In the dialog Geocode Addresses: Street_Addresses_US,
the menu Address table: might
already list your data set if
you've previously added it. Otherwise:
- Click on the button Browse and
navigate into the folder
with the address table,
e.g. mappingnames.
- Double-click
on the address table,
e.g. the Excel workbook Holyoke Schools.xls.
- In the workbook, double-click
on a sheet or a named region,
e.g. HolyokeSchools.
- Continuing in the dialog Geocode Addresses: Street_Addresses_US,
verify that
the Address Input Fields match
the correct ones in your data
table.
- Make sure the Output
shapefile or feature class: is
an appropriately named and
located file, e.g. HolyokeSchools.shp in
the folder mappingnames.
- Click on
the button OK.
- The
dialog Geocoding Addresses… will
now appear and provide a summary
of the geocoding. It will tell
you how many of the addresses
in the table were matched, tied,
or unmatched (e.g. if a zip code
is incorrect it may end up tied
with a slightly different street
name).
- If you have not matched all of
the records (very common),
you may want to look at how the
matches are made. Click on the
button Rematch;
the dialog Interactive Rematch will
appear. It lists each
record and gives it a
score from 0 - 100 judging
the quality of the match.
You can use this dialog
to interactively improve
the match, e.g. by
correcting bad data or
choosing a more likely
address (N.B. sometimes the geocode
database can be wrong,
too!).
This dialog can be recalled later by clicking on
the menu Tools,
then selecting the submenus Geocoding and Review/Rematch Addresses,
and finally clicking
on the menu item Geocoding Result: ….
- When you are finished matching,
click on the button Close.
- The resulting
data will be added to the
map with a distinguishing name,
e.g. Geocoding Result: HolyokeSchools.
Note that these points on the map are
created relative to Street Atlas' description
of the streets; this may differ from
other descriptions, e.g. the local Planning
Department may have more accurate data,
while an old map may be less accurate.
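The three outcomes in the geocoding summary can be thought of as a decision based on the scores of the candidate locations found for each address. The Python sketch below is one plausible reading of those categories, not the geocoder's actual logic; the function name and the minimum score of 60 are assumptions.

```python
def classify_match(candidate_scores, minimum_score=60):
    """Label an address as matched, tied, or unmatched from its candidate scores (0-100)."""
    acceptable = [score for score in candidate_scores if score >= minimum_score]
    if not acceptable:
        return "unmatched"  # no candidate is good enough
    best = max(acceptable)
    if acceptable.count(best) > 1:
        return "tied"       # several candidates share the best score
    return "matched"


for scores in ([100, 72], [85, 85, 40], [30], []):
    print(scores, classify_match(scores))
```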