Dot Plots and Histograms
In this activity, you will use Fathom to gather data from the internet.
You will look at two different techniques for capturing data from the internet
into a collection box in Fathom. Once you have captured the data, you will
look at two different ways of describing its distribution graphically:
dot plots (frequency plots) and histograms.
Drag and Drop
Open a new Fathom window as shown in Figure 1.
Figure 1.
You are now going to compare annual rainfall in Eureka, California
with that near the Civic Center in Los Angeles, California. Start by visiting
the Los Angeles Almanac at http://losangelesalmanac.com/. Near the bottom of
this page is a category for weather and a link for rainfall, which leads to
a page with the link Total Seasonal Rainfall (Precipitation) at the Los
Angeles Civic Center by Season Since 1877. This last page contains an HTML
table with annual rainfall levels at the Los Angeles Civic Center dating
from 1877. The first few rows are shown in Figure 2.
Figure 2.
The image in Figure 3 shows the address bar of Internet Explorer (your
browser may be different).
Figure 3.
Note the Explorer icon in the address bar. Click this icon with your mouse
and drag it into the Fathom window. The result is shown in Figure 5.
Figure 5.
Note that a collection box has been created. With the collection box
highlighted (single-click it with your mouse), drag a Case Table into the
Fathom window, as shown in Figure 6.
Figure 6.
As you can see, the Case Table contains the rainfall data from the web
page. There is no guarantee that this will always work with Fathom and
web pages containing HTML tables, but it is certainly something to try.
It beats typing the data in by hand.
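To make the idea concrete, here is a minimal Python sketch of the kind of conversion a tool has to perform when it turns an HTML table into cases: the first row supplies the attribute names and each later row becomes one case. This is only an illustration of the concept, not a description of how Fathom works internally. The names TableParser, table_to_cases, and SAMPLE_HTML are hypothetical, and the sample values are made up rather than taken from the almanac page.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every table cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # row currently being read, or None
        self._cell = None   # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def table_to_cases(html):
    """Treat the first row as attribute names and the rest as cases."""
    parser = TableParser()
    parser.feed(html)
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Illustrative markup with made-up values, not the actual almanac figures.
SAMPLE_HTML = """
<table>
  <tr><th>Season</th><th>TotalInches</th></tr>
  <tr><td>A</td><td>10.5</td></tr>
  <tr><td>B</td><td>22.0</td></tr>
</table>
"""

cases = table_to_cases(SAMPLE_HTML)
print(cases)
```

Pages whose tables lack a clean header row, or that nest tables for layout, would need extra handling, which is one reason the drag-and-drop approach cannot be guaranteed to work on every page.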
Excel Data
Sometimes people store data in Microsoft Excel files on the internet. Should
you come across one of these data sets, you can open the data in Microsoft
Excel, then copy and paste the data into a Fathom Collection box. First,
let's locate some data that's been stored in an Excel file. At the Humboldt
State University Department of Geology, we can find a data set containing
annual rainfall levels for the city of Eureka, California. These are located
on the page http://www.humboldt.edu/~geodept/geology531/531_datasets_index.html.
You will find a link there for Eureka Annual Rainfalls (Excel). Right-click
this link with your mouse, then select Save Target As ... from the ensuing
popup menu. Select a folder in which to save the file. You need to remember
the name of the file (eureka_annual_rainfalls.xls) and the location where
you save it. Next, open Microsoft Excel and open the file
eureka_annual_rainfalls.xls. The first few entries are shown in Figure 7.
Figure 7.
One key requirement for Fathom data sets is that each column of data
has a "header" or "column name." In the case of the Excel
data in Figure 7, the columns are named Year and Eureka Precip.
You are now going to copy and paste these columns into a Fathom Collection.
First, drag a new Collection Box into your Fathom window as shown in Figure
8.
Figure 8.
Next, use your mouse to highlight the Excel data columns containing the
Eureka rainfall data, as shown in Figure 9. Then select Edit->Copy
to copy the data to the Clipboard.
Figure 9.
Next, return to the Fathom window and right-click the new collection
icon with your mouse. A popup menu will come up as shown in Figure 10.
Select Paste Cases from this popup menu.
Figure 10.
Highlight the new collection box (single-click it with your mouse), then
drag a new Case Table into the Fathom window. This will allow you to examine
the Eureka rainfall data, as shown in Figure 11.
Figure 11.
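The header requirement and the copy-and-paste step can be illustrated with a short Python sketch. It assumes the clipboard holds tab-separated text with the column names in the first row, which is how spreadsheet cells are commonly copied as plain text. The function name paste_cases and the sample values are hypothetical placeholders, not the actual Eureka figures, and the sketch does not claim to reproduce Fathom's own Paste Cases logic.

```python
import csv
import io


def paste_cases(clipboard_text):
    """Split tab-separated text into (attribute names, list of cases)."""
    reader = csv.reader(io.StringIO(clipboard_text), delimiter="\t")
    rows = [row for row in reader if row]   # skip blank lines
    if not rows:
        raise ValueError("nothing to paste")
    header, *body = rows                    # first row names the attributes
    cases = [dict(zip(header, row)) for row in body]
    return header, cases


# Illustrative clipboard contents; the numbers are placeholders, not real data.
CLIPBOARD = "Year\tEureka Precip\n1\t30.0\n2\t45.5\n"

names, eureka_cases = paste_cases(CLIPBOARD)
print(names)
print(eureka_cases)
```

Without the header row, the first data row would be mistaken for attribute names, which is why the column names must be included in the selection that is copied.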
Stacking Attributes (Variables)
You could easily make separate dot plots for each data set, but it would
be better to make simultaneous plots on the same scale in order
to more easily compare the rainfall of the two cities. Fathom has a utility
called Stacking Attributes that will help us with this task. First,
right-click the attribute Season... in the first Case Table and
select Delete Attribute from the ensuing popup menu. Next, do the
same to delete the InchesA... attribute. This leaves us with the
single attribute TotalInc... in the first Case Table, as shown in
Figure 12.
Figure 12.
Next, in the second Case Table, the one containing the Eureka data, single-click
the attribute EurekaP... to highlight the data column, as shown
in Figure 13.
Figure 13.
Next, go to the Edit menu in Fathom and select Copy Attribute
from the menu. Then right-click the first collection icon (not the Case
Table) and select Paste Attribute from the ensuing popup menu. This
will add the Eureka rainfall data to the first Case Table. You can now
safely delete the second collection icon and its Case Table, as shown in
Figure 14.
Figure 14.
Now you will "stack" the attributes. Single-click the collection box
to highlight it. Then select Stack Attributes from the Analyze
menu, as shown in Figure 15.
Figure 15.
The result is a new collection icon. If you highlight this collection
icon (single-click it with your mouse), then drag a new Case Table to the
Fathom window, you can inspect its contents, as shown in Figure 16.
Figure 16.
The result is a little difficult to display in an HTML page, but the
idea is simple. The data in the first column (L.A. data) has been "stacked"
atop the data in the second column (Eureka data). Also, a categorical
variable has been formed with two values, TotalInc... and EurekaP....
If you use the scroll bar in the second case table, you can scroll down
until you see the Eureka data, as shown in Figure 17.
Figure 17.
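The stacking idea described above can be sketched in a few lines of Python: every value from every column becomes its own case, tagged with the name of the column it came from. The attribute names Group and Value follow the stacked Case Table used in the next section. The function name stack_attributes and the sample numbers are hypothetical placeholders for illustration only.

```python
def stack_attributes(columns):
    """Stack several numeric columns into (Group, Value) cases.

    `columns` maps an attribute name to its list of values. Columns may
    have different lengths; missing entries (None) are skipped.
    """
    stacked = []
    for name, values in columns.items():
        for value in values:
            if value is not None:
                stacked.append({"Group": name, "Value": value})
    return stacked


# Placeholder values for illustration only, not the actual rainfall records.
rainfall = {
    "TotalInches": [10.5, 22.0, 14.25],
    "EurekaPrecip": [30.0, 45.5],
}

stacked = stack_attributes(rainfall)
for case in stacked:
    print(case["Group"], case["Value"])
```

Because each case now carries its own group label, the two columns no longer need to have the same number of rows, and a single graph can split the values by group.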
Dot Plot
You can now safely delete the we13 collection and its Case Table, as shown
in Figure 18. You want to keep the "stacked" collection and its Case Table.
Drag a new Graph object to the Fathom window. Drag the Value attribute
from the Case Table to the horizontal axis and the Group attribute
to the vertical axis. This results in "stacked" dot plots on the same
scale, where one can easily compare the rainfall from the two cities, as
shown in Figure 18.
Figure 18.
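As a rough text-only analogue of the graph just described, the following Python sketch draws one dot per case and lists the same positions for every group, so the groups share a scale. It is a simplification under stated assumptions: values are rounded down to a multiple of a hypothetical step parameter so that nearby values pile up, whereas a real dot plot places each dot at its actual value. The function name grouped_dot_plot and the sample cases are placeholders.

```python
from collections import Counter


def grouped_dot_plot(cases, step=5.0):
    """Return text lines: one dot per case, all groups on a shared scale."""
    counts = Counter()
    for case in cases:
        position = (case["Value"] // step) * step
        counts[(case["Group"], position)] += 1

    groups = sorted({group for group, _ in counts})
    positions = sorted({position for _, position in counts})

    lines = []
    for group in groups:
        lines.append(group)
        for position in positions:          # same positions for every group
            dots = "." * counts[(group, position)]
            lines.append(f"{position:6.1f} | {dots}".rstrip())
    return lines


# Placeholder cases for illustration only, not the actual rainfall records.
sample_cases = [
    {"Group": "TotalInches", "Value": 10.5},
    {"Group": "TotalInches", "Value": 22.0},
    {"Group": "TotalInches", "Value": 14.25},
    {"Group": "EurekaPrecip", "Value": 30.0},
    {"Group": "EurekaPrecip", "Value": 45.5},
]

plot_lines = grouped_dot_plot(sample_cases)
print("\n".join(plot_lines))
```

Listing every position under every group, even where a group has no dots, is what keeps the two plots aligned and directly comparable.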
Histogram
Changing to a histogram is a simple matter. In Figure 18, note the Dot
Plot menu in the upper right corner of the graph object in the Fathom window.
Click this with your mouse and select Histogram from the drop-down menu,
as shown in Figure 19.
Figure 19.
The result is "stacked" histograms, as shown in Figure 20.
Figure 20.
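The difference between the two displays is that a histogram groups values into bins of equal width and shows a count for each bin, rather than one dot per case. A minimal Python sketch of that binning follows. The convention that each bin includes its left edge and excludes its right edge is an assumption of this sketch, and Fathom may treat boundary values differently. The function name histogram_counts, the bin width, and the sample values are hypothetical.

```python
import math


def histogram_counts(values, bin_width, start=0.0):
    """Count values per bin; bin k covers [start + k*w, start + (k+1)*w)."""
    counts = {}
    for value in values:
        k = math.floor((value - start) / bin_width)
        left_edge = start + k * bin_width
        counts[left_edge] = counts.get(left_edge, 0) + 1
    return dict(sorted(counts.items()))   # bins in increasing order


# Placeholder values for illustration only, not the actual rainfall records.
sample_values = [10.5, 22.0, 14.25, 30.0, 45.5]

bins = histogram_counts(sample_values, bin_width=10.0)
for left_edge, count in bins.items():
    print(f"{left_edge:5.1f} to {left_edge + 10.0:5.1f}: {'#' * count}")
```

Changing the bin width changes the shape of the display, so it is worth trying more than one width before drawing conclusions about a distribution.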
Homework
Visit Counties Ranked by Size at the U.S. Census Bureau. Perform the
following tasks.
1. Select the link to South Dakota. Drag the HTML icon for the resulting page
into a Fathom window.
2. With the collection box highlighted, drag a Case Table into the Fathom
window. Delete the first case, which gives a total for the state. We only
want the population totals for each county of the state. You'll also have
to delete any cases at the bottom of the Case Table that contain
spurious data. Delete all attributes except for the fourth, which contains
the population data for each county as of July 1, 2001. Double-click the
name of this attribute and change it simply to SouthDakota.
3. Go back to Counties Ranked by Size and select the link for North Dakota.
Perform the same tasks as in item #2, but this time name the attribute
NorthDakota.
4. Copy the attribute NorthDakota into the first collection icon, then
delete the second collection icon and its Case Table.
5. Stack the data in the collection, then use the stacked data to draw
"stacked" dot plots and histograms to compare the population data for the
counties of each state.
6. Obtain a printout of your result and submit it with your homework.