Introduction to Research Methods in Political Science:
The POWERMUTT* Project
(for use with SPSS)
*Politically-Oriented Web-Enhanced Research Methods for Undergraduates
Topics & Tools
Resources for introductory research methods courses in political science and related disciplines

A picture is said to be worth a thousand
words. Tables and graphs, properly designed, can provide clear pictures
of patterns contained in many thousands of pieces of information.
In this topic, we will describe several ways of displaying information about
categorical variables in tabular and graphic form. In later topics, ways
of displaying information about continuous variables will be explained.

Frequency Tables

A frequency table (or frequency
distribution) displays numbers and percentages for each value of a
variable. It is useful for categorical variables (that is, those with
values falling into a relatively small number of discrete categories, such as
party identification, religious affiliation, or region of a country) rather than
for continuous variables (such as age in years or gross domestic product in dollars).

The following frequency distribution
shows how many state legislatures are controlled by each major party. (In four states, each party controls one of the state's two houses, while in one state, Nebraska, the legislature is officially non-partisan.)

The first column in the table provides a
label for each category of the variable. The second and third columns
show, respectively, the number and percent of cases in each category for all
cases. The fourth column shows the percent in each category after
eliminating cases for which we do not have information (missing data). Since we know the party composition of every state legislature, the fourth column is identical to the third in this case. The last
column shows the cumulative percentages as one goes from the first to the last
category. Note that this last column makes sense
only if the values of the variable can be meaningfully ranked. In other
words, cumulative frequencies assume at least ordinal level measurement. The numbers in this column make no sense in this example, since it wouldn't be meaningful to say that "98 percent of state legislatures have split majorities or less."
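The arithmetic behind these columns can be checked by hand. The following is a minimal Python sketch, not SPSS output: it takes the counts of party control reported in this topic (19 Democratic, 26 Republican, 4 split, and 1 non-partisan legislature) and computes the count, percent, valid percent, and cumulative percent for each category. The function name `frequency_table` and its `missing` parameter are assumptions made for illustration.

```python
from collections import Counter

def frequency_table(values, missing=None):
    """Return rows of (category, count, percent, valid percent, cumulative percent).

    Categories appear in order of first appearance. Cases equal to
    `missing` count toward "percent" but are excluded from
    "valid percent" and from the cumulative column.
    """
    counts = Counter(values)
    total = len(values)
    valid_total = total - counts.get(missing, 0)
    rows, running = [], 0.0
    for category, n in counts.items():
        percent = 100.0 * n / total
        if category == missing:
            rows.append((category, n, percent, None, None))
            continue
        valid_percent = 100.0 * n / valid_total
        running += valid_percent
        rows.append((category, n, percent, valid_percent, running))
    return rows

# One entry per state legislature, using the counts given in this topic.
legislatures = (["Democrat"] * 19 + ["Republican"] * 26
                + ["Split"] * 4 + ["Non-partisan"] * 1)
table = frequency_table(legislatures)
for category, n, pct, valid_pct, cum_pct in table:
    print(f"{category:<13}{n:>4}{pct:>8.1f}{valid_pct:>8.1f}{cum_pct:>8.1f}")
```

Because there are no missing cases here, the percent and valid percent columns come out identical. The cumulative figure on the "Split" row is the 98 percent mentioned above, which the code will compute without complaint even though it has no meaningful interpretation for a nominal variable.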

Contingency Tables

A contingency table (also called a crosstabulation, or crosstab for short) displays the
relationship between one categorical variable and another.
It is called a “contingency table” because it allows us to examine a
hypothesis that the values of one variable are contingent (dependent) upon
those of another.

The following crosstabulation shows the
relationship between control of state legislatures and region of the country at the start of the 113th Congress (2013-2014):

There are several important things to
notice about the way in which the table has been set up:

The table consists of a matrix of rows and columns. Since
there are four rows and four columns, it would be referred to as a “four by
four” table. A table with four rows and three columns would be
referred to as a “four by three” table. (Note that the number of
rows is always given first.) The values of one of the variables are
placed in the rows and those of the other in the columns.
Usually, as in this example, the dependent variable is the row variable and
the independent variable is the column variable. There is, however,
no requirement that this be the case. If it fits the available space
better to put the dependent variable in the columns, go ahead and do so.

The number of cases in each cell of the table has been converted
to a percent. In this example, Democrats held majorities in most state legislatures in the Northeast and West, while Republicans controlled most state legislatures in the South and Midwest. By
converting numbers of cases to percents, we facilitate making comparisons
among regions, despite differences in the number of states in each region, since the percentages for each region total 100%.

In this instance, we have percentaged down the columns rather than
across the rows. In using contingency tables to test hypotheses[1], always percentage in the direction of the
independent variable. This is necessary because we are testing the
idea that different categories of the independent variable will tend
to have different values for the dependent variable.

The table includes information on the number of cases (N) on
which the percentages are based. When there is a large number of cases in each category, we can be more confident that the patterns we observe aren't merely coincidental.[2]

Do not let all the trees get in the way of
seeing the forest. In interpreting a crosstab, it is crucial to focus on the overall picture. In this case, the table shows that there are substantial regional differences in party strength. Don’t get bogged down in
the details.
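To make the percentaging rule concrete, here is a minimal Python sketch that percentages down the columns, that is, within each category of the independent variable (region). The cell counts are not printed in this topic; the ones used here are inferred from the column percentages and Ns reported in Table 2 and should be treated as an assumption. The function name `column_percentages` is likewise illustrative.

```python
def column_percentages(counts):
    """Convert cell counts to column percentages.

    `counts` maps each row label (dependent variable) to a dict of
    {column label (independent variable): number of cases}.
    Returns (percentages in the same layout, N for each column).
    """
    columns = list(next(iter(counts.values())))
    col_n = {c: sum(row[c] for row in counts.values()) for c in columns}
    percents = {
        label: {c: 100.0 * row[c] / col_n[c] for c in columns}
        for label, row in counts.items()
    }
    return percents, col_n

# Cell counts inferred from Table 2 (an assumption; see the note above).
counts = {
    "Democrat":     {"Northeast": 7, "Midwest": 2, "South": 3,  "West": 7},
    "Republican":   {"Northeast": 1, "Midwest": 8, "South": 11, "West": 6},
    "Split":        {"Northeast": 1, "Midwest": 1, "South": 2,  "West": 0},
    "Non-partisan": {"Northeast": 0, "Midwest": 1, "South": 0,  "West": 0},
}
percents, col_n = column_percentages(counts)
for label, row in percents.items():
    print(f"{label:<13}" + "".join(f"{pct:>8.1f}" for pct in row.values()))
print(f"{'N':<13}" + "".join(f"{n:>8}" for n in col_n.values()))
```

Each column is divided by its own N, so the four regions can be compared directly even though they contain different numbers of states. Dividing by the row totals instead would answer a different question: what share of each party's legislatures lies in each region.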

Making Tables Presentable

The frequency distributions and crosstabs are presented above just as they were generated by SPSS. This is the way in which tables will normally be presented in POWERMUTT, so that you can run your own analyses and compare your results to what is presented here. For use in a term paper, however, you will probably want tables that are more polished. The following tables are a little more presentable, and also contain a bit more information, including 1) a title that briefly describes what the table is about, and 2) the source of the data (as referenced in the codebook) used to generate the table. By the same token, a lot of extraneous information has been omitted. Tables usually present information only for valid cases. Cumulative percentages are omitted from Table 1, since party control is only a nominal variable. Individual cell counts and row totals are omitted from Table 2, since this information can be reconstructed if needed from the information that is provided. Ask your instructor whether it will be sufficient to copy and paste tables from SPSS into your word processor or whether you will need more formal tables such as those shown here.

Table 1: Party Control of State Legislatures, 2013

Party Control    # of States    Percent
Democrat                  19         38
Republican                26         52
Split                      4          8
Non-partisan               1          2
Totals                    50        100

Source: National Conference of State Legislatures, "2012 Live Election Night Coverage of State Legislative Races," http://www.ncsl.org. Accessed November 10, 2012.

Table 2: Party Control of State Legislatures by Region, 2013

Party Control    Northeast    Midwest     South      West
Democrat             77.8%      16.7%     18.8%     53.8%
Republican           11.1       66.7      68.8      46.2
Split                11.1        8.3      12.5       0.0
Non-partisan          0.0        8.3       0.0       0.0
Totals              100.0      100.0     100.0     100.0
N                       9         12        16        13

Source: National Conference of State Legislatures, "2012 Live Election Night Coverage of State Legislative Races," http://www.ncsl.org. Accessed November 10, 2012.
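The text above notes that the cell counts omitted from Table 2 can be reconstructed from the information that is provided. A minimal sketch of that arithmetic in Python follows; the helper name `counts_from_percentages` is an assumption made for illustration. Because the published percentages are rounded to one decimal place, each product is rounded to the nearest whole number of states.

```python
def counts_from_percentages(percents, col_n):
    """Recover cell counts from column percentages and column Ns."""
    return {
        label: {c: round(pct / 100.0 * col_n[c]) for c, pct in row.items()}
        for label, row in percents.items()
    }

# Column Ns and column percentages as reported in Table 2.
col_n = {"Northeast": 9, "Midwest": 12, "South": 16, "West": 13}
percents = {
    "Democrat":     {"Northeast": 77.8, "Midwest": 16.7, "South": 18.8, "West": 53.8},
    "Republican":   {"Northeast": 11.1, "Midwest": 66.7, "South": 68.8, "West": 46.2},
    "Split":        {"Northeast": 11.1, "Midwest": 8.3,  "South": 12.5, "West": 0.0},
    "Non-partisan": {"Northeast": 0.0,  "Midwest": 8.3,  "South": 0.0,  "West": 0.0},
}
cell_counts = counts_from_percentages(percents, col_n)
row_totals = {label: sum(row.values()) for label, row in cell_counts.items()}
for label, row in cell_counts.items():
    print(f"{label:<13}" + "".join(f"{n:>6}" for n in row.values())
          + f"{row_totals[label]:>8}")
```

The recovered row totals can be compared with the counts in Table 1 as a consistency check between the two tables.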

Pie Charts

A pie chart is a simple way to show the
distribution of a variable that has a relatively small number of values, or
categories. Figure 1 provides in graphic form information similar to what Table 1 (above) presents in tabular form:

Similarly, Figure 2 is analogous to Table 2 and breaks the results down by region:

Exercises

These exercises use the 2008 American National Election Study Subset. Open the codebook describing these data. Start SPSS and open the anes08s.sav file.

1. Prepare a frequency table, pie chart, and bar chart for an economic, social, or foreign policy issue of your choosing (see codebook). Crosstabulate this with several background variables (again, see codebook) that you think might influence a respondent's opinion on this issue. Cautions: 1) avoid background variables like age or income that have a large number of categories; 2) some categories of some background variables contain very few cases, and the results are likely to be unreliable. In another topic, we'll discuss how measures of statistical significance can help you better assess the reliability of findings. In yet another topic, we'll also show you how to modify variables to make them more manageable.

2. In exercise 1 of the “Political
Science as a Social Science” topic, you were asked to come up with
hypotheses that might help explain party identification. Using "partyid3" as the dependent variable, construct contingency
tables to test the following hypotheses, along with any others you can think of:

Men are more
likely than women to identify as Republicans, while women are more likely than men to identify as Democrats.

Southerners
and midwesterners are more likely than others to identify
as Republicans; northeasterners and westerners are more likely than
others to identify as Democrats.

Respondents from union households are more likely than those from nonunion households to identify as Democrats.

Married
respondents are more likely than others to identify as Republicans.

The more regularly people attend religious services, the more likely they are to identify as Republicans.

Whites are
less likely than others to identify as Democrats.

Were the results of these exercises pretty much what you expected, or were there any surprises?

[1] On occasion, crosstabs are used, not to test hypotheses, but for descriptive purposes, and this general rule does not apply. For example, the 2008 American National Election Study Subset (see codebook) includes several variables thought to measure political efficacy (a person's belief that he or she can have an impact on politics). If these variables are valid measures of the same underlying concept, there should be a relationship between answers to one question and those to another, but we are not hypothesizing that either one depends on the other. In trying to see whether this is the case, we may decide to look, for each cell in the table, at the percent of the total table rather than of either the row or column.

[2] A more systematic method for assessing the reliability of
percentages in a crosstab is discussed under the topic of contingency table analysis.