NAME: U.S. Senate Votes on Clinton Removal
TYPE: Census
SIZE: 100 observations (Senators), 10 variables
DESCRIPTIVE ABSTRACT:
For each U.S. Senator, his or her votes on whether to remove President
Clinton on each of the two articles of impeachment (plus a summary
variable representing each Senator's number of "guilty" votes) are
provided, as well as each Senator's values on several variables that
could be predictive of vote (e.g., Senator's degree of conservatism,
how well Clinton did in the Senator's state in the 1996 Presidential
election).
SOURCES:
Senators' votes on removal were obtained from the _USA Today_
website (http://usatoday.com/news/index/clinton/senvote2.htm)
Senators' degree of conservatism was obtained from ratings issued
by the American Conservative Union
(http://www.conservative.org/new_ratings/1997/97senate-preview.htm)
Other information needed to create the variables (Senators' party, when
first elected, when up for re-election, Clinton's percentages in
Senators' states in 1996) can be obtained from numerous political
almanacs and general almanacs put out annually by many news
organizations. This information is also available on several
websites. One that contains most of this information is
http://www.vote-smart.org
VARIABLE DESCRIPTIONS:
Columns
 1 -  8   Name of senator
10 - 11   State (postal code)
13        Vote on Article I, Perjury: 0 = Not Guilty, 1 = Guilty
15        Vote on Article II, Obstruction of Justice: 0 = Not Guilty,
          1 = Guilty
17        Number of votes for guilt (0, 1, or 2)
19        Party: 0 = Democrat, 1 = Republican
21 - 23   Senator's degree of ideological conservatism (0-100)
25 - 26   Percent of the vote Clinton received in the 1996 Presidential
          election in the Senator's state
28 - 31   The year each Senator's seat is up and he/she must run for
          re-election (or retire)
33        First-term senator? 0 = no, 1 = yes
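
The column layout above can be turned into a small reader. The following is
a minimal Python sketch, not part of the original submission. The field names
(vote1, conserv, clint96, and so on) and the read_records helper are
illustrative assumptions; only the column positions come from the list above.
It also assumes every numeric field holds a whole number.

```python
# Column positions from VARIABLE DESCRIPTIONS (1-indexed, inclusive).
# The field names are illustrative labels, not part of the original file.
FIELDS = [
    ("name", 1, 8, str),
    ("state", 10, 11, str),
    ("vote1", 13, 13, int),      # Article I, Perjury
    ("vote2", 15, 15, int),      # Article II, Obstruction of Justice
    ("nguilty", 17, 17, int),    # number of votes for guilt
    ("party", 19, 19, int),      # 0 = Democrat, 1 = Republican
    ("conserv", 21, 23, int),    # conservatism rating, 0-100
    ("clint96", 25, 26, int),    # Clinton's 1996 percent in the state
    ("reelect", 28, 31, int),    # year the seat is up
    ("firstterm", 33, 33, int),  # 0 = no, 1 = yes
]


def parse_record(line):
    """Split one fixed-width data line into a dict keyed by field name."""
    record = {}
    for name, start, end, convert in FIELDS:
        # Convert 1-indexed inclusive columns to a Python slice.
        raw = line[start - 1:end].strip()
        record[name] = convert(raw)
    return record


def read_records(path):
    """Parse every non-blank line of the data file at `path`.

    `path` is whatever name the data file was saved under (an assumption;
    this document does not state a filename).
    """
    with open(path) as handle:
        return [parse_record(line.rstrip("\n")) for line in handle if line.strip()]
```

If the file parses cleanly, read_records should return 100 dictionaries, one
per Senator.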
SPECIAL NOTES:
Name of Senator was limited to eight characters, so some names are cut
off. Also, because multiple Senators often have the same or similar
last names, nicknames were sometimes created to avoid confusion. For
example, there is both a Tim Hutchinson and a Kay Bailey Hutchison; the
former is referred to as "timhutch" and the latter as "kaybhut".
Each Senator's degree of ideological conservatism is based on 1997
voting records as judged by the American Conservative Union (see
SOURCES above), where 100 is most conservative. For Senators who were
first elected in November 1998, I came up with various substitutions to
give them an ideology score, to avoid missing data. Contact me for
details, if interested.
STORY BEHIND THE DATA:
On February 12, 1999, for only the second time in the nation's history,
the U.S. Senate voted on whether to remove a President, based on
impeachment articles passed by the U.S. House. Dozens of political
talk shows featured analyses of why Senators may have voted the way
they did, but such discourse was rarely (if ever) informed by
systematic statistical analysis of the votes. This dataset allows for
such analysis. Further, the magnitude of this event should ensure that
classroom students have some familiarity with it, making the dataset a
nice one for illustrating statistical principles.
PEDAGOGICAL NOTES:
These data can be used to illustrate both advanced and introductory
types of statistical analyses. In terms of advanced techniques, the
main approach would be to use multiple variables to predict Senators'
votes on each of the two counts. Given the dichotomous nature of the
vote variables, you would run a logistic regression for the vote on
each count: one analysis with Article I as the dependent variable and
one with Article II. (Note, however, that a logistic regression for
Article II reveals a "perfect fit" when the conservatism score is used
as one of the predictors.) The "number of guilty votes" is an ordinal
variable (0, 1, or 2) that might be used to illustrate logistic
regression for an ordinal response. Another important concept for any
type of multiple regression technique is multicollinearity: when two or
more predictor variables are highly correlated with each other, the
estimates and tests for individual coefficients can become very
unstable. In the present dataset, political party and the conservatism
rating are highly correlated with each other (r = .906), so it would be
very questionable to use both as predictors in the same regression
equation.
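
Two checks are worth running before fitting the regressions described above.
The Python sketch below is illustrative only and uses no outside libraries.
pearson_r computes the correlation between two predictors, such as party and
the conservatism rating. is_completely_separated tests whether a single
cutoff on one predictor splits the 0 and 1 outcomes with no overlap. That
condition is one common reason a logistic regression reports a "perfect
fit"; whether it is the reason for the Article II result is an assumption to
verify against the data. Both function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def is_completely_separated(predictor, outcome):
    """True if some cutoff on `predictor` splits outcome 0 from outcome 1.

    `outcome` must be coded 0/1. Returns False if either class is empty.
    """
    zeros = [p for p, o in zip(predictor, outcome) if o == 0]
    ones = [p for p, o in zip(predictor, outcome) if o == 1]
    if not zeros or not ones:
        return False
    # No overlap: every 0 lies below every 1, or the reverse.
    return max(zeros) < min(ones) or max(ones) < min(zeros)
```

Applied to the party and conservatism columns, pearson_r should reproduce the
r = .906 quoted above if the data were read correctly.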
As noted above, these data can also be used in teaching introductory
statistics. Two-way cross-tabulations and the chi-square test can be
used for pairs of categorical variables, such as party by the vote on
Article I (Perjury). Relationships between quantitative and categorical
variables can also be illustrated, for example by comparing conservatism
in Democrats versus Republicans. This could be done either graphically,
by plotting the frequency distribution of the quantitative variable
(conservatism) on a common scale separately for each group of the
categorical variable (party), or statistically, with an
independent-samples t-test.
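
For classes that compute these introductory statistics by hand, the
following Python sketch shows the arithmetic. It is an illustration under
stated assumptions, not a prescribed analysis: chi_square_2x2 returns the
Pearson chi-square statistic without a continuity correction, and pooled_t
returns the equal-variance t statistic with its degrees of freedom. Both
function names are hypothetical, and p-values are left to a table or a
statistics package.

```python
import math
from statistics import mean, variance


def chi_square_2x2(table):
    """Pearson chi-square statistic (1 df, no continuity correction).

    `table` is ((a, b), (c, d)): counts for a 2 x 2 cross-tabulation,
    e.g. rows = party, columns = vote on Article I.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def pooled_t(group_a, group_b):
    """Independent-samples t statistic assuming equal variances.

    Returns (t, degrees_of_freedom), e.g. for conservatism scores of
    Republicans versus Democrats.
    """
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / se, df
```

Note that a 2 x 2 table with an empty row or column gives an expected count
of zero, in which case the chi-square statistic is undefined and the function
raises a ZeroDivisionError.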
Finally, broader statistical issues can also be addressed in class
discussions, such as the difference between a sample and a population.
Some might argue that because the entire population of U.S. Senators
was studied, there would be no need for significance tests that use
sample statistics to make inferences for the larger population.
SUBMITTED BY:
Alan Reifman
Department of Human Development and Family Studies
College of Human Sciences
Texas Tech University
Lubbock, TX 79409-1162
AReifman@hs.ttu.edu