A STATISTICAL ASSESSMENT OF BUCHANAN'S VOTE IN PALM BEACH COUNTY
Richard L. Smith
Abstract:
One of the strongest allegations of voting irregularities in the 2000 U.S.
Presidential election was that a faulty ballot design in Palm Beach
County, Florida, caused many voters to cast their ballots for the Reform
Party candidate Patrick Buchanan, when they intended to vote for Al
Gore. We examine this statistically, by performing a regression analysis
of Buchanan's vote over all 67 counties of Florida, using demographic
variables (population size, race, age distribution, education level
and income) and votes for other candidates as covariates. A critical
point of statistical implementation is to choose a suitable transformation
of the response variable to achieve approximate homoscedasticity across
counties. After considering a number of alternatives, a cube root
transformation (of the number of votes cast for Buchanan) is chosen.
Variable selection is performed using either backward selection or the
Mallows C(p) criterion, leading to similar models. The results confirm
that, if the regression model is fitted using all 67 counties, then Palm
Beach is an enormous outlier, with a studentized residual of 13.3, for a
theoretical significance level of around 10^(-67) under the standard
normal-theory assumptions. If the
regression model is fitted to the remaining 66 counties and used to predict
the Buchanan vote in Palm Beach, we obtain a point prediction of 326 and
a 95% prediction interval of (181,534), compared with the actual vote
reported in initial returns of 3,407. These results demonstrate conclusively
that the Palm Beach County vote was indeed anomalous.
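
The leave-one-county-out prediction described above can be sketched in a few lines. This is a minimal illustration, not the paper's SAS or S-PLUS code: the data are synthetic, the single covariate `log_pop`, the planted row index, and the function names `fit_ols` and `predict_held_out` are all assumptions made for the example, and a normal quantile stands in for the exact t quantile. The model is fitted on the cube-root scale, the interval is cubed to return to the vote scale, and the deletion (externally studentized) residual of the held-out row is reported.

```python
import numpy as np
from statistics import NormalDist


def fit_ols(X, y):
    """Least-squares fit: coefficients, residual variance, and (X'X)^-1."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    s2 = float(resid @ resid) / dof
    xtx_inv = np.linalg.inv(X.T @ X)
    return beta, s2, xtx_inv


def predict_held_out(X, votes, i, level=0.95):
    """Fit on every row except i (cube-root scale), then predict row i."""
    keep = np.arange(len(votes)) != i
    beta, s2, xtx_inv = fit_ols(X[keep], np.cbrt(votes[keep]))
    x0 = X[i]
    centre = float(x0 @ beta)
    # Standard error of a new observation at x0.
    se = float(np.sqrt(s2 * (1.0 + x0 @ xtx_inv @ x0)))
    # Assumption: normal quantile used as an approximation to the t quantile.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lo, hi = centre - z * se, centre + z * se
    # Deletion residual: held-out observation against its own prediction.
    t_i = (float(np.cbrt(votes[i])) - centre) / se
    # Cubing is monotone, so it maps the interval back to the vote scale.
    return centre ** 3, (max(lo, 0.0) ** 3, hi ** 3), t_i


# Synthetic stand-in for the 67 counties (NOT the Florida data).
rng = np.random.default_rng(0)
n = 67
log_pop = rng.normal(11.0, 1.2, n)
X = np.column_stack([np.ones(n), log_pop])
votes = (0.5 + 0.6 * log_pop + rng.normal(0.0, 0.4, n)) ** 3

planted = 49            # arbitrary row chosen to carry an artificial anomaly
votes[planted] *= 10.0

point, (low, high), t_stat = predict_held_out(X, votes, planted)
print(f"point prediction {point:.0f}, interval ({low:.0f}, {high:.0f})")
print(f"observed {votes[planted]:.0f}, deletion residual {t_stat:.1f}")
```

Fitting without the suspect county keeps the anomaly from inflating the residual variance or pulling the fitted surface towards itself, which is why the held-out prediction interval is the natural yardstick for the observed count.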
Full paper: postscript - postscript (A4 paper size) - pdf
Other analyses:
Link to Jonathan O'Keeffe's website.
Data sets:
fldat1.txt
Florida elections data set, ready to be read as an S-PLUS data frame.
fldat1.sasdat
Florida elections data set, modified for SAS programs (remove header;
recode Buchanan variable in line 50 as a missing value so
that the variable selection is not distorted by Palm Beach).
fldat2.txt
Explanations of the Florida elections data set.
Some of the programs used in the analysis:
fla.sas
SAS program for variable selection using C(p) and backward selection.
fla.lst
Output of previous SAS program.
fla1.sas
SAS program for detailed model fitting and prediction.
fla1.lst
Output of previous SAS program.
sfns.i
S-PLUS functions to create studentized residuals and DFFITS from a data
object created by 'lm' (run this once, before any other program
requiring these quantities).
fl1.i
S-PLUS program for data plots, model fitting and various outlier
and influence diagnostics. (Warning: This takes some time to run!
Edit the line nsim<-1000, defining the number of simulations, to
something smaller if you just want to try the program out.)
fl3.i
S-PLUS program to create plot of rescaled RSS against transformation
parameter lambda.
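
For readers without S-PLUS, the idea behind a plot of rescaled RSS against lambda can be sketched as follows. This is an illustration under stated assumptions, not a port of fl3.i: it uses the geometric-mean-normalised Box-Cox transformation, which is one common way to make residual sums of squares comparable across lambda, and whether the original program rescales in exactly this way is not confirmed here. The data are synthetic and the function name `rescaled_rss` is hypothetical.

```python
import numpy as np


def rescaled_rss(X, y, lam):
    """RSS of the normalised Box-Cox transform of y regressed on X.

    Dividing by gm**(lam - 1), where gm is the geometric mean of y,
    puts the RSS for different lambda on a comparable scale.
    """
    gm = np.exp(np.mean(np.log(y)))
    if abs(lam) < 1e-12:
        z = gm * np.log(y)          # limiting case lambda -> 0
    else:
        z = (y ** lam - 1.0) / (lam * gm ** (lam - 1.0))
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    return float(resid @ resid)


# Synthetic positive response that is linear on the cube-root scale
# (NOT the Florida data).
rng = np.random.default_rng(1)
n = 67
log_pop = rng.normal(11.0, 1.2, n)
X = np.column_stack([np.ones(n), log_pop])
y = (0.5 + 0.6 * log_pop + rng.normal(0.0, 0.4, n)) ** 3

lams = np.linspace(-1.0, 2.0, 31)
rss = np.array([rescaled_rss(X, y, lam) for lam in lams])
best = float(lams[np.argmin(rss)])
print(f"lambda minimising the rescaled RSS on this grid: {best:.2f}")
```

A lambda near 1/3 on such a curve corresponds to the cube root transformation; lambda = 0 corresponds to the logarithm and lambda = 1 to no transformation.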
Department of Statistics
University of North Carolina, Chapel Hill
rls@email.unc.edu