Friday, March 8, 2013

Election fraud detection in Armenia and in Flanders

Last week, a tweet by the Dutch political scientist Armen Hakhverdian (@hakhverdian) pointed to an interesting blog post by Fredrik M Sjoberg, a Postdoctoral Scholar at Columbia University – The Harriman Institute. It's a guest post on The Monkey Cage dealing with the recent election in Armenia and the (alleged) election fraud. One of the things he did was a very simple test: a $\chi^2$-test based on the assumption that:
"In the absence of manipulation of vote totals the last digit should follow a uniform distribution of 10 percent in each of the 0-9 digit categories"
He did that for the ruling party at the polling station level both for 2012 (no fraud allegations) and 2013 (fraud allegations). The results are summarized by the graph below (copied from the original blog post).

The $\chi^2$-test is significant at the 0.1 percent level for 2013, but non-significant for 2012 ($p$-value = 0.981).
I'm always a bit worried when significance tests are used on population data, especially because the $\chi^2$-test is known to be sensitive to the number of observations $N$. Still, I think it is an interesting take on election fraud. While the absence of a significant effect is no guarantee that no fraud happened, a strong deviation from a uniform distribution would certainly raise eyebrows.
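For reference, here is a sketch of the statistic behind such a last-digit test. The notation is introduced here for illustration and does not come from the original post: $O_d$ is the number of vote totals ending in digit $d$, $N$ is the number of vote totals, and $\hat{p}_d = O_d / N$ is the observed proportion.

$$\chi^2 = \sum_{d=0}^{9} \frac{(O_d - N/10)^2}{N/10} = N \sum_{d=0}^{9} \frac{(\hat{p}_d - 0.1)^2}{0.1}$$

Under the uniform assumption this is compared to a $\chi^2$ distribution with 9 degrees of freedom. The second form makes the sensitivity to $N$ visible: for fixed observed proportions, the statistic grows linearly with the number of observations.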
Recently I reported on this blog about a study I did on the effect voting machines had on the outcome of the municipal elections in Flanders, Belgium (see here (English) and here (Dutch)). Since I have the data, I thought I might as well do a test similar to the one for Armenia, but this time for the Flemish part of Belgium (excluding Brussels), and at the level of the candidate rather than the polling station. I'll admit that the exercise is futile, since nobody claimed that fraud was involved in the Belgian elections, but it turns out that you can do the test with only a few lines of R code, so why not? The result is $\chi^2=7.5103$, which is non-significant with a $p$-value of 0.5841. Simply looking at the bar chart below with the proportions also makes it clear that, w.r.t. the "last digit-criterion", everything looks good.
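The R code used for this post is not shown, so the following is only a minimal Python sketch of how such a last-digit test could be set up, using the standard library alone. The function names and the simulated vote totals are hypothetical and are not the Flemish data. The $p$-value uses the closed form of the $\chi^2$ upper tail for odd degrees of freedom, which covers the 9 degrees of freedom that ten digit categories give.

```python
import math
import random
from collections import Counter


def last_digit_counts(vote_totals):
    """Count how often each digit 0-9 occurs as the last digit of a total."""
    counts = Counter(abs(int(v)) % 10 for v in vote_totals)
    return [counts.get(d, 0) for d in range(10)]


def chi2_uniform(observed):
    """Pearson chi-square statistic against equal expected counts."""
    n = sum(observed)
    expected = n / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


def chi2_sf_odd_df(x, df):
    """Upper-tail probability of a chi-square variable, odd df only."""
    if df < 1 or df % 2 != 1:
        raise ValueError("this closed form only holds for odd df")
    if x <= 0:
        return 1.0
    # Sum of x^i / (1 * 3 * ... * (2i + 1)) for i = 0 .. (df - 3) / 2
    term, total = 1.0, 0.0
    for i in range((df - 1) // 2):
        if i > 0:
            term *= x / (2 * i + 1)
        total += term
    tail = math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * total
    return math.erfc(math.sqrt(x / 2)) + tail


# Simulated (hypothetical) vote totals, only to exercise the functions.
rng = random.Random(42)
totals = [rng.randint(0, 999) for _ in range(5000)]

observed = last_digit_counts(totals)
stat = chi2_uniform(observed)
p_value = chi2_sf_odd_df(stat, df=len(observed) - 1)
print(f"chi2 = {stat:.4f}, p-value = {p_value:.4f}")

# Plugging in the statistic reported in the post gives roughly 0.584.
print(f"{chi2_sf_odd_df(7.5103, 9):.4f}")
```

Working from digit counts rather than raw totals keeps the statistic and the tail probability separate, so the same two functions can be reused for any set of vote totals.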

So the good news is that, in contrast to Armenia, Flanders has no explaining to do w.r.t. the "last digit-criterion". That said, I'm still surprised that the study on the "touchscreen effect" described in a previous post has drawn more attention abroad than in Flanders, and more from statisticians and data scientists than from political scientists and journalists.