Showing posts from April, 2010

Daily Mail-o-matic

A new Daily Mail headline every time you click the button.

Election Special: for a limited time only, every headline is about Nick Clegg.


Posted via web from danbrewer's posterous

General election 2010: poll of polls

The Guardian has a graphic that combines the results from all the polls and smooths them with a three-day moving average. Look at that massive upswing of about 10% for the Liberal Democrats after the TV debate - amazing. I didn't think that the debates would have any noticeable effect. And why on earth is one of the debates on Sky? Murdoch-kissing, methinks.
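For anyone curious what a three-day moving average does to noisy daily polls, here is a minimal Python sketch. It assumes a trailing window (each point averages that day and the two before it); I don't know whether the Guardian's graphic uses a trailing or a centred window, and the poll numbers below are made up for illustration.

```python
def moving_average(values, window=3):
    """Trailing moving average; the first window-1 points get None."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for i in range(len(values)):
        if i + 1 < window:
            # Not enough history yet to fill the window.
            smoothed.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            smoothed.append(sum(chunk) / window)
    return smoothed


# Hypothetical daily poll shares (%), not real polling data.
lib_dem_share = [20, 21, 19, 29, 30, 31]
print(moving_average(lib_dem_share))
# [None, None, 20.0, 23.0, 26.0, 30.0]
```

Note how the jump from 19 to 29 is spread over three days in the smoothed series, so a sudden upswing shows up as a steep ramp rather than a step.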

Statistical Problems to Document and to Avoid

Some links to statistical guidelines:

" is a shell script designed to build a multiboot CD image containing many different Linux distributions and/or utilities."

Jeroen Ooms’s ggplot2 web interface – a new version released (V0.2)

Election map and swingometer

Interesting interactive plot of how different constituencies will change depending on the swing in votes (to a maximum of 10%). Even with a 10% swing in any direction, there is no chance that my MP will be from a different party to the current one.
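The arithmetic behind a swingometer can be sketched in a few lines of Python. This assumes the usual convention that a swing of s points from one party to another subtracts s from the first party's share and adds s to the second's; I don't know exactly how the interactive map models it, and the seat below is hypothetical, not my actual constituency.

```python
def apply_swing(shares, from_party, to_party, swing):
    """Return new vote shares after a swing (in percentage points)."""
    new_shares = dict(shares)  # leave the input untouched
    new_shares[from_party] -= swing
    new_shares[to_party] += swing
    return new_shares


def winner(shares):
    """Party with the largest vote share."""
    return max(shares, key=shares.get)


# Hypothetical safe seat: vote shares in percent, invented for illustration.
seat = {"Con": 48.0, "Lab": 25.0, "LD": 20.0}

# A 10-point swing from Con to Lab narrows the gap to 38 vs 35.
print(winner(apply_swing(seat, "Con", "Lab", 10.0)))  # Con
```

In this made-up seat the majority is 23 points, so it takes a swing of more than 11.5 points to change hands, which is why a tool capped at 10% never flips it.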

General election 2010: the 10 key datasets to help you decide

From the Guardian:

With the general election 2010 campaign well and truly underway, it's easy for the key facts to get lost in a barrage of propaganda, counter-accusations and obfuscation, as Labour, the Conservatives and Liberal Democrats battle for the key marginals.

1. Where's your constituency? Every seat listed

This election will see hundreds of new constituencies as boundary changes come into effect. These figures, compiled by the Press Association, identify every new constituency, and the votes needed to win it.

2. How broken is Britain? Civic pride by local authority

How much pride do you have in your area? These numbers measure community involvement on every level.

3. How much does the government spend?

With cutting public spending a key election issue, these are the most comprehensive figures, department by department.

4. Which departments will be cut?

Straight after the budget, government departments announced sometimes swingeing cuts. Find out which ones are the losers.

5. What did my MP claim in expenses?

Full list of every MP, complete with constituency IDs and expenses claims. See how they add up.

6. How bad is inequality in the UK?

Have things got better or worse under Labour? See what the data says.

7. How strong is the BNP in my constituency?

BNP membership, seat by seat.

8. How much do public sector workers earn?

For most, it's not very much. Get the full data and see for yourself.

9. Can my newspaper's support win the election?

Find out which newspapers supported which parties - and what it did to the outcome.

10. How big is the deficit really?

See how it got this big and what could bring it down.


Revolutions: Scientists misusing Statistics

In ScienceNews this month, there's a controversial article exposing the fact that results claimed to be "statistically significant" in scientific articles aren't always what they're cracked up to be. The article -- titled "Odds Are, It's Wrong" -- is interesting, but I take a bit of an issue with the sub-headline, "Science fails to face the shortcomings of Statistics". As it happens, the examples in the article are mostly cases of scientists behaving badly and abusing statistical techniques and results:

  • Authors abusing P-values to conflate statistical significance with practical significance. For example, a drug may uncritically be described as "significantly" reducing the risk of some outcome, but the actual scale of the statistically significant difference is so small that it has no real clinical implication.
  • Not accounting for multiple-comparisons bias. By definition, a test "significant at the 95% level" has a 5% chance of flagging a difference through random chance alone when no real effect exists. Do enough tests, and you'll find that some do indicate significant differences -- but there will be some fluke events in that batch. There are so many studies, experiments and tests being done today (oftentimes, all in the same paper) that the "false discovery rate" may be higher than we think -- especially given that most nonsignificant results go unreported.
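
To put a number on the multiple-comparisons point, here is a rough Python sketch. It assumes the tests are independent and that no real effect exists in any of them, which is a simplification; the Bonferroni correction shown is just one standard remedy, not something the article itself prescribes.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the overall rate near alpha."""
    return alpha / n_tests


alpha = 0.05
n_tests = 20

# With 20 independent tests at the 5% level, a fluke is more likely than not.
print(round(familywise_error_rate(alpha, n_tests), 2))  # 0.64

# Each test would need to clear a much stricter bar to compensate.
print(bonferroni_threshold(alpha, n_tests))
```

So a paper reporting 20 comparisons with no correction has roughly a two-in-three chance of containing at least one "significant" fluke, even if nothing real is going on.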

Statisticians, in general, are aware of these problems and have offered solutions: there's a vast field of literature on multiple comparisons tests, reporting bias, and alternatives (such as Bayesian methods) to P-value tests. But more often than not, these "arcane" issues (which are actually part of any statistical training) go ignored in scientific journals. You don't need to be a cynic to understand the motives of the authors for doing so -- hey, a publication is a publication, right? -- but the cooperation of the peer reviewers and editorial boards is disturbing.

ScienceNews: Odds Are, It's Wrong

Hadley Wickham gives a short course on graphics with R

Revolutions: Hadley Wickham gives a short course on graphics with R: "Hadley Wickham (the creator of the popular ggplot2 graphics package for R) has posted video of a 2-hour short course on Visualisation in R at his channel"

Ordnance Survey launches free downloadable maps

Wow! I hope that this report that OS map data will be free from now on isn't some sort of April Fools' joke. It seems to be reported by all the major outlets, though. Amazing news if it is true.