Posts

Showing posts from 2010

Facebook's Social Network Graph

Facebook engineering intern Paul Butler uses R to visualise the friendship connections on Facebook.

Posted via email from danbrewer's posterous

Vitamin D

Interesting. I've just started taking an 800 IU vitamin D supplement on the recommendation of a colleague.

Evolution of the two-party vote during past century

Tig: text-mode interface for git

Tig is a git repository browser that additionally can act as a pager for output from various git commands.

When browsing repositories, it uses the underlying git commands to present the user with various views, such as summarized revision log and showing the commit with the log message, diffstat, and the diff.

This is exactly what I was looking for.

Alternative To The "200 Lines Kernel Patch That Does Wonders" Which You Can Use Right Away

Phoronix recently published an article about a ~200-line Linux kernel patch that improves responsiveness under system strain. Well, Lennart Poettering, a Red Hat developer, replied to Linus Torvalds on a mailing list with an alternative to this patch that does the same thing, yet all you have to do is run 2 commands and paste 4 lines into your ~/.bashrc file. I know it sounds unbelievable, but apparently someone even ran some tests which prove that Lennart's solution works.

Use Vim As A Syntax Highlighting Pager | Ubuntu Tutorials

If you would like to set up your pager to highlight text in pretty colours then you can set up vim to do the job.

Just add the following line to .bashrc:
alias vless='vim -u /usr/share/vim/vimcurrent/macros/less.vim'
From Ubuntu tutorials

UK's first dedicated prostate cancer virtual biobank launched

Sunday 7 November 2010

National Cancer Research Institute Press Release

The first virtual biobank dedicated to prostate cancer research has been launched today by the National Cancer Research Institute (NCRI).

The UK Prostate Cancer Sample Collection Database will house details of around 10,000 biological samples taken from men in the UK with and without prostate cancer.

The virtual biobank will also hold other materials that will be useful for research, including DNA, RNA, blood and urine, increasing the total to around 100,000 samples.

Data on prostate cancer risk, how cancers have responded to treatment and the molecular make-up of the cancers will be anonymised and available for scientists to use in collaborative studies.

The database has been developed by The ProMPT (Prostate cancer Mechanisms of Progression and Treatment) Collaborative and the Southern Prostate Cancer Collaborative, which are funded by NCRI partners.

Additional sample databases from prostate researchers outside the Collaboratives are being added to make the virtual biobank an essential resource for all scientists working to turn prostate cancer discoveries in the lab into better treatments for patients.

Dr Hayley Whitaker, a prostate cancer researcher from Cancer Research UK’s Cambridge Research Institute who helped develop the database, said: “One of the biggest challenges in prostate cancer treatment is identifying the men with aggressive prostate cancers who should be treated, as opposed to others with non-aggressive tumours who could be monitored.

“This new biobank holds both clinical and molecular data, which could help us find a marker to help doctors make this difficult decision.”

Biobank lead developer Dr Daniel Brewer from The Institute of Cancer Research said: “Until now, UK prostate cancer scientists have generally been limited to conducting research on patient samples they could acquire themselves or through collaborations they forged themselves. This biobank will help improve scientists’ access to precious samples and hence increase the accuracy of results and make new discoveries more likely.”

Dr Jane Cope, director of the NCRI, said: “This is a really important resource for prostate cancer researchers. The new database will help ensure that scientists are able to make best use of samples donated by patients, avoiding waste and speeding up progress in understanding the disease and improving treatment.”

Format and clean your data with Google Refine

Looks like "Google Refine" could be very useful for sorting out messy data.

Osborne will escape public wrath if Labour lets him win the blame game | Jonathan Freedland | Comment is free | The Guardian

If Labour's spending was so wildly out of control, why did the Tories promise to match their plans, pound for pound, all the way until November 2008? Why didn't Osborne and Cameron howl in protest at the time?

Could it be because things were not actually that bad? A quick look at the figures confirms that, until the crash hit in September 2008, the levels of red ink were manageably low. The budget of 2007 estimated Britain's structural deficit – that chunk of the debt that won't be mopped up by growth – at 3% of gross domestic product. At the time, the revered Institute for Fiscal Studies accepted that two-thirds of that sum comprised borrowing for investment, leaving a black hole of just 1% of GDP. If the structural deficit today has rocketed close to 8%, all that proves is that most of it was racked up dealing with the banking crisis and subsequent slump – with only a fraction the result of supposed Labour profligacy. After all, even the Tories would have had to pay out unemployment benefit.

Interesting article from Jonathan Freedland. He argues that the Government is blaming Labour for the mess so that it does not get blamed itself, when in fact the banking crisis was the main culprit.

"it is a question of fact that we entered this financial crisis with low inflation, low interest rates, low unemployment and the lowest net debt of any large G7 country" - Ed Balls

The true size of Africa

BIG.
"Online maps that we use for directions use the Mercator projection, and this tends to dictate how we perceive the size of countries and continents. If you look at the world map on Google, for example, Africa doesn't look that much bigger compared to China or the United States. In reality though, it's a lot bigger. Kai Krause scales countries by their area in square kilometers and then fits them into Africa's borders for some perspective." - http://flowingdata.com

CycleStreets

This website is amazing for any of you cyclists out there. It is a journey planner that is optimised for cyclists. It picks up on bridleways etc. and even gives you a choice of the fastest or quietest routes (somehow it knows whether a road is busy or not). Not only that, it gives you a trace of the hills and tries to minimise the amount of uphill riding. Brilliant stuff, powered by OpenStreetMap. It has revealed a slightly different route home that could be quieter but is the same distance. I am going to try it tonight and see how it goes.

There is also an iPhone app, which is reviewed at the Guardian.

How to Free Up a Lot of Disk Space on Ubuntu Linux by Deleting Cached Package Files

This is a useful article about how to get some space back on a Linux box by clearing out the cached package files:

sudo apt-get clean

Clinical trial confirms effectiveness of simple appetite control method

"Has the long-sought magic potion in society's "battle with the bulge" finally arrived? An appetite-control agent that requires no prescription, has no common side effects, and costs almost nothing? Scientists today reported results of a new clinical trial confirming that just two 8-ounce glasses of the stuff, taken before meals, enables people to shed pounds. The weight-loss elixir, they told the 240th National Meeting of the American Chemical Society (ACS), is ordinary water."

Not really surprising, is it?

We are running out of helium - New Scientist

According to a Nobel prize winner, we will run out of helium in 25 years. That is shocking; I never realised that helium wasn't produced in abundance by some industrial process. Apparently, there is no chemical means of making helium, and the supplies we have come from radioactive alpha decay in rocks.

An illustration of the Brewer Clan

Our friend Sarah did a marvellous illustration of us at the weekend. I think she captures the chaotic, joyous nature of my clan really well. I really love her stuff and she is now really in demand, which is great. Check out her blog and her books ... Morris is genius and was done with the "Purple Ronnie" chap.

Mr.doob | Cluttered Desk - Box2D Stress Test


I absolutely love this. I could watch it for hours. Loads of other good stuff on this site too.

Weight loss


Well, it was just over a year ago that I started my latest weight loss drive and I have managed to lose just over two stone. Not too bad, and the trend still seems to be downward, which is encouraging despite the number of treats I have. I think that cycling every day, even in the winter, has made the biggest difference, as well as cutting down on the number of flapjacks I consume at work. That said, I am still officially overweight and would like to lose a bit more blubber. By posting this I am hoping that I can keep focussed on it.

Painless way to install a new version of R? - Stack Overflow

"My preferred method on Windows (upgrading 2.10.1 to 2.11.0):
  • Install R-2.11.0
  • Copy R-2.10.1/library/* to R-2.11.0/library/
  • Answer "no" to the prompts asking you if it is okay to overwrite.
  • Start R 2.11.0 and then type update.packages()"

  • Or update.packages(checkBuilt=TRUE) – Marek, Apr 22 at 15:38
  • Or update.packages(checkBuilt=TRUE, ask=FALSE) :-P – gd047, Apr 22 at 16:20

BBC 6music might be saved!

BBC Trust - Lyons sets out initial conclusions on future direction of the BBC: "The Trust concludes that, as things stand, the case has not been made for the closure of 6 Music."

Whoop!

Jaroslav Stark obituary | The Guardian

This is my PhD supervisor's obituary. Tragic loss.

Posted via web from danbrewer's posterous

Uber detailed London map satire


Love this, could waste a lot of time cruising around. Go here for the full map: http://www.bl.uk/magnificentmaps/map4.html

Lifehacker Pack 2010: Our List of Essential Windows Downloads

Use SQL queries to manipulate data frames in R with sqldf package

Just a reminder that this exists and looks useful.

Interrupting R processes in Ubuntu

Interrupting R processes in Ubuntu: "
  • Pressing Ctrl+C should work in the terminal.
  • If that doesn't work, open another terminal console and type
    ps aux | grep R

    kill -s INT PID
  • The first line allows you to discover the PID number of your particular R instance, which is then used in the second line.
  • In the second line above, INT may be replaced by HUP in some cases.
  • The above methods set up an alert-thing to tell the program to stop. When the computations are done externally of R, it can't be executed before the external code checks back with R. If the external code doesn't do this regularly or at all, killing the entire program is the only way out. If none of the above work, this is probably what has happened and it may be a good idea to let the package author know about the problem..."
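
The escalation in the quoted steps can be sketched as a small shell function. This is only a sketch and not part of the original post: the name `stop_process`, the one-second pauses and the final KILL step (standing in for "killing the entire program") are my own assumptions.

```shell
# Hypothetical helper: try INT first, then HUP, then KILL as a last resort.
# Usage: stop_process PID   (find the PID with: ps aux | grep R)
stop_process() {
  pid=$1
  for sig in INT HUP KILL; do
    # kill -0 sends no signal; it only checks that the PID still exists.
    kill -0 "$pid" 2>/dev/null || return 0
    echo "sending SIG$sig to $pid"
    kill -s "$sig" "$pid" 2>/dev/null
    sleep 1   # give the process a moment to react before escalating
  done
}
```

Note that KILL cannot be caught, so R gets no chance to tidy up; it is worth giving INT and HUP time to work first.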

Darryl Cunningham Investigates: The Facts In The Case Of Dr. Andrew Wakefield

This is a marvellous comic about the whole MMR scandal.

Droopy Creates Instant Servers for Large File Trading

KiTTY Adds Session Saving, Portability, and More to PuTTY - Windows - Lifehacker

The sweet smell of success

Ajaxload - Ajax loading gif generator

IE App Compat VHD

IE App Compat VHD: "VPC Hard Disk Images for testing websites with different Internet Explorer versions on Windows XP and Windows Vista"

I am going to use these images to test IE6 on a website I am producing for work. Apparently, these should work fine on the free VirtualBox.

If you are running Windows, http://spoon.net/browsers/ looks like a good way forward.

The Tell-Tale Beat

You fancy me mad. Could a madman have outsmarted the greatest electronica/techno artists of our era? Next to fall will be Roderick Usher's house/trance band.

g3data - extracting data from graphs in images

Remember, kids: always label your axes.

Daily Mail-o-matic

A new Daily Mail headline every time you click the button.

Election Special: for a limited time only, every headline is about Nick Clegg.

COULD NICK CLEGG DESTROY YOUR MORTGAGE?

General election 2010: poll of polls



The Guardian has a graphic that combines the results from all the polls and smooths them with a three-day moving average. Look at that massive upswing of about 10% for the Liberal Democrats after the TV debate - amazing. I didn't think that the debates would have any noticeable effect. And why on earth is one of the debates on Sky? Murdoch kissing methinks.
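
The smoothing itself is easy to reproduce. As a minimal sketch (the poll figures are made up for illustration, and `moving_avg3` is a hypothetical helper name, not anything the Guardian published), a trailing three-day moving average can be computed with awk:

```shell
# Trailing three-day moving average: reads one number per line on stdin and,
# from the third line onwards, prints the mean of the latest three values.
moving_avg3() {
  awk '{ v[NR] = $1 }
       NR >= 3 { printf "%.1f\n", (v[NR] + v[NR-1] + v[NR-2]) / 3 }'
}

# Made-up daily poll shares (%) for one party, oldest first.
printf '%s\n' 20 23 29 32 | moving_avg3
```

With those four made-up values it prints 24.0 and 28.0, so a single day's jump is spread over the following days rather than showing up all at once.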

Statistical Problems to Document and to Avoid

Some links to statistical guidelines:

multicd.sh

"multicd.sh is a shell script designed to build a multiboot CD image containing many different Linux distributions and/or utilities."

Jeroen Ooms’s ggplot2 web interface – a new version released (V0.2)

Election map and swingometer | guardian.co.uk

Interesting interactive plot of how different constituencies will change depending on the swing in votes (to a maximum of 10%). With a 10% swing in any direction there is no chance that my MP will be from a different party to the current one.

General election 2010: the 10 key datasets to help you decide

From the Guardian:
"

With the general election 2010 campaign well and truly underway, it's easy for the key facts to get lost in a barrage of propaganda, counter-accusations and obfuscation, as Labour, the Conservatives and Liberal Democrats battle for the key marginals.


1. Where's your constituency? Every seat listed

This election will see hundreds of new constituencies as boundary changes come into effect. These figures, compiled by the Press Association, identify every new constituency, and the votes needed to win it.


2. How broken is Britain? Civic pride by local authority

How much pride do you have in your area? These numbers measure community involvement on every level.

3. How much does the government spend?

With cutting public spending a key election issue, these are the most comprehensive figures, department by department.

4. Which departments will be cut?

Straight after the budget, government departments announced sometimes swingeing cuts. Find out which ones are the losers.


5. What did my MP claim in expenses?

Full list of every MP, complete with constituency IDs and expenses claims. See how they add up.

6. How bad is inequality in the UK?

Have things got better or worse under Labour? See what the data says.

7. How strong is the BNP in my constituency?

BNP membership, seat by seat.

8. How much do public sector workers earn?

For most, it's not very much. Get the full data and see for yourself.

9. Can my newspaper support win the election?

Find out which newspapers supported which parties - and what it did to the outcome.

10. How big is the deficit really?

See how it got this big and what could bring it down.

"

Revolutions: Scientists misusing Statistics

In ScienceNews this month, there's a controversial article exposing the fact that results claimed to be "statistically significant" in scientific articles aren't always what they're cracked up to be. The article -- titled "Odds Are, It's Wrong" -- is interesting, but I take a bit of an issue with the sub-headline, "Science fails to face the shortcomings of Statistics". As it happens, the examples in the article are mostly cases of scientists behaving badly and abusing statistical techniques and results:

  • Authors abusing P-values to conflate statistical significance with practical significance. For example, a drug may uncritically be described as "significantly" reducing the risk of some outcome, but the actual scale of the statistically significant difference is so small that it has no real clinical implication.
  • Not accounting for multiple comparisons biases. By definition, a test "significant at the 95% level" has 5% chance of having occurred by random chance alone. Do enough tests, and you'll find some indeed indicate significant differences -- but there will be some fluke events in that batch. There are so many studies, experiments and tests being done today (oftentimes, all in the same paper) that the "false discovery rate" may be higher than we think -- especially given that most nonsignificant results go unreported.
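
To put a number on the multiple comparisons point, here is a rough sketch of my own (not from the article). It assumes the tests are independent and that there is no real effect in any of them; `familywise_error` is a hypothetical helper name. The chance of at least one fluke "significant" result among n tests at level alpha is then 1 - (1 - alpha)^n:

```shell
# Chance of at least one false positive among n independent tests,
# each run at significance level alpha, when no real effect exists:
#   1 - (1 - alpha)^n
familywise_error() {
  awk -v alpha="$1" -v n="$2" 'BEGIN { printf "%.4f\n", 1 - (1 - alpha) ^ n }'
}

# How quickly the chance of a fluke grows with the number of tests.
for n in 1 5 20 100; do
  printf '%3d tests: %s\n' "$n" "$(familywise_error 0.05 "$n")"
done
```

Under those assumptions, 20 tests at the 5% level already give roughly a 64% chance of at least one fluke result.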

Statisticians, in general, are aware of these problems and have offered solutions: there's a vast field of literature on multiple comparisons tests, reporting bias, and alternatives (such as Bayesian methods) to P-value tests. But more often than not, these "arcane" issues (which are actually part of any statistical training) go ignored in scientific journals. You don't need to be a cynic to understand the motives of the authors for doing so -- hey, a publication is a publication, right? -- but the cooperation of the peer reviewers and editorial boards is disturbing.

ScienceNews: Odds Are, It's Wrong

Hadley Wickham gives a short course on graphics with R

Revolutions: Hadley Wickham gives a short course on graphics with R: "Hadley Wickham (the creator of the popular ggplot2 graphics package for R) has posted video of a 2-hour short course on Visualisation in R at his blip.tv channel"

Ordnance Survey launches free downloadable maps

Wow! I hope that this report that OS map data will be free from now on isn't some sort of April Fools' joke. It seems to be reported by all the major outlets though. Amazing news if it is true.

ffe (Flat file extractor) - Tool for parsing flat and CSV files and converting them to different formats | Ubuntu Geek

Just to remind me that this tool exists.

Scott Pilgrim vs. The World Trailer

Oh yeah. This one looks like it's going to be cool. From Edgar Wright.

64-bit RMySQL on Mac OS X 10.6

There are currently problems with the pre-compiled binary of RMySQL for the 64-bit version of R on Macs. To get round this you have to download the source and run the following command:

R CMD INSTALL --configure-args='--with-mysql-dir=/usr/local/mysql --with-mysql-inc=/usr/local/mysql/include --with-mysql-lib=/usr/local/mysql/lib' RMySQL_0.7-4.tar.gz

This is based on a post at Andrew's musings

OK Go - This Too Shall Pass

Another super fine video from OK Go. Check it out.

Visualising the deficit ahead of the budget | guardian.co.uk

Nice plot done by the Guardian, showing what a poor state the UK's finances are in. Interesting that the net debt as a percentage of GDP has been rising since 2002, so we were going in the wrong direction even before the recession.

Tune MySQL like a pro with MySQLTuner

This looks like a marvellous script that helps you optimise your MySQL server easily, without any knowledge whatsoever - just what I want.

Using dropbox as a git repository

Git on Dropbox

Rather than creating a repository and working copy in the Dropbox directory, this time I wanted to create a Git bare repository in Dropbox. Based on this excellent article, I was able to accomplish this in a matter of minutes. Here are the steps:

  1. I already had a Git repository in ~/Documents/livemesh/myproject. Before doing anything, I ensured everything was committed.
  2. cd ~/Dropbox
  3. git clone --bare ~/Documents/livemesh/myproject myproject
    (this created a bare repository in ~/Dropbox/myproject)
  4. cd ~/dev
  5. git clone ~/Dropbox/myproject myproject
    (this made my Git working copy in ~/dev/myproject)

Now I can do my day-to-day work in ~/dev/myproject. After committing any new edits, I can type git push to send my changes to Dropbox. On the other computer, I can receive changes by typing git pull.
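
The whole round trip can be tried out safely in throwaway directories. The following is a minimal sketch, not the quoted author's script: `demo_dropbox_git`, the helper `g`, the temporary paths and the demo identity are all hypothetical stand-ins, with "$root/dropbox" playing the part of ~/Dropbox and "$root/dev" and "$root/other" playing the two computers.

```shell
# Sketch of the bare-repository workflow using temporary directories.
demo_dropbox_git() {
  root=$(mktemp -d) || return 1
  # Pass an identity per command so the sketch leaves global git config alone.
  g() { git -c user.name="Demo" -c user.email="demo@example.com" "$@"; }

  # Step 1: an existing repository with everything committed.
  mkdir -p "$root/original/myproject" "$root/dropbox" "$root/dev" "$root/other"
  (cd "$root/original/myproject" && g init -q && echo "hello" > notes.txt &&
     g add notes.txt && g commit -q -m "first commit") || return 1

  # Steps 2-3: a bare clone inside the synced folder.
  g clone -q --bare "$root/original/myproject" "$root/dropbox/myproject" || return 1

  # Steps 4-5: a working copy on each "computer".
  g clone -q "$root/dropbox/myproject" "$root/dev/myproject" || return 1
  g clone -q "$root/dropbox/myproject" "$root/other/myproject" || return 1

  # Day-to-day work: commit on the first machine, then push to the bare repo.
  (cd "$root/dev/myproject" && echo "more" >> notes.txt &&
     g commit -q -am "second edit" && g push -q origin HEAD) || return 1

  # On the other machine: pull, then print the subject of the latest commit.
  (cd "$root/other/myproject" && g pull -q --ff-only && g log -1 --format=%s)
  rc=$?
  rm -rf "$root"
  return $rc
}
```

Calling `demo_dropbox_git` should print the subject of the commit made on the first "computer", which shows that the push and pull both went through the bare repository.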

So far Git has been far easier than I imagined, therefore I am kicking myself for not learning it sooner. Since I’m rambling, I’ll point you to git-osx-installer, which makes Git installation trivial on OSX.

Trending Topics on Wikipedia



Someone has looked at the page views on Wikipedia and come up with a summary where you can see the current most popular topics and the fastest risers in popularity. Additionally, you can enter any topic and see how its popularity has varied. Nice.

Official Google Reader Blog: And now for something completely different

I’m happy to announce an experimental product from the Google Reader team that makes the best stuff in Reader more accessible for everyone, while giving Reader users a new way to view their feeds. It’s called Google Reader Play, and it’s a new way to browse interesting stuff on the web that’s easy to use and personalized to the things you like. Best of all, there’s no set-up required: visit google.com/reader/play to give it a try.

Another interesting way to waste time created by the good folks at Google.

New Google Tool Visualizes Public Data in Animated Charts

Google has just launched Google Public Data Explorer. The new Google Labs tool offers a visual way to look at and analyze large public data sets on a variety of popular search topics.

The tool is specifically designed for avid data crunchers like students, journalists, policy makers, and could be seen as Google’s prettified approach to a user-driven computational search engine (think Wolfram Alpha). Public Data Explorer is its own dedicated utility that expands and improves upon existing functionality added to the search experience last year.

Wow, this is amazing.

Infographic: Famous Movie Quotes

How to Setup Your Own Web Proxy Server For Free with Google App Engine

Interesting idea to use Google App Engine to host a proxy. The only thing is that these proxies do not seem that advanced, so no chance of getting Hulu yet.

Comparing: BBC Radio 1 to BBC 6 Music - Compare My Radio

BBC Radio 1 played 1,040 unique tracks

BBC 6 Music played 3,193 unique tracks

The stations share 170 unique tracks

BBC Radio 1 shares 5% of its playlist with BBC 6 Music

BBC 6 Music shares 5% of its playlist with BBC Radio 1

BBC Radio 1 and BBC 6 Music have 4% of their combined unique playlist which is the same (which is 170 unique tracks)

Good site that compares the playlists of various stations. Makes 6 Music look pretty unique.

Dan Bull - Dear Auntie [an open letter to the BBC about 6 music]

BBC 'to axe radio stations and halve website' in strategic review | Media | The Guardian

The BBC plans to axe two radio stations – 6 Music and Asian Network – cut spending on imported shows and halve the size of its website, it is claimed today. The Times says the measures are part of the BBC's strategic review to be unveiled next month. Under the plan, the BBC intends to shrink overall services and focus more on quality over quantity. There have already been reports suggesting that the BBC will axe the digital radio stations 6 Music and Asian Network.

Arrrrrrrrrggggggghhhhhhhhh. Don't do it. This may be a rumour but it is such bad news it makes me want to scream. What radio will I listen to if 6 music goes?




Andrew Collins's take.

How to Find & Replace Data in MySQL

To find a string in a certain field and replace it with another string:

update [table_name] set [field_name] = replace([field_name],'[string_to_find]','[string_to_replace]');

Photosynthesis uses quantum interactions to harvest light

By some measures, the photosynthetic process is one of the more efficient energy transactions in nature. Scientists have taken an interest in figuring out how it works at the atomic level, as some research had suggested that quantum mechanics might be at work when the system was examined at low temperatures. A new experimental setup using photosynthetic proteins shows that, when they are stimulated with light, they interact on a quantum level: their states are dependent on one another, which allows them to transmit energy efficiently.

Charlie Brooker - How To Report The News

Head of the league

Whoop! I am top of my work's fantasy football league. That's a surprise but pleasing.

An open letter to Apple from the fanboys | BitterWallet

eigenclass - A better backup system based on Git

A better backup system based on Git

A fast, powerful backup system built upon Git and efficient, compact tools written in OCaml (faster than the C counterpart with 1/5th of the code :)

UPDATE (2008-03-31) gibak 0.3.0 released

Recent events have pushed me to get serious about backing up my data. I'm naturally inclined to use simple solutions over specialized backup systems, preferring something like rsync to a special-purpose tool. As far as "standard" tools go, however, git provides a very nice infrastructure that can be used to build your own system, to wit:

  • it is more space-efficient than most incremental backup schemes, since it does file compression and both textual *and* binary deltas (in particular, it's better than solutions relying on hardlinks or incremental backups à la tar/cpio)
  • its transport mechanism is more efficient than rsync's
  • it is fast: recovering your data is *faster* than cp -a
  • you keep the full revision history
  • powerful toolset with a rich vocabulary

Interesting

Google Docs: Download

The GDD Python script backs up your entire archive of Google documents and can be put in a cron job so that this happens automatically at prescribed times.

Make your own David Cameron poster

Brilliant stuff.

Found via Jake

xkcd - Dirty Harry meets Rain Man

First-Person Tetris

AHhhh, you got to love it.

The Holocaust We Will Not See

The Holocaust We Will Not See: "Avatar, James Cameron’s blockbusting 3-D film, is both profoundly silly and profound. It’s profound because, like most films about aliens, it is a metaphor for contact between different human cultures. But in this case the metaphor is conscious and precise: this is the story of European engagement with the native peoples of the Americas. It’s profoundly silly because engineering a happy ending demands a plot so stupid and predictable that it rips the heart out of the film. The fate of the native Americans is much closer to the story told in another new film, The Road, in which a remnant population flees in terror as it is hunted to extinction." By George Monbiot

BBC News - Reporter breaks an 'unbreakable' mobile phone at CES

A census of amplified and overexpressed human cancer genes : Nature Reviews Cancer

This paper, on which I am an author, was published on 24th December 2009: "Integrated genome-wide screens of DNA copy number and gene expression in human cancers have accelerated the rate of discovery of amplified and overexpressed genes. However, the biological importance of most of the genes identified in such studies remains unclear. In this Analysis, we propose a weight-of-evidence based classification system for identifying individual genes in amplified regions that are selected for during tumour development. In a census of the published literature we have identified 77 genes for which there is good evidence of involvement in the development of human cancer."

I spent a huge amount of time on this one but I am still pretty low on the author pecking order. This is for a super high impact journal, "Nature Reviews Cancer" which has a 2008 ISI Impact Factor of 30.762.

Also check out the associated website which I put together just before Christmas. Hopefully, this work will get further funding and get regularly updated.