This is a really well written article about common analytical problems in science. I'm not sure conscious p-hacking occurs as much as the author insists, and I would have expected them to mention multiple testing corrections. I really don't like the idea of using a large number of different methodologies and taking the aggregate of them as the result. http://fivethirtyeight.com/features/science-isnt-broken
The evolutionary history of lethal metastatic prostate cancer - press releases
Daniel Brewer from Norwich Medical School at UEA and The Genome Analysis Centre (TGAC) said: “Prostate cancer becomes lethal once it spreads from the prostate and this study has given us a unique insight into the way this occurs.” “Contrary to the established simple view of one ‘seed’ of cancer moving from the prostate to other organs, we have firmly established that the process is much more complex and dynamic. Cancer cells are continuously interchanged between all organs where cancer has been established and new sites are established both from the prostate and other organs. This step will help scientists develop new approaches in tackling this widespread disease.”
The evolutionary history of lethal metastatic prostate cancer
One of our papers is out today and it's in Nature. It is a genomic study that sequences samples from multiple sites to which the cancer has spread in prostate cancer patients. From this we have been able to elucidate the genetic evolution of the cancer as it spreads from the prostate. Pretty amazing stuff. http://dx.doi.org/10.1038/nature14347
Some unusual science for the day: Using modern computer science to understand just where Rock n' Roll came from. Normally, people who talk about the history of music break it down into genres which have a lot to do with marketing, country of origin, and so on, and talk about individual bands as historical influences on each other. But these boundaries can be quite arbitrary: for example, "gospel" and "rock" are considered very far apart, but if you go back to the 1950s, rock was so influenced by gospel that it was hard to tell them apart at times.
So these researchers tried something else. They analyzed the songs which topped the US charts from 1960 to 2010, about 17,000 in all. For each song, they examined features not of the marketing around the music, but of the music itself: instrumentation, chord changes, timbre, types of harmony. They then used a technique called "k-means" to find the natural clusters into which the songs fell by these measures, and found thirteen natural groupings. To understand these groupings better, they adapted a technique from molecular genetics which is used to understand the functions of genes: they took song tags from last.fm, and did a mathematical analysis to see which song tags were most strongly associated with each cluster. (For example, if one cluster had songs tagged "R&B" far more often than the other clusters did, it's a good sign that this tag describes the cluster.)
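To make the clustering step concrete, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is only an illustration, not the researchers' code: the two-number "feature vectors" are invented toy data, the function name `kmeans` is my own, and starting from the first k points is a simplification (real implementations usually use random restarts or k-means++).

```python
import math

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm. Returns (centroids, labels).

    Assumption: the first k points are used as the starting centroids,
    which keeps this toy example deterministic.
    """
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its old centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Invented toy "songs", each described by two made-up musical features.
songs = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.2), (4.9, 5.0)]
centroids, labels = kmeans(songs, k=2)
print(labels)
```

The real study used far more features per song and thirteen clusters, but the loop is the same idea: assign each song to its nearest cluster centre, move the centres, and repeat until nothing changes.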
They came up with 13 clusters -- what you might call the "purely musical" genres of the music, since they're based entirely on the songs' musical qualities, not on the politics or marketing around them. These ranged from cluster #2 (hip hop / rap / gangsta rap / old school) to #9 (classic rock / country / rock / singer-songwriter) to #8 (dance / new wave / pop / electronic).
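The tag-association step can be sketched in the same spirit. I don't know which statistic the authors actually used; the sketch below assumes a hypergeometric over-representation test, which is a common choice for gene-set enrichment in molecular genetics. The function name and the counts are hypothetical.

```python
from math import comb

def enrichment_p_value(total, tagged, cluster_size, tagged_in_cluster):
    """Chance of seeing at least `tagged_in_cluster` tagged songs in a
    cluster of `cluster_size` songs drawn at random from `total` songs,
    of which `tagged` carry the tag (hypergeometric upper tail)."""
    upper = min(tagged, cluster_size)
    favourable = sum(
        comb(tagged, x) * comb(total - tagged, cluster_size - x)
        for x in range(tagged_in_cluster, upper + 1)
    )
    return favourable / comb(total, cluster_size)

# Made-up numbers: 20 songs overall, 5 tagged "R&B",
# and a cluster of 5 songs that contains 4 of the tagged ones.
p = enrichment_p_value(total=20, tagged=5, cluster_size=5, tagged_in_cluster=4)
print(p)
```

A small p-value says the tag turns up in the cluster much more often than chance would predict, which is the sense in which a tag "describes" a cluster. With thirteen clusters and many tags you would also want a multiple testing correction on top of this.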
The image you see a bit of below is the history of the popularity of these genres over time, with 1960 at the bottom of the graph and 2010 at the top. You can see the sudden rise of rap (leftmost column), the gradual vanishing of jazz and the blues from the charts (the dwindling figure center-right), and the coming and going of hard rock (the dark blue bubbly thing at the center).
Interestingly, they have answered one important historical question, about the significance of the British Invasion: apparently no, this was not the key catalyst of the revolution in American music; the revolution was already well underway before the Beatles arrived in 1964. (Which shouldn't really surprise people too much, given that this is where rock came from.)
If you look at the bottom of the image, you'll notice a tree structure which the summary on the arXiv blog doesn't talk about; you'll have to read the article itself (http://arxiv.org/pdf/1502.05417v1.pdf) for that. It's basically a genetic tree of these genres of music. This is constructed using the same techniques of "genetic relatedness" which are used to create modern evolutionary trees of species, only instead of being based on DNA snippets, they're based on those underlying musical features like chord changes which were the basis of the clustering. So you can see (for example) that hip-hop comes from a completely different ancestry than all the other observed genres, while pairs like country and classic rock are close relatives.
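For a feel of how such a tree can be built, here is a minimal sketch of average-linkage agglomerative clustering (the idea behind UPGMA, one of the simpler tree-building methods from phylogenetics). I am assuming this method purely for illustration; the paper may build its tree differently. The genre coordinates are invented and the function name is hypothetical.

```python
import math
from itertools import combinations

def average_linkage_tree(named_points):
    """Repeatedly merge the two closest groups and return the result
    as nested tuples, e.g. ("a", ("b", "c"))."""
    # Each group is (subtree, list of member points).
    groups = [(name, [pt]) for name, pt in named_points.items()]

    def avg_dist(a, b):
        # Mean distance over all pairs of members, one from each group.
        pairs = [(p, q) for p in a[1] for q in b[1]]
        return sum(math.dist(p, q) for p, q in pairs) / len(pairs)

    while len(groups) > 1:
        i, j = min(
            combinations(range(len(groups)), 2),
            key=lambda ij: avg_dist(groups[ij[0]], groups[ij[1]]),
        )
        merged = ((groups[i][0], groups[j][0]), groups[i][1] + groups[j][1])
        groups = [g for n, g in enumerate(groups) if n not in (i, j)] + [merged]
    return groups[0][0]

# Invented 2-D positions standing in for each genre's average musical features.
genres = {
    "country": (1.0, 1.0),
    "classic rock": (1.2, 0.9),
    "dance": (3.0, 3.5),
    "hip hop": (9.0, 8.0),
}
tree = average_linkage_tree(genres)
print(tree)
```

With these made-up positions, country and classic rock pair up first and hip hop joins last, which mirrors the shape described above: close relatives merge early, distant ancestries merge late.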
Why is this interesting? Apart from the obvious fun of studying music history using the methods of molecular biology, it shows the ways in which these techniques can be used to describe a whole host of things. To make this work, you need three things: a large sample of items to classify (here, songs); for each item, a large collection of features to measure (a few hundred at least; in this case, things like chord changes and instrumentation); and, if you want to be able to describe the function of these features, functional labels (here, song tags) for at least a good collection of the items you want to classify. Then you can do a "genetic analysis," grouping them into families, observing family trees, and (if you have additional data, like the year of release in this case) understanding things like the evolution of these groups over time or space.
What's marvelous is that you can do this sort of analysis with all sorts of things. Do it on news articles, with the features being words, and you'll discover that they cluster into stories, which in turn cluster into subjects. (Why? Because you'll see, say, a bunch of stories with the word "Brezhnev" which also include references to the USSR, and these come and go over time, and at later times start to also include stories about "Andropov," "Chernenko," and "Gorbachev." Depending on how finely you slice these, you can either see the life of a politician, or the history of the Soviet Union.) Do it on a city's road network, with features involving the number of cars on each chunk of the road at a given time, and you'll discover... well, I'm not sure what you'll discover. I don't know if anyone's ever done that analysis. But you could do it and find out.
Here's my quote from the UEA one: Co-author Daniel Brewer, from Norwich Medical School and The Genome Analysis Centre (TGAC) at Norwich Research Park, said: “This study has sequenced the whole genetic sequences of multiple samples from the prostate for the first time - both from tumours and apparently normal tissue.”
“Surprisingly there were a large number of abnormal genetic changes found in the normal prostate tissue, suggesting that the prostate as a whole is a hot bed of genetic instability and is primed and ready for tumours to develop. This gives us important clues to how prostate cancer develops and has potential consequences to how it is treated.” https://www.uea.ac.uk/mac/comm/media/press/2015/mar/colin-cooper-prostate-cells
Analysis of the genetic phylogeny of multifocal prostate cancer identifies multiple independent clonal expansions in neoplastic and morphologically normal prostate tissue
This bit of work has been my main focus for a long time and it is really great that it has finally been published in Nature Genetics. This study has, for the first time, sequenced the whole genomes of multiple samples from the prostate, both from tumours and apparently normal tissue. Surprisingly, a large number of abnormal genetic changes were found in the normal prostate tissue, suggesting that the prostate as a whole is a hotbed of genetic instability and is primed and ready for tumours to develop. This gives us important clues to how prostate cancer develops and has potential consequences for how it is treated. http://www.nature.com/ng/journal/vaop/ncurrent/full/ng.3221.html
Mutation detection in formalin-fixed prostate cancer biopsies taken at the time of diagnosis using next-generation DNA sequencing - Manson-Bahr et al. - Journal of Clinical Pathology
This is a methodology paper from us that describes a new technique for obtaining DNA from prostate cancer biopsy tissue stored as FFPE (formalin-fixed, paraffin-embedded) material. We go on to show that the quantity and quality of the DNA are good enough for targeted next-generation sequencing to be performed reliably. This is important because it shows that targeted sequencing can be used as a test in the clinic without changes to the pathology processing that is currently performed in hospitals. http://jcp.bmj.com/content/early/2015/01/13/jclinpath-2014-202754.abstract