Cancer gene sequencing effort struggles through waves of false IDs

Muscle proteins and smell receptors show up in some putative cancer gene lists.

With the development of DNA sequencing centers that are capable of churning out multiple genomes in a week, many scientists saw a resource that they could turn against cancer. By sequencing a person's healthy cells and comparing those results to the sequence of their cancer cells, it would be possible to map all the genetic changes that drive cancers. Within the list of genes, there might also be hints for future therapies.

As the cancer genomes have rolled in, however, reality hasn't kept pace with the promise. As the number of cancer genomes sequenced has risen, the number of genes identified has continued to grow. And as noted by the authors of a paper released by Nature over the weekend, some of the genes are overwhelmingly unlikely to have anything to do with cancer. So a huge team of researchers set out to find out why and to fix the problem.

Although some cancers are caused by viruses, the majority of cases are caused by mutations that alter or disable the genes that normally control a cell's growth. Many of these have been identified over the years: some that are common to many cancers, others that are specific to just a few. Until recently, there was no way to be sure we had a complete catalog of the genes involved, or knew which ones were important in which cancers. Genome sequencing gave us the chance to develop a complete catalog.

Which mutations are relevant?

The challenge of this approach is that cancer cells carry a lot of mutations. They are constantly adapting to the body's (and doctors') attempts to kill them and mutations are the raw materials for that. As part of their transformations, they also tend to disable the genes that stop cells from dividing if they carry DNA damage. Both of these factors tend to mean that cancer cells have an increased rate of mutations. But these mutations are indiscriminate; they hit irrelevant genes with the same frequency as they hit genes important for cancer's origin and spread.

So, the people doing cancer genomics faced a challenge in trying to weed out the irrelevant mutations and focus on the significant ones. For the most part, they were failing.

To illustrate the problem, the authors of the new paper took their own set of normal and cancerous samples from 178 patients with lung cancer. The standard computer analysis used to identify mutations pulled out 450 genes that were mutated at a higher frequency in the cancers, even after accounting for the size of each gene. That's a lot. And some of them were clearly irrelevant to cancer. Nearly a quarter of the 450 genes encoded odorant receptors, which underlie your sense of smell but aren't expressed much beyond the nerves of the nasal lining. Other nerve-specific genes were on the list, as were a few that play a structural role in muscles.
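To see how an irrelevant gene can clear the statistical bar, consider a toy calculation (all numbers here are hypothetical; this is not the paper's actual pipeline). A naive analysis compares each gene's count of mutated samples against an expectation built from the genome-wide average mutation rate, so a very large gene sitting in a mutation-prone region looks significant on size and local rate alone:

```python
import math

def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))

n_tumors = 178          # patients, as in the lung cancer set above
genome_avg = 1e-6       # assumed genome-wide mutations per base per tumor
gene_len = 100_000      # a very large (titin-sized) gene; length is made up

# Chance that one tumor carries at least one mutation in the gene,
# if the genome-wide average rate applied everywhere:
p_hit = 1 - (1 - genome_avg) ** gene_len          # ~0.095

# Suppose the gene actually sits in a region mutating at 5x the average
# for reasons (late replication, low expression) unrelated to cancer:
observed = round((1 - (1 - 5 * genome_avg) ** gene_len) * n_tumors)

# Naive significance test against the genome-wide background:
print(binom_sf(observed, n_tumors, p_hit))
```

With roughly 70 of 178 tumors mutated against a genome-average expectation of about 17, the p-value comes out vanishingly small, and the gene lands on the candidate list even though no selection is acting on it.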

Other types of cancer had similarly large lists filled with genes that were probably irrelevant. And a scan of the published literature revealed that many of these had already been reported as associated with cancer.

Why are so many labs being led astray? To sort things out, the authors obtained a large collection of genomes from 27 different types of cancer, and started doing comparisons among them. The first thing they noticed is that different cancer types varied in the frequency of mutations by factors of up to 1,000. Lung cancers and melanomas were at the high end, with rates up to and exceeding one mutation every 10,000 bases. That's likely because these cancers are largely caused by known mutagens—cigarette smoke and UV light, respectively.

Those mutagens are also fairly specific about how they damage the DNA (for example, UV light tends to damage DNA when two Ts are next to each other), so they tended to favor a specific spectrum of mutations. The same was true in some other types of cancer, which suggests they might have a common environmental cause.

In addition to the type and frequency of mutations, there were other variables. Mutation rates could vary greatly among individuals with cancer, so that lung cancers from two different patients might show very different rates. And different areas of the genome were more or less prone to mutation. Active genes seem to be resistant to mutation, possibly because they reside on a section of the chromosome that's accessible to the DNA repair machinery. Areas that were the last to be copied when a cell divides, in contrast, were more likely to pick up mutations.

Overall, the authors conclude that earlier studies were going wrong because they compared mutations in a gene to the average mutation rate in the genome. Instead, all these other factors—type of cancer, type of mutation, the patient's mutation rate, and the region of the genome—need to be taken into account as well. Being the helpful sorts, they even wrote a program, MutSigCV, that does so. (And made it freely available for noncommercial use.)
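In spirit, the fix looks like the sketch below (hypothetical numbers and a deliberately simplified model; the authors' real program accounts for far more). Instead of scoring every gene against the genome-wide average, each gene is scored against a background rate estimated from its own covariates, such as replication timing and expression level:

```python
import math

def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))

def gene_pvalue(mutated_tumors, n_tumors, gene_len, bg_rate):
    """Score a gene's mutated-sample count against a background rate."""
    p_hit = 1 - (1 - bg_rate) ** gene_len   # P(>=1 mutation in one tumor)
    return binom_sf(mutated_tumors, n_tumors, p_hit)

# A hypothetical large gene mutated in 70 of 178 tumors.
n_tumors, gene_len, observed = 178, 100_000, 70

# Against the genome-wide average rate, it looks like a strong driver.
naive = gene_pvalue(observed, n_tumors, gene_len, bg_rate=1e-6)

# Against a covariate-adjusted rate (say this region replicates late and
# is weakly expressed, so its neutral rate is ~5x the average), nothing
# unusual remains to explain.
adjusted = gene_pvalue(observed, n_tumors, gene_len, bg_rate=5e-6)

print(naive, adjusted)
```

Seventy mutated tumors out of 178 are astronomically unlikely under the genome-average background but almost exactly what the local rate predicts, so the gene drops off the candidate list.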

When MutSigCV was turned loose on the original data, the list of interesting genes dropped from 450 to just 11. In all probability, the same thing would happen to other data sets subjected to the same analysis; not everything (or not every gene) causes cancer.

This is a great success story, but it's a bit of a silver lining in a dark cloud. Ten of the 11 genes identified were already known to be involved in cancer, and the 11th is involved in the immune response, which helps keep cancer in check. So it's not clear that we're getting much in the way of new answers out of a large and expensive project.


46 Reader Comments

This is effectively a non-result, but those can be just as valuable as "real results" if they prevent a lot of wasted resources. On the other hand, cheap sequencing combined with the authors' filter just might pop out an interesting gene or two in time.

In my experience it's often more interesting to use the sequencing 'spin-off' technologies like ChIP-seq, RNA-seq, and DNase-seq to get a more directed, functional insight into the mechanisms of cancer. Although it's of course still pretty cool to use these genome-wide association-type studies and see an actual relevant new target pop up.

This is effectively a non-result, but those can be just as valuable as "real results" if they prevent a lot of wasted resources. On the other hand, cheap sequencing combined with the authors' filter just might pop out an interesting gene or two in time.

Hopefully this work will also limit the amount of chasing down blind alleys, looking at genes/mutations that aren't, at the end of the day, likely to be viable therapeutic targets.

It appears that even conventional wisdom has figured out that behavior is as much a factor as growth in distinguishing cancer cells from normal cells. Actin, the protein at the center of a cell's ability to move, is identical to the major muscle protein, also actin, at 351 out of 374 positions in its amino acid sequence. So it is highly likely that proteins controlling the activity of actin do play some role in some cancers.

The reality of cellular biochemistry is clearly that the proteins and nucleic acids form a very complex system of interacting molecules. The roles of individual molecules can't be understood in isolation from the system. The system is particularly complex because it is the result of muddling through by trial and error rather than some intentional design that might be grasped through principles.

There is nothing new about large-scale cancer research expecting simple answers and failing to find them. A large amount of progress is being made understanding a cell's active molecules. But it is probably going to take a relatively complete understanding of their interactions, and computer models of how the system evolves in time, to understand where it is vulnerable to the kinds of breakdown that cause cancer.

Well, the article does state that unimportant genes had been identified as related to cancer in the literature. Can you imagine how unsuccessful a treatment based on those genes would be? At the least, the targets are refined a bit for drug development. It's a bit more of the incremental march against cancer, if not a panacea.

I wouldn't be too pessimistic about the future of this type of research; after all, the number of targets and cancer types is finite, and sooner or later we'll know all we need to find the best possible treatments. Whether we get there by high-throughput studies and bioinformatics, or through testing based on knowledge of the biology, isn't so important, as long as we keep moving forward.

The question becomes: were the 10 genes known to cause cancer known about because they had been reported as such from among the 450 (along with the others that do not actually cause cancer)?

My guess is that the 10 genes were known before the screen identifying the 450. Whether or not prior knowledge of the 450 candidate genes has aided the validation of 10 of them is irrelevant for the algorithm in question.

I wouldn't be too pessimistic about the future of this type of research; after all, the number of targets and cancer types is finite, and sooner or later we'll know all we need to find the best possible treatments. Whether we get there by high-throughput studies and bioinformatics, or through testing based on knowledge of the biology, isn't so important, as long as we keep moving forward.

Let me get this straight. They looked at the frequency of mutations, which they couldn't correlate to cancer, and then they wrote a program to take the data and tell us what we already knew, and somehow this is news?

Essentially all of this data is based on upregulation, which has huge floor effects (they are measuring very small changes in very small signals and then normalizing so it looks like a real change). Tons of stuff is upregulated based on cell growth and homeostasis.

I don't personally think that Big Data (with its attendant huge false positive rates) will ever solve this problem; there is just too much noise. My hope is that the matching of systems biology and Bayesian analysis with complex experimentation could eventually elucidate the redundant pathways/virus modifications/mutations that lead to cancer.

You can take any suitably complex algorithm and train it to tell you something you already know. This doesn't mean that the algorithm has any value.

I don't personally think that Big Data (with its attendant huge false positive rates) will ever solve this problem; there is just too much noise. My hope is that the matching of systems biology and Bayesian analysis with complex experimentation could eventually elucidate the redundant pathways/virus modifications/mutations that lead to cancer.

You can take any suitably complex algorithm and train it to tell you something you already know. This doesn't mean that the algorithm has any value.

This was not training an algorithm to tell us what we already know, it was developing an algorithm to filter out the noise.

Quote:

So, it's not clear that we're getting much in the way of new answers out of a large and expensive project.

The project excluded 439 suspect genes, confirmed 10, and identified 1 more. That sounds like a success. The project also generated a tool that ruled out 97.6% of false positives in this dataset and shows promise in others. This is far from a failure; this is where real science happens. The headline may be less sexy than discovering "The Cancer Gene" but this kind of analysis is what leads to a greater understanding of the actual causes of cancers and better tests.

Quote:

The same was true in some other types of cancer, which suggests they might have a common environmental cause.

Discovering patterns of formerly unknown mutagenic behavior is another huge potential benefit. This could lead to identifying formerly unknown carcinogens and linking them to types of cancer. Finding carcinogens by monitoring mutations in DNA instead of looking at population surveys could completely change research into environmental causes of cancer and may explain cancer clusters. If a single case of cancer could be identified by its triggering carcinogen this could have dramatic impact on environmental regulations.


This was not training an algorithm to tell us what we already know, it was developing an algorithm to filter out the noise.

You are, of course, entitled to your opinion. What I see looks like a big regression model with information that may or may not be coded reasonably that gave the investigators an answer they expected. Did it work well in this situation? Maybe. I think calling it a tool is very premature, whether they wrote software for it or not.

They wrote an algorithm that told them what they already knew; you have no idea how much the algorithm is overfitting the data unless you actually test it on a new set of genes (which is probably difficult, given that they likely used the existing data to train the algorithm). Just because the algorithm got these 10 "right" doesn't mean it excluded 98% of false positives; it means that for the data set they tested, the regression factors they selected were predictive of this outcome.

Just to reiterate, my personal scientific opinion is that these pathways are A) highly redundant, and B) highly complex. Without some sort of Bayesian network model that is experimentally verified, I just don't think that "Big Data" techniques like random forests can solve this type of problem; there is just too much noise and too much spurious correlation.

Just to reiterate, my personal scientific opinion is that these pathways are A) highly redundant, and B) highly complex. Without some sort of Bayesian network model that is experimentally verified, I just don't think that "Big Data" techniques like random forests can solve this type of problem; there is just too much noise and too much spurious correlation.

I agree with you that devising a model that basically tells us what we already knew, that those 11 genes are cancer causing while the rest aren't, isn't in itself an accomplishment. However, if this system could perform similar magic on other data sets it would indeed be a great accomplishment.

As for your comments about "too much noise", how on earth would you possibly know that? Have you somehow quantified the noise? Once gene sequencing becomes cheap, and there is no reason to think that it won't, we will potentially be able to sequence billions of healthy genomes and billions of tumours. That's a lot of statistical resolving power. Add to that the work of the type these authors are doing to eliminate false correlations and I don't see why it couldn't work. As for your "spurious correlations", every correlation has a cause. You figure that cause out and account for it in your models and move on. It just takes time and understanding.

This was not training an algorithm to tell us what we already know, it was developing an algorithm to filter out the noise.

You are, of course, entitled to your opinion. What I see looks like a big regression model with information that may or may not be coded reasonably that gave the investigators an answer they expected. Did it work well in this situation? Maybe. I think calling it a tool is very premature, whether they wrote software for it or not.

They wrote an algorithm that told them what they already knew; you have no idea how much the algorithm is overfitting the data unless you actually test it on a new set of genes (which is probably difficult, given that they likely used the existing data to train the algorithm). Just because the algorithm got these 10 "right" doesn't mean it excluded 98% of false positives; it means that for the data set they tested, the regression factors they selected were predictive of this outcome.

Just to reiterate, my personal scientific opinion is that these pathways are A) highly redundant, and B) highly complex. Without some sort of Bayesian network model that is experimentally verified, I just don't think that "Big Data" techniques like random forests can solve this type of problem; there is just too much noise and too much spurious correlation.

From what I understand, all they did was test the assumption that, to find statistically significant changes in frequency, you should use a more appropriate baseline; i.e., for cells that are more prone to mutations, the genome average will highlight more than what you are looking for. So I don't think overfitting is really a problem. It is not as if they were working towards getting just those ten and the eleventh was their contribution.

There is another whole dimension (maybe two) of difficulty in interpreting these studies. The old, simplistic understanding of cancer posited it as a monoclonal (identical) expansion of a single cancer cell. This is what people are expecting when they analyze "the cancer genome". But when you realize that cancer cells mutate fairly quickly, and that thousands to millions of generations have passed before the tumor is visible, you realize that it is a rare cancer that behaves as a clone. Leukemias fit this category most. Many solid tumors are the product of rounds of selection and escape, and are a genetic cloud of varied sequences. There is no one single cancer genotype!

And many of the mutations have significance for the growth, escape, or altered behavior of the cancer cell even if they don't qualify as "driving" the cancer. Selection has been acting on these cells all along. So one person's noise is another person's surprisingly interesting connection.

All this information is badly mashed together by current sequencing approaches that start with a hunk of tumor. You only see the average genotype. We need single-cell sequencing to recognize all the variation, and an open mind to interpret the meaning of mutations that have accumulated despite not fitting our preconceptions of purpose.

I don't think the information in these studies was useless. Cancer causing mutations which do not affect the spread of cancer is still information about how these diseases work.

There is no "bad" knowledge when trying to understand how cancers cause genetic mutations, imo. This is the beginning of an exciting area of research. And I wouldn't yet rule out that this research could eventually lead to effective treatments.

Thank you on a number of levels. Mr_fnord was modded highly for stating that the algorithm was not modelled on what we already know (and provided no proof to substantiate his claim). Yet you, who "challenged an unfounded statement," were modded very lowly. Yet you persisted in the face of the populist ignorance that I continue to see here at Ars. A great quote from your link is: "As a doctoral student in a genetics program that is highly reliant on funding that results from GWA studies, you would think that this type of article would have made a bigger splash than it did. The fact that it was more or less ignored reminds me of Thomas Kuhn and of how clearly he saw that the path to scientific truth is often blocked by entrenched dogmas which are rooted in nothing more substantial than the personal reputations (and money) of the established scientific community." That quote was backed up by a person who stated they were a "senior Professor of Genetics".

I'll be modded to hell but the rating system at Ars is reminding me how much like Digg and Reddit we are. People often vote from populist ignorance rather than fact.

I'll give you a tip: people who comment on the rating system in a genetics article are going to get downvoted, and rightfully so. Enjoy. As for souldonut, his original post rated poorly because it started out with an inflammatory tone. His other, more evidence-based posts were not downvoted. The ratings here actually work well: if you give an evidence-based post in a humble, non-insulting manner, you will do fine in the voting even if your idea is unpopular. It also helps to qualify opinion as opinion.

So, this will help us find chemicals that cause cancer in the short term, and problem sites in the genome. GOOD. A cure for cancer will then, in the far-away future (a 20-40+ year span), take the form of a gene therapy that fixes these problem sites and improves the body's natural repair function.

What bugs me is that this will not lead to something that heals cancer patients in the short term. Random mutations with lots of causes in a large pool of genes lead to massive amounts of false positives, long R&D times, and a lot of dead ends. You basically need to map out all possible mutation pathways, the probability that they take each one, and how to change and/or block it, for every mutation/cell division. A good thing to do, and the best long-term way of expanding humanity's knowledge base.

However, we do know that cancer cells have two important properties. First, they change the genes that control the replication of the cell (cell division); the uncontrolled replication causes a greater amount of mutation and destroys the normal function of that cell. Second, they cause the cell to do something that is bad for the body (this is usually what differs between benign and cancerous tumors).

This leads to the obvious question: why not create a virus that attacks cells and, if the cell has all the normal replication/cell division genes, does nothing; if not, kills it? We know a lot of these genes. Could we test the normal healthy cells in a patient, then make a virus for the 450 or so genes involved (or a subset) and release it in the patient?

Or to put it a bit more simply: why not make a virus that detects healthy cells instead of mutated cells and destroys all the rest? Isn't this a simpler approach than trying to figure out all the millions of mutation possibilities?

Thank you on a number of levels. Mr_fnord was modded highly for stating that the algorithm was not modelled on what we already know (and provided no proof to substantiate his claim). Yet you, who "challenged an unfounded statement," were modded very lowly. Yet you persisted in the face of the populist ignorance that I continue to see here at Ars.

"The list includes many implausible genes (such as those encoding olfactory receptors and the muscle protein titin), suggesting extensive false-positive findings that overshadow true driver events. We show that this problem stems largely from mutational heterogeneity and provide a novel analytical methodology, MutSigCV, for resolving the problem. We apply MutSigCV to exome sequences from 3,083 tumour–normal pairs and discover extraordinary variation in mutation frequency and spectrum within cancer types, which sheds light on mutational processes and disease aetiology, and in mutation frequency across the genome, which is strongly correlated with DNA replication timing and also with transcriptional activity. By incorporating mutational heterogeneity into the analyses, MutSigCV is able to eliminate most of the apparent artefactual findings and enable the identification of genes truly associated with cancer."

The f1000 blog post basically says the same thing as this article, that almost all of the hundreds of 'cancer genes' that have been found to have a statistical link to cancer are most likely spurious. How does that contradict research that seeks to separate causal and spurious links?

This is effectively a non-result, but those can be just as valuable as "real results" if they prevent a lot of wasted resources. On the other hand, cheap sequencing combined with the authors' filter just might pop out an interesting gene or two in time.

I think it is better than a non-result, because the end results were NOT expected: that the new filtering techniques were apparently extremely good at filtering out irrelevant mutations, and that apparently they hadn't been missing a huge number of important mutations before, which is important information. Ending up with just 11 genes, each of which was already known to be truly involved in cancer or in preventing it, is quite a surprising result, and "surprisal" is a good measure of information content.

As I recall from school, most mutations are deadly - that is, the descendants do not survive. This may be old information or a gross generalization. But it would be interesting to investigate whether the cancer cells are somehow only performing a limited type of mutations to ensure the survival of the cancer cell. It seems strange that they can increase the rate of mutation and thrive rather than die out.

As I recall from school, most mutations are deadly - that is, the descendants do not survive. This may be old information or a gross generalization. But it would be interesting to investigate whether the cancer cells are somehow only performing a limited type of mutations to ensure the survival of the cancer cell. It seems strange that they can increase the rate of mutation and thrive rather than die out.

I was thinking the same thing. I think maybe the answer is that mutated cells survive much better inside a functioning body, which is a much friendlier environment than the cold, hard world outside, and somatic mutations are probably much less lethal in the long run because they aren't used to construct an entire new organism (they aren't reproductive mutations).

As I recall from school, most mutations are deadly - that is, the descendants do not survive. This may be old information or a gross generalization. But it would be interesting to investigate whether the cancer cells are somehow only performing a limited type of mutations to ensure the survival of the cancer cell.

Most mutations are neutral, or have small effects. Also, cancer turns off mechanisms that kill metabolically weird cells, so a mutation that is fatal in a cell with intact apoptotic mechanisms may be viable in a cancer cell.

They might evolve toward higher mutation rates in some areas of the genome (some bacteria seem to react to stress that way) but I have no idea if there's any evidence for that.

Quote:

It seems strange that they can increase the rate of mutation and thrive rather than die out.

Cancer cells also generally turn off apoptotic mechanisms.

Assuming effectively immortal cancer progenitor cells, they can spit out as many non-viable cells as they like as long as some of these turn out to be viable.

As I recall from school, most mutations are deadly - that is, the descendants do not survive. This may be old information or a gross generalization. But it would be interesting to investigate whether the cancer cells are somehow only performing a limited type of mutations to ensure the survival of the cancer cell. It seems strange that they can increase the rate of mutation and thrive rather than die out.

To follow on to S. Carton's response, cancer cells also suddenly have tons of expendable genetic material whose purpose was to coordinate the originating cell type's cooperation with other cells and tissues to keep you alive. And in fact many cells do mutate enough to die off in tumors--it's just that there are far more actively dividing that can take their place quickly.

So, this will help us find chemicals that cause cancer in the short term, and problem sites in the genome. GOOD. A cure for cancer will then, in the far-away future (a 20-40+ year span), take the form of a gene therapy that fixes these problem sites and improves the body's natural repair function.

What bugs me is that this will not lead to something that heals cancer patients in the short term. Random mutations with lots of causes in a large pool of genes lead to massive amounts of false positives, long R&D times, and a lot of dead ends. You basically need to map out all possible mutation pathways, the probability that they take each one, and how to change and/or block it, for every mutation/cell division. A good thing to do, and the best long-term way of expanding humanity's knowledge base.

However, we do know that cancer cells have two important properties. First, they change the genes that control the replication of the cell (cell division); the uncontrolled replication causes a greater amount of mutation and destroys the normal function of that cell. Second, they cause the cell to do something that is bad for the body (this is usually what differs between benign and cancerous tumors).

This leads to the obvious question: why not create a virus that attacks cells and, if the cell has all the normal replication/cell division genes, does nothing; if not, kills it? We know a lot of these genes. Could we test the normal healthy cells in a patient, then make a virus for the 450 or so genes involved (or a subset) and release it in the patient?

Or to put it a bit more simply: why not make a virus that detects healthy cells instead of mutated cells and destroys all the rest? Isn't this a simpler approach than trying to figure out all the millions of mutation possibilities?

Can someone more knowledgeable tell me why?

Edit: some spelling mistakes =)

In short: Biology is complicated.

We are not at the point where we can design proteins that bind to what we want, let alone coupling that binding to an activity of some sort. The R&D necessary to do this would be at least as long, if not longer, than the study you are currently concerned won't have any short-term benefit.

Cancer is a diverse and complex set of diseases. There are no simple short-term solutions.

This is effectively a non-result, but those can be just as valuable as "real results" if they prevent a lot of wasted resources. On the other hand, cheap sequencing combined with the authors' filter just might pop out an interesting gene or two in time.

I think it is better than a non-result, because the end results were NOT expected: that the new filtering techniques were apparently extremely good at filtering out irrelevant mutations, and that apparently they hadn't been missing a huge number of important mutations before, which is important information. Ending up with just 11 genes, each of which was already known to be truly involved in cancer or in preventing it, is quite a surprising result, and "surprisal" is a good measure of information content.

I was trying not to use non-result in a positive way, and sometimes the non-result can be very surprising (I recall a study that said cholesterol-lowering drugs seem to lower heart attack risk even if the person doesn't have high cholesterol). The problem with non-results is not that they contain any less information, but that there is a publication bias against them. We need to be more welcoming of negative results.

In my experience it's often more interesting to use the sequencing "spin-off" technologies like ChIP-seq, RNA-seq, and DNase-seq to get a more directed, functional insight into the mechanisms of cancer. Although it's of course still pretty cool to use these genome-wide association-type studies and see an actual relevant new target pop up.

In the context of discovery in well-defined model systems these techniques are indeed very interesting. Unfortunately, they are also prone to technical issues that influence the results (e.g. quality of antibodies, batch effects introduced by different processing labs, initial quality of tissue samples, etc.). In large-scale sequencing projects this can have a huge effect, which makes the analysis of the data very hard. DNA sequencing is comparatively straightforward and less sensitive to these confounders.

That's one reason why we don't (yet) see, e.g., RNA-seq used in clinical studies: microarrays are simply much more standardized, and the analysis is better established.

John, this is a great explanation for the layman of perhaps the most significant paper yet from the cancer genome project. Keep up the good work!

Will these scientists' honest investigations into the spuriousness of most of their own prior findings lead to the defunding of their project? Fortunately, such talented scientists should have little problem redirecting their efforts toward more productive lines of inquiry.

It seems to me that, even though the initial results may not have provided definitive value in terms of addressing the original question, the data itself could still be very valuable. Genes that cause cancer are one thing, but there are other mutations of relevance as well that might not be related to cell replication, angiogenesis, or metastasis.

For instance, some cancer cells mutate "cell pumps" that allow them to pump chemotherapy out of the cell and thus make them resistant to treatment. Identification of these mutations could lead to treatments that suppress that protection.

Immune response - although mentioned - is another one that could stand much more investigation.

On top of all of this, it should be remembered that oncology in its current state (gene investigation etc.) is still in relative infancy. It would be mad to assume that we know all of the vectors available to cancer, and thus this data - considered from a "pure research" point of view - could yield results that we don't even know to look for.

As I recall from school, most mutations are deadly; that is, the descendants do not survive. This may be old information or a gross generalization, but it would be interesting to investigate whether cancer cells are somehow performing only a limited set of mutations to ensure their own survival.

Most mutations are neutral, or have small effects. Also, cancer turns off mechanisms that kill metabolically weird cells, so a mutation that is fatal in a cell with intact apoptotic mechanisms may be viable in a cancer cell.

They might evolve toward higher mutation rates in some areas of the genome (some bacteria seem to react to stress that way) but I have no idea if there's any evidence for that.

Quote:

It seems strange that they can increase the rate of mutation and thrive rather than die out.

Cancer cells also generally turn off apoptotic mechanisms.

Assuming effectively immortal cancer progenitor cells, they can spit out as many non-viable cells as they like as long as some of these turn out to be viable.
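The arithmetic behind that point can be sketched as a toy branching-process calculation (all parameter values below are hypothetical, chosen only for illustration): even if a large fraction of daughter cells is non-viable, the population still grows whenever each division leaves more than one viable descendant on average.

```python
# Toy branching-process sketch: an immortal progenitor lineage where many
# daughter cells are non-viable, yet the viable population still expands.
# All parameters are illustrative assumptions, not measured values.

def expected_population(generations, offspring_per_cell=2.0, viable_fraction=0.6):
    """Expected number of viable cells after `generations` rounds of division.

    Each viable cell produces `offspring_per_cell` daughters, of which only
    `viable_fraction` survive. Net growth occurs whenever
    offspring_per_cell * viable_fraction > 1.
    """
    population = 1.0
    for _ in range(generations):
        population *= offspring_per_cell * viable_fraction
    return population

if __name__ == "__main__":
    # 2 daughters x 60% viability = 1.2 viable descendants per division:
    # the lineage grows despite losing 40% of its offspring every generation.
    print(expected_population(10))  # 1.2**10, roughly 6.19
```

With the same formula, dropping viability below 50% (so 2 x 0.45 = 0.9 viable descendants per division) makes the lineage shrink instead, which is the intuition behind "as long as some of these turn out to be viable."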

Only if you consider cancer to be a single disease, when in fact it is a very long list of diseases.

Some cancers are very close to being cured. Well differentiated papillary and follicular carcinoma of the thyroid in patients under 40 has close to a 100% cure rate through surgery and use of radioactive iodine. Anaplastic thyroid carcinoma - the undifferentiated progression of this cancer - is always considered stage 4, and usually kills within 3 to 6 months of diagnosis.

Basal cell carcinoma, though a cancer, is probably less lethal than influenza. Melanoma, when small and confined to the upper layers of the skin, has a good prognosis, while metastatic melanoma has an almost nil prognosis.

Lung cancer and symptomatic pancreatic cancer are almost death sentences, though recent developments from stars like Jack Andraka on pancreatic cancer detection could see an extraordinary about-turn.

Hopefully this work will also limit the amount of chasing down blind alleys, looking at genes/mutations that aren't, at the end of the day, likely to be viable therapeutic targets.

A lot of research has gone into targeting general genes, which were discovered thanks to people in cancer biology. The problem isn't an issue of finding genes common across cancer subtypes. Right now, it is a matter of getting a drug to market that works well and is non-toxic. The FDA isn't sure about some of the EGFR inhibitors, mostly because they don't seem to show any long-term benefit in extending a patient's life.

Also, some genes aren't a viable target unless a vehicle is used to deliver a plasmid into cancer cells to reconstitute the protein to normal levels, e.g. p53, a tumor suppressor. In most cases, therapeutics are aimed at suppressing the endogenous activation of certain proteins in cancer cells.

I've mentioned the PTEN/PI3K relationship previously, where PTEN is a negative regulator of PI3K. PI3K is a proto-oncogene, which can regulate proliferation and metastasis. In a normal cell, PI3K is not active all the time; in a cancer cell, it is highly active. It's difficult to make cells produce more PTEN, so the research target is to reduce p-PI3K levels. This can be done via enzyme inhibition, or by reducing the protein that suppresses PTEN expression.

Even then, this isn't a cure; all it does is slow the cancer's progression. That's the same reason the FDA isn't high on EGFR inhibitors: NSCLCs have a history of developing resistance to drugs like gefitinib and erlotinib within a couple of years of initial treatment.

Another bias in mutation rate is that essential genes will likely show fewer mutations (or the cancer cell would die), while less essential genes (like odor receptors!) will show an apparently higher rate of mutation. This bias is likely taken into account in the new analysis tool; I just thought it was a very straightforward bias to understand and worth mentioning.
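That length/background bias is easy to see in a toy per-gene normalization (the gene names, background rate, and counts below are all made up for illustration): raw mutation counts favor long, mutation-tolerant genes, while dividing by the expected background mutations highlights the genuine outlier.

```python
# Toy per-gene background correction. A long, non-essential gene racks up
# many mutations purely by chance; normalizing by its expected background
# count removes that bias. All names and numbers are hypothetical.

BACKGROUND_RATE = 2e-6   # assumed mutations per base per tumor
N_TUMORS = 500           # assumed cohort size

# gene -> (coding length in bases, mutations observed across the cohort)
genes = {
    "odor_receptor_like": (30_000, 45),  # long gene: many hits expected anyway
    "small_driver_gene":  (3_000, 40),   # short gene: similar raw count
}

def enrichment(length, observed):
    """Ratio of observed mutations to the background expectation."""
    expected = length * BACKGROUND_RATE * N_TUMORS
    return observed / expected

for name, (length, observed) in genes.items():
    print(f"{name}: {enrichment(length, observed):.1f}x background")
```

Here the odor-receptor-like gene comes out at only 1.5x its background expectation despite having the higher raw count, while the short driver gene is enriched more than tenfold, which is the kind of signal a background-aware filter is built to find.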