This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.

Abstract

Mutation processes differ between types of point mutation, genomic locations, cells, and biological species. For some point mutations, specific neighbouring bases are known to be mechanistically influential. Beyond these cases, numerous questions remain unresolved, including: what are the sequence motifs that affect point mutations? How large are the motifs? And do they vary between samples? We present new log-linear models that allow explicit examination of these questions, along with sequence-logo-style visualisation to enable identification of specific motifs. We demonstrate the utility of these methods by analysing human germline and malignant melanoma mutations. We recapitulate the known CpG effect and identify numerous novel motifs, including a highly significant motif associated with A→G mutations. We show that the major effects of neighbourhood on germline mutation lie within ±2 positions of the mutating base. We also present models for contrasting entire mutation spectra (the distribution of the different point mutations) and apply them to the data. We show that the spectra vary significantly between the autosomes and the X chromosome, with a difference in the T→C transition dominating. Analyses of malignant melanoma confirmed reported characteristic features of this cancer, including strand asymmetry and markedly different neighbouring influences. The methods reported are made freely available as a Python library at https://bitbucket.org/pycogent3/mutationmotif.

Author Comment

Updated to correct equation typos; now includes an analysis in which the entire genome is used to produce the reference distribution.

Supplemental Information

Supplementary tables and figures

Additional Information

Competing Interests

The authors declare that they have no competing interests.

Author Contributions

Yicheng Zhu conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.

Teresa M Neeman conceived and designed the experiments, wrote the paper, reviewed drafts of the paper.

Von Bing Yap conceived and designed the experiments, wrote the paper, reviewed drafts of the paper.

Gavin A Huttley conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.

Funding
