Khmer

Allows k-mer-based dataset analysis and transformations. Khmer is a set of command-line tools for working with DNA shotgun sequencing (SGS) data from genomes, transcriptomes, metagenomes, and single cells. The software relies on a probabilistic data structure, the Count-Min Sketch, which allows k-mer counts to be updated and retrieved in memory as the data stream in. This capability is required by online k-mer analysis algorithms.
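To make the data structure concrete, here is a minimal Count-Min Sketch for k-mer counting in Python. It is an illustrative sketch only, not khmer's implementation or API: the class name, the table sizes, the use of BLAKE2b as the hash, and the example reads are all assumptions chosen for the example. Each k-mer increments one slot in each of several tables, and the reported count is the minimum across those slots, so counts can be overestimated when hashes collide but are never underestimated.

```python
import hashlib


class CountMinSketch:
    """Approximate counter: may overcount on hash collisions, never undercounts."""

    def __init__(self, width=10007, depth=4):
        # width and depth are arbitrary illustrative sizes, not khmer defaults.
        self.width = width
        self.depth = depth
        self.tables = [[0] * width for _ in range(depth)]

    def _slots(self, kmer):
        # One slot per table; salting the hash with the table index
        # gives each table its own hash function.
        for i in range(self.depth):
            digest = hashlib.blake2b(
                kmer.encode(), digest_size=8, salt=bytes([i])
            ).digest()
            yield i, int.from_bytes(digest, "big") % self.width

    def add(self, kmer):
        # Online update: increment the k-mer's slot in every table.
        for i, j in self._slots(kmer):
            self.tables[i][j] += 1

    def count(self, kmer):
        # Online retrieval: the smallest slot value is the best estimate.
        return min(self.tables[i][j] for i, j in self._slots(kmer))


def kmers(seq, k):
    """Yield every overlapping substring of length k."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


# Hypothetical reads, consumed one at a time as a stream.
sketch = CountMinSketch()
for read in ["ACGTACGTAC", "CGTACGTACG"]:
    for kmer in kmers(read, 4):
        sketch.add(kmer)

print(sketch.count("ACGT"))  # at least 3, the true number of occurrences
```

Memory use is fixed by the table dimensions rather than by the number of distinct k-mers, which is the property that makes this kind of structure attractive for large sequencing data sets. This sketch counts each k-mer as written and does not merge a k-mer with its reverse complement.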

Khmer citations (6)

Walking the Talk: Adopting and Adapting Sustainable Scientific Software Development processes in a Small Biology Lab

2016

PMCID: 5142744

PMID: 27942385

DOI: 10.5334/jors.35

[…] f scientific problems being studied with sequencing is driving the rapid development of many new tools, both for handling data on large scales and to address new and different biological problems. The khmer software was born from a need to more scalably analyze short fixed-length (20–30 character) words, or “k-mers”, in large DNA sequencing data sets. The use of k-mers in DNA sequence analysis is c […]

Conservation and diversification of the transcriptomes of adult Paragonimus westermani and P. skrjabini

2016

PMCID: 5020434

PMID: 27619014

DOI: 10.1186/s13071-016-1785-x

[…] , bacteria, Homo sapiens (GenBank version hs37) and Canis familiaris (GenBank version 3.1). Remaining high-quality, contaminant-free read sets were down-sampled by digital read normalization using khmer (k = 20). Reads selected in the down-sampling and their mates were assembled using the Trinity de novo RNA-Seq assembler using default parameters. Scripts included in the Trinity software […]

Resolving the Complexity of Human Skin Metagenomes Using Single Molecule Sequencing

2016

MBio

PMCID: 4752602

PMID: 26861018

DOI: 10.1128/mBio.01948-15

[…] . Sequencing depth was evaluated using k-mer accumulation. SMRT reads were split into 100-bp fragments using pyfasta 0.5.2. Reads were then split into k-mers, compared to a k-mer coverage table using khmer v0.7.1, and kept only if the median k-mer coverage was below a 5× cutoff. The resulting curves estimate the coverage of k-mer space as a function of sequencing effort. Matched 16S rRNA and ITS […]

Whole Genome Sequences of Three Symbiotic Endozoicomonas Bacteria

2014

Genome Announc

PMCID: 4132622

PMID: 25125646

DOI: 10.1128/genomeA.00802-14

[…] reads were trimmed for quality, and the Illumina adapters were removed using Trimmomatic. Fragments with both surviving read pairs were then digitally normalized using the recommended protocol in khmer. The long mate-pair reads were trimmed using NextClip, and fragments with the junction adapter in at least one of the paired reads were used in the assembly. The small- and long-insert lib […]

[…] In 2011 in Cambodia, environmental samples were collected from 4 LPMs each week for 7 weeks, including during the Khmer New Year festival (). Two of the markets were in Phnom Penh, the capital city: Orussey market (M1) and Chamkar Doung market (M2), which also served as an overnight resting place and a place to k […]

Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, USA; Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, MI, USA; Department of Plant, Soil, and Microbial Sciences, Michigan State University, East Lansing, MI, USA

Khmer funding source(s)

Supported in part by the United States Department of Agriculture under grant 2009-03296 from the National Institute of Food and Agriculture, the National Science Foundation grant 09-23812, the National Institutes of Health grant 1R01HG007513-01 and NSF Postdoctoral Fellowship Award #0905961.