Building pysamstats depends on [numpy](http://www.numpy.org/); please install that first. Then try:

```
$ pip install --upgrade pysam pysamstats
```

N.B., pysamstats also depends on [pysam](http://code.google.com/p/pysam/), which needs to be installed before attempting to install pysamstats. The command above should do it, but if you have any problems, try installing pysam separately first. If you have problems installing pysam, email the [pysam user group](https://groups.google.com/forum/#!forum/pysam-user-group).

The suffix **_fwd** means the field is restricted to reads mapped to the forward strand, and **_rev** means the field is restricted to reads mapped to the reverse strand. E.g., **reads_fwd** means the number of reads mapped to the forward strand.

The suffix **_pp** means the field is restricted to reads flagged as properly paired.
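As a rough illustration of how the suffixes partition reads, the sketch below counts reads at a single position from their SAM flags. It is not pysamstats' own implementation: `count_by_suffix` and its `flags` argument are hypothetical names, and the input is assumed to be a plain list of integer SAM flags, one per read aligned at the position.

```python
# SAM flag bits, as defined in the SAM specification.
FLAG_PROPER_PAIR = 0x2
FLAG_REVERSE = 0x10


def count_by_suffix(flags):
    """Count reads at one position, split the way the suffixes describe.

    `flags` is a hypothetical list of integer SAM flags, one per read
    aligned at the position.
    """
    counts = {"reads_all": 0, "reads_fwd": 0, "reads_rev": 0, "reads_pp": 0}
    for flag in flags:
        counts["reads_all"] += 1
        # Every aligned read is on exactly one strand.
        if flag & FLAG_REVERSE:
            counts["reads_rev"] += 1
        else:
            counts["reads_fwd"] += 1
        # Proper pairing is independent of strand.
        if flag & FLAG_PROPER_PAIR:
            counts["reads_pp"] += 1
    return counts


# Four properly paired reads (flags 99, 147, 163, 83) and one read
# whose mate is unmapped (flag 73).
example = count_by_suffix([99, 147, 163, 83, 73])
print(example)
```

Here the forward and reverse counts always sum to the total, whereas the properly paired count is a separate subset of it.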

* **chrom** - Chromosome name.

* **pos** - Position within chromosome. One-based by default when using the command line, zero-based by default when using the Python API.

* **reads_all** - Number of reads aligned at the position. N.B., this is really the total, i.e., it includes reads where the mate is unmapped or otherwise not properly paired.

* **reads_pp** - Number of reads flagged as properly paired by the aligner.

* **reads_mate_unmapped** - Number of reads where the mate is unmapped.

* **reads_mate_other_chr** - Number of reads where the mate is mapped to another chromosome.

* **reads_mate_same_strand** - Number of reads where the mate is mapped to the same strand.

* **reads_faceaway** - Number of reads where the read and its mate are mapped facing away from each other.

* **reads_softclipped** - Number of reads where there is some softclipping at some point in the read's alignment (not necessarily at this position).

* **reads_duplicate** - Number of reads that are flagged as duplicate.

* **dp_normed_median** - Number of reads divided by the median number of reads over all positions in the specified region, or over the whole genome if no region is specified.

* **dp_normed_mean** - Number of reads divided by the mean number of reads over all positions in the specified region, or over the whole genome if no region is specified.

* **dp_percentile** - Percentile within which the number of reads falls considering all positions in the specified region, or the whole genome if no region is specified.

* **gc** - Percentage GC content in the reference at this position (depends on window length and offset specified).

* **dp_normed_median_gc** - As *dp_normed_median* but normalised by positions with the same percent GC composition.

* **dp_normed_mean_gc** - As *dp_normed_mean* but normalised by positions with the same percent GC composition.

* **dp_percentile_gc** - As *dp_percentile* but only considering positions with the same percent GC composition.

* **matches** - Number of reads where the aligned base matches the reference.

* **mismatches** - Number of reads where the aligned base does not match the reference (but is not a deletion).

* **deletions** - Number of reads where there is a deletion in the alignment at this position.

* **insertions** - Number of reads where there is an insertion in the alignment at this position.

* **A/C/T/G/N** - Number of reads where the aligned base is an A/C/T/G/N.

* **mean_tlen** - Mean value of outer distance between reads and their mates for paired reads aligned at this position. N.B., leftmost reads in a pair have a positive tlen, rightmost reads have a negative tlen, so if there is no strand bias, this value should be 0.

* **rms_tlen** - Root-mean-square value of outer distance between reads and their mates for paired reads aligned at this position.

* **std_tlen** - Standard deviation of outer distance between reads and their mates for paired reads aligned at this position.
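Some of the derived fields above can be sketched in a few lines of plain Python. This is only an illustration of the definitions, not the code pysamstats runs: the function names and input lists are hypothetical, and the use of the population standard deviation for **std_tlen** is an assumption.

```python
import math
import statistics


def normalise_depth(depths):
    """Return (dp_normed_median, dp_normed_mean) for a list of depths.

    `depths` is a hypothetical list of read counts, one per position in
    the region. Assumes the median and mean depth are both non-zero.
    """
    median = statistics.median(depths)
    mean = statistics.mean(depths)
    normed_median = [d / median for d in depths]
    normed_mean = [d / mean for d in depths]
    return normed_median, normed_mean


def tlen_stats(tlens):
    """Return (mean_tlen, rms_tlen, std_tlen) for the reads at one position.

    `tlens` is a hypothetical list of signed template lengths. The
    standard deviation here is the population form (an assumption).
    """
    mean_tlen = statistics.mean(tlens)
    rms_tlen = math.sqrt(statistics.mean(t * t for t in tlens))
    std_tlen = statistics.pstdev(tlens)
    return mean_tlen, rms_tlen, std_tlen


normed_median, normed_mean = normalise_depth([10, 20, 30, 60])

# Two leftmost reads (positive tlen) and two rightmost reads (negative tlen).
mean_tlen, rms_tlen, std_tlen = tlen_stats([200, -200, 300, -300])
print(normed_median, normed_mean)
print(mean_tlen, rms_tlen, std_tlen)
```

In the second example the positive and negative values cancel, so the mean is 0 even though the typical outer distance is large. That distance shows up in the root-mean-square value instead.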