IDs:
MA825 – Sope_d
MA826 – Sope_r
MA968 – Ardu1_d
MA969 – Ardu2
MA971 – Kunila1
MA973 – Kunila2
MA974 – Kudruküla2
MA975 – Kudruküla3
MA976 – Ardu1_r
DNA sequencing:
DNA was sequenced on the Illumina HiSeq 2500 platform using 100 bp single-end reads. All samples were first sequenced together on one lane. One sample (Sope_r) became part of the Bronze Age project The Rise (Allentoft et al. 2015) and was therefore sequenced many times. Seven other samples, from different individuals and with endogenous DNA content over 1%, were sequenced further on 8 lanes.
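The 1% selection rule can be illustrated with a minimal sketch. It assumes that endogenous DNA content is calculated as the share of screening reads that map to the human reference, which is a common definition but is not spelled out here. The function names, sample names and read counts are hypothetical and are not values from this study.

```python
def endogenous_fraction(mapped_reads: int, total_reads: int) -> float:
    """Share of sequenced reads that map to the human reference (assumed definition)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mapped_reads / total_reads


def select_for_deeper_sequencing(counts, threshold=0.01):
    """Return the names of samples whose endogenous fraction exceeds the threshold."""
    return [
        name
        for name, (mapped, total) in counts.items()
        if endogenous_fraction(mapped, total) > threshold
    ]


# Hypothetical screening results as (mapped reads, total reads); not real data.
screening_counts = {
    "sample_A": (25_000, 1_000_000),  # 2.5% endogenous, above the threshold
    "sample_B": (4_000, 1_000_000),   # 0.4% endogenous, below the threshold
}
selected = select_for_deeper_sequencing(screening_counts)
```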
Mapping:
Before mapping, all sequence files from one sample were merged, and adaptor and index sequences were cut from the ends of the reads using Trimmomatic 0.32 with the option ILLUMINACLIP. Sequences shorter than 30 bp were also removed with the option MINLEN to avoid random mapping of sequences from other species.
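The merging and trimming step might look roughly like the sketch below, which only builds the Trimmomatic command line in single-end (SE) mode and does not run it. The file names, the adaptor FASTA file, the ILLUMINACLIP thresholds 2:30:10 (taken from the Trimmomatic manual example) and the -phred33 flag are assumptions; only ILLUMINACLIP and MINLEN with a 30 bp cutoff come from the description above.

```python
import shutil


def merge_fastq_gz(parts, merged):
    """Concatenate gzip-compressed FASTQ files byte by byte.

    A concatenation of gzip members is itself a valid gzip file,
    so no decompression is needed.
    """
    with open(merged, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)


def trimmomatic_se_command(jar, fastq_in, fastq_out, adapters_fa, min_len=30):
    """Build a single-end Trimmomatic call with adaptor clipping and a length filter."""
    return [
        "java", "-jar", jar, "SE", "-phred33",
        fastq_in, fastq_out,
        # adaptor file : seed mismatches : palindrome threshold : simple clip threshold
        f"ILLUMINACLIP:{adapters_fa}:2:30:10",
        # drop reads shorter than min_len after clipping
        f"MINLEN:{min_len}",
    ]


# Hypothetical file names, for illustration only.
cmd = trimmomatic_se_command(
    "trimmomatic-0.32.jar",
    "sample.merged.fastq.gz",
    "sample.trimmed.fastq.gz",
    "adapters.fa",
)
```

The command list can be passed to subprocess.run(cmd, check=True) once the input files exist.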
The sequences were mapped to the human reference genome GRCh37 using the Burrows-Wheeler Aligner (BWA) with the aln command.
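A minimal sketch of the mapping call is given below. It assumes that the reference FASTA has already been indexed with bwa index, and the file names and thread count are hypothetical. bwa aln writes its suffix-array coordinates to standard output, so the helper redirects that stream to a .sai file.

```python
import subprocess


def bwa_aln_command(reference_fa, fastq, threads=1):
    """Build a 'bwa aln' call; the result is written to standard output."""
    return ["bwa", "aln", "-t", str(threads), reference_fa, fastq]


def run_to_file(cmd, out_path):
    """Run a command and capture its standard output in out_path."""
    with open(out_path, "wb") as out:
        subprocess.run(cmd, stdout=out, check=True)


# Hypothetical file names; the reference is assumed to be indexed already.
cmd = bwa_aln_command("GRCh37.fa", "sample.trimmed.fastq.gz", threads=4)
# run_to_file(cmd, "sample.sai")  # uncomment where bwa and the input files are available
```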
Additional steps:
After mapping, the alignments were prepared for analysis by first converting them to SAM format with the BWA command samse. The SAM files were then converted to BAM format, only the sequences that mapped to the reference sequence were retained, and PCR duplicates were removed. These three steps were done with samtools 0.1.19.
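The post-mapping steps could be chained roughly as in the sketch below, which lists each command together with the file that should receive its standard output (None when the tool writes its own output file). The exact samtools subcommands and flags are assumptions, since the text names only the tool version: view -bS -F 4 for BAM conversion with unmapped reads excluded, sort with the 0.1.x output-prefix syntax (rmdup expects coordinate-sorted input), and rmdup -s for single-end duplicate removal. All file names are hypothetical.

```python
def postprocess_steps(reference_fa, sai, fastq, prefix):
    """Return (command, stdout_file) pairs for SAM conversion, filtering and deduplication."""
    sam = f"{prefix}.sam"
    mapped_bam = f"{prefix}.mapped.bam"
    sorted_prefix = f"{prefix}.sorted"  # samtools 0.1.x appends ".bam" to this prefix
    dedup_bam = f"{prefix}.rmdup.bam"
    return [
        # single-end alignments from 'bwa aln' to SAM (written to standard output)
        (["bwa", "samse", reference_fa, sai, fastq], sam),
        # SAM to BAM, dropping reads flagged as unmapped (flag 4)
        (["samtools", "view", "-bS", "-F", "4", sam], mapped_bam),
        # coordinate sort, assumed here because rmdup needs sorted input
        (["samtools", "sort", mapped_bam, sorted_prefix], None),
        # remove PCR duplicates from single-end reads
        (["samtools", "rmdup", "-s", f"{sorted_prefix}.bam", dedup_bam], None),
    ]


steps = postprocess_steps(
    "GRCh37.fa", "sample.sai", "sample.trimmed.fastq.gz", "sample"
)
```

Each pair can be executed with subprocess.run, redirecting standard output to the second element when it is not None.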