Lead author of the study Dr Matt Clark (left) and Michael Giolai, postgraduate student in the Clark Group, beside the PacBio Sequel (the next-generation sequencing platform used for the study) at EI.

RenSeq (1) is a method for sequencing the resistance (R) genes that confer disease resistance in plants.

Each plant typically carries hundreds of potential R gene sequences, encoding NB-LRR proteins, identified by the presence of specific sequence motifs. R genes are often part of families of closely related sequences.

While shared sequences make it possible to capture the R genes, they also make it hard to tell the genes apart and find the exact one that enables plants to survive attack. Longer molecules and sequences of DNA allow easier and more accurate genetic analysis to identify variation.

The NB-LRR gene family enables plants to withstand infection from a suite of diseases and forms a second line of defence. After a pathogen has managed to invade a plant, it uses 'effector' molecules to weaken the plant's defences. The R gene proteins recognise these effector molecules and signal to the plant to activate defence responses, killing cells around the site of infection in an attempt to stop it spreading.

In this constant evolutionary arms race between plants and pathogens, the organisms that cause disease mutate to avoid plant defences, which in turn drives the plants to evolve through changes in their own genetic makeup. The result is a huge variety of R genes that are highly similar in structure and DNA sequence.

Researchers at the Earlham Institute (EI), The Sainsbury Laboratory (TSL) and the James Hutton Institute have found a new way to decipher these large stretches of DNA to discover and annotate pathogen resistance in plants.

Using the PacBio platform, which can read longer stretches of DNA in their entirety, together with the NB-LRR gene workflow 'RenSeq' (Resistance gene enrichment sequencing), the researchers capture not only R genes but also the important regulatory regions of DNA: the promoters and terminators that signal when to start making a protein and when to stop.

Dr Matt Clark, Head of Technology Development at EI and lead author of the study, said: "Wild relatives of crops contain a huge repertoire of novel genes that could be used to breed more resistant varieties that need fewer pesticide treatments. When it comes to identifying key genes, it can be very difficult for researchers to find the exact resistance gene due to the sheer similarity of their DNA sequences.

"Typical sequencing methods use short reads, e.g. from the Illumina HiSeq, but these are often too short to prise similar genes apart.

"RenSeq diverges from normal DNA sequencing on the PacBio by focussing the sequencing effort on a specific gene family, i.e. R genes. In this study, by optimising multiple steps in the library construction, we can identify the protein-coding sequences and the neighbouring regulatory regions; indeed, in many cases we can reconstruct the entire DNA region even if it contains many similar genes, which normally are too hard to tell apart. This means we can identify the exact gene that confers resistance to a certain infection, which can then be used in breeding programmes."

Dr Ingo Hein, Principal Investigator at the James Hutton Institute and co-author, added: "R genes can control diverse plant diseases including major threats to global crop production. The ability to capture and sequence long genomic DNA fragments that contain full-length R genes enables the rapid identification of novel, functional resistance genes from wild species. These genes, if introgressed into new cultivars via breeding or alternative routes, could significantly reduce the dependency on pesticides for crop production."

The paper, "Targeted capture and sequencing of gene sized DNA molecules", is published in BioTechniques.