Update on annotation of Pseudomonas protegens: confirmations and surprises #biocuration

After 2 weeks of annotation, some of our students presented their preliminary results to the class. Here is a summary.

Phosphate and nitrate uptake and metabolism: using KEGG and keyword searches, the students identified genes to annotate; of these, 45 have already been confirmed. There were some surprising results, such as a plant-specific nitrilase which is also found in a Rhizobium (UniProtKB/Swiss-Prot) and in other Pseudomonas (but not well annotated there). They also identified a Pseudomonas-specific duplication of PhoA2. For the rest, much of the annotation is confirmatory.

Genomic islands: the students checked base composition with 5 different tools: 3 in IslandViewer, 1 in INDeGenIUS and 1 in SigHunt. Each method finds somewhat different regions, so the students decided to focus on the regions supported by several methods. They did not consider regions which on first examination appeared to come from phages, as another student group is annotating these. The putative islands were checked by BLAST: as expected, there were no hits to Pseudomonas, and integrase genes were present.

IslandViewer graphical result
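
To illustrate the general idea of screening for atypical base composition, here is a minimal Python sketch. It is not the algorithm used by IslandViewer, INDeGenIUS or SigHunt; it only flags sliding windows whose GC fraction deviates from the genome-wide value. The function names, window size, step and threshold are all assumptions chosen for the toy example.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def atypical_windows(genome, window=1000, step=500, threshold=0.15):
    """Return (start, end, gc) for each window whose GC fraction differs
    from the genome-wide GC fraction by more than `threshold`.

    window, step and threshold are illustrative values, not tuned ones.
    """
    genome_gc = gc_fraction(genome)
    hits = []
    for start in range(0, len(genome) - window + 1, step):
        gc = gc_fraction(genome[start:start + window])
        if abs(gc - genome_gc) > threshold:
            hits.append((start, start + window, gc))
    return hits


# Toy data: a GC-rich "genome" with a 1000 bp AT-rich insert in the middle.
toy = "GGCCGCGGCA" * 300 + "ATTATAATGC" * 100 + "GGCCGCGGCA" * 300
flagged = atypical_windows(toy)
for start, end, gc in flagged:
    print(f"{start}-{end}: GC = {gc:.2f}")
```

On the toy sequence, only the windows overlapping the AT-rich insert are flagged. A real analysis would still need the follow-up steps described above, such as comparing several methods and checking candidate regions by BLAST.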

Insect killing (coolest topic title): Starting from the observation of an insect toxin cluster, Fit, in Pseudomonas fluorescens (Péchy-Tarr et al. 2008), the students first found an unannotated gene by BLAST, and then a potential cluster around it, with a structure very similar to the known Fit cluster. Sequence homology confirms that this is probably an insect toxin cluster in Pseudomonas protegens, and the genes were annotated as Fit-A to Fit-H. Neighboring genes are not conserved with the 2008 publication of Fit in P. fluorescens PF05, and a synteny analysis actually shows an inversion in this region between our strain and PF05.

Dotplot of synteny between P. fluorescens and P. protegens; in the center, the inversion around the Fit cluster
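
For readers unfamiliar with dotplots, the following Python sketch shows why an inversion appears as a diagonal running the other way. It is a toy illustration with hypothetical function names and random sequences, not the tool or data the students used: shared k-mers on the same strand give dots along the main diagonal, while k-mers matching as reverse complements give dots along an anti-diagonal.

```python
import random


def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def kmer_matches(a, b, k=8):
    """Return (forward, reverse) lists of (i, j) dots.

    forward: the k-mer at a[i] equals the k-mer at b[j].
    reverse: the reverse complement of the k-mer at a[i] equals the k-mer at b[j].
    """
    index = {}
    for j in range(len(b) - k + 1):
        index.setdefault(b[j:j + k], []).append(j)
    forward, reverse = [], []
    for i in range(len(a) - k + 1):
        word = a[i:i + k]
        forward.extend((i, j) for j in index.get(word, []))
        reverse.extend((i, j) for j in index.get(reverse_complement(word), []))
    return forward, reverse


# Toy genomes: b is a with its middle 60 bp segment inverted.
rng = random.Random(0)
left, mid, right = ("".join(rng.choices("ACGT", k=60)) for _ in range(3))
a = left + mid + right
b = left + reverse_complement(mid) + right

forward, reverse = kmer_matches(a, b)
print(len(forward), "same-strand dots;", len(reverse), "reverse-complement dots")
```

Plotting the `forward` dots in one color and the `reverse` dots in another would give two collinear flanks on the main diagonal and, between them, the inverted segment running in the opposite direction.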

Chemotaxis: The students found well-characterized genes from KEGG, spread among several clusters of putative chemotaxis genes. This topic appears difficult, because shared domains may lead to automatic “chemotaxis” annotations even when the genes have different functions. They also found many chemotaxis transducers. During the presentation, the group annotating motility was able to contribute chemotaxis-related genes which they had identified as well. The two groups now know that they need to coordinate their work. A lot of work is left on a tough annotation topic…