Next Generation Sequencing (NGS) data is knocking at our door and, at the same time, our ability to design novel enzymes (by rational design or directed evolution) using high-throughput methods has improved tremendously. As a result, the demand to link enzyme sequences to their chemical products and metabolic pathways is ever increasing. Meanwhile, the push to generate metabolomics data for biomarker design, toxicity studies, functional genomics and nutrigenomics has given researchers a run for their money!

Last year we launched EC-Blast (see my old post), a robust tool to compare chemical reactions using chemical knowledge of bond changes, molecule-molecule pairs (MMP) and molecular substructures. This tool helps to plough through and understand the reactions present in the Enzyme Commission (E.C.) classification. It has generated a lot of interest in the research community and in industry in revisiting and mining knowledge that might have been overlooked by traditional methods. Feedback from our users strongly suggested a demand for tools and methods that systematically link protein sequences to the knowledge of bond changes, molecule-molecule pairs (MMP) and molecular substructures.

We have recently developed Sequence to Enzyme (Seq2EC), a novel tool (Figure 1), to:

Enzymes have been part of our evolutionary machinery and their importance in our lives is ever increasing. A hierarchical functional classification has been developed to cluster similar enzymes based on their chemistry (kindly refer to my previous blog on enzymes). In parallel, there are classification systems based on sequence and protein structure. One of the most challenging issues in today's bio/cheminformatics is to automatically link sequence knowledge with enzymatic chemistry. Many methods in the literature address this issue, but it is hard to find a direct link that holds true in all cases. Very recently, though, in Prof. Janet Thornton's group we have come up with a web tool, “FunTree”, for linking enzyme superfamilies based on knowledge of their evolution, derived from sequences and structures (proteins and small molecules).

Finding a one-to-one mapping between genes->proteins->enzymes remains puzzling, and it is equally mind-boggling to navigate this space. This is one of the reasons why we have many orphan enzymes, that is, enzymes which do not have a sequence assigned to them yet. On the one hand, we have ever-growing sequence databases and sophisticated tools like BLAST and FASTA to compare them. On the other hand, the biochemical side of the story moves slowly, as we have a limited number of publicly available chemical databases and tools in chemistry.

In recent years, however, databases like BRENDA, KEGG, BioCyc, UniProt, EC->PDB and SwissProt have come forth to link sequence to chemistry. There are efforts to bring various resources of enzyme chemistry under one umbrella, and one such web portal is the “Enzyme Portal”. Likewise, there exist a few curated databases linking enzyme function and reaction mechanism, like MACiE, Rhea and SFLD.

The challenge for a biologist/chemist is to find a tool which can function like BLAST (as a magic black box) in finding similar enzymes in a reaction database (a needle in a haystack). The good news is that we have made some progress in this interesting area of research by coming up with a novel tool, “EC-BLAST”. The core idea behind this tool is to find similar enzymes ranked by the similarity of the bond changes, reaction center or chemical structures of the participating reactions. One can start a search with a molecule/reaction name or its structure. The Atom-Atom Mapping (AAM) is generated algorithmically on the fly for a balanced input reaction, and the bond changes are automatically deduced and marked before any search is performed.
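To make the idea of "deduce the bond changes, then rank by their similarity" a bit more concrete, here is a minimal Python sketch. It is not the EC-BLAST implementation: the atom-atom mapping is written by hand rather than computed, the function names (`bond_changes`, `tanimoto`, `rank`) and the three toy library entries are made up for illustration, and the count-based Tanimoto score is simply an assumed similarity measure.

```python
from collections import Counter


def bond(a, b):
    """A bond is an unordered pair of mapped atom labels such as 'C:1'."""
    return frozenset((a, b))


def bond_changes(reactant_bonds, product_bonds):
    """Compare two atom-mapped bond tables ({bond: order}) and count the changes."""
    changes = Counter()
    for pair in set(reactant_bonds) | set(product_bonds):
        before = reactant_bonds.get(pair, 0)  # order 0 means "no bond"
        after = product_bonds.get(pair, 0)
        if before == after:
            continue
        # Drop the map numbers and sort the symbols so that C-O equals O-C.
        elements = "-".join(sorted(label.split(":")[0] for label in pair))
        if before == 0:
            changes[f"{elements} formed"] += 1
        elif after == 0:
            changes[f"{elements} cleaved"] += 1
        else:
            changes[f"{elements} order {before}->{after}"] += 1
    return changes


def tanimoto(a, b):
    """Count-based Tanimoto score between two bond-change profiles."""
    keys = set(a) | set(b)
    shared = sum(min(a[k], b[k]) for k in keys)
    total = sum(max(a[k], b[k]) for k in keys)
    return shared / total if total else 1.0  # two empty profiles count as identical


def rank(query, library):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, tanimoto(query, profile)) for name, profile in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hand-mapped toy reactions (illustrative only; just the changing bonds are listed).
query = bond_changes(  # an ester hydrolysis
    {bond("C:1", "O:2"): 1, bond("O:3", "H:4"): 1},
    {bond("C:1", "O:3"): 1, bond("O:2", "H:4"): 1},
)
library = {
    "ester hydrolysis": bond_changes(  # same chemistry, different map numbers
        {bond("C:7", "O:8"): 1, bond("O:9", "H:10"): 1},
        {bond("C:7", "O:9"): 1, bond("O:8", "H:10"): 1},
    ),
    "amide hydrolysis": bond_changes(
        {bond("C:1", "N:2"): 1, bond("O:3", "H:4"): 1},
        {bond("C:1", "O:3"): 1, bond("N:2", "H:4"): 1},
    ),
    "alcohol oxidation": bond_changes(
        {bond("C:1", "O:2"): 1, bond("C:1", "H:3"): 1, bond("O:2", "H:4"): 1},
        {bond("C:1", "O:2"): 2},
    ),
}

ranking = rank(query, library)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy setup the ester hydrolysis entry scores 1.00, the amide hydrolysis 0.33 (it shares the O-H cleavage and the C-O formation with the query) and the alcohol oxidation 0.17. The point is only that reactions can be compared through what happens to their bonds, independently of the sequences of the enzymes that catalyse them.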

EC BLAST front page

Examining the search results should give us better insight into the catalytic promiscuity of enzymes and complement the sequence-based results obtained from tools like BLAST, FASTA etc. (where the chemistry is not necessarily retained in the results). This will help us to link the evolutionary and mechanistic aspects of enzymes, connecting biological findings with chemical knowledge.

Such tools will also help us gain better insight into toxicity studies (they can add a value-added parameter to the likes of ChEMBL/DrugBank), and help in designing novel enzymes, retrosynthetic pathways etc. Although the first glimpse of EC-BLAST was unveiled at ISMB 2011 in Vienna, where it won the “Killer Apps 2011” award, it has largely remained restricted to the EBI and collaborators. The response at ISMB 2011 (poster here) was very encouraging for us, and there has been an ever-increasing need, scope and demand for such a resource. Hence, we have now decided to go public with a beta version of our web portal service.

EC-Blast result page for bond change similarity searches.

Note: If you are interested in testing this service or sending us your comments or feedback, please do let me know!