I'm excited to announce that SEQanswers will act as the official discussion host for the Open Genomics Engine project. See the official announcement below:

Quote:

The Open Genomics Engine (OpenGE) was developed at Virginia Bioinformatics Institute and Virginia Tech and funded by the NVIDIA Foundation. OpenGE is designed for analyzing and interpreting high-throughput sequencing data. It is freely available to the research community to enable other investigators to build upon the tools and collaboratively advance the fields of genomics and cancer biology.

David Mittelman, associate professor at the Virginia Bioinformatics Institute and the Department of Biological Sciences at Virginia Tech, will be the project leader for OpenGE. His laboratory will maintain the official OpenGE distribution, as well as ongoing research and development for the project. The bioinformatics lead will be Gareth Highnam and the software development lead will be Lee Baker, both from VBI. Contact Dr. Mittelman and his team to learn more about the effort, contribute, or provide feedback on the project.

The official OpenGE discussion community is hosted by Eric Olivares at SEQanswers. SEQanswers is the largest online genomics community and is a free and open knowledge-sharing resource for interdisciplinary discussion covering experimental and computational aspects of sequencing and sequencing analysis. SEQanswers has more than 4,000 active members, including developers of many popular genome analysis tools.

OpenGE is the first project to be funded under Compute the Cure, an initiative of the NVIDIA Foundation to help accelerate cancer researchers in the search for a cure.

Certainly one aspect of OpenGE will be the management of analysis pipelines composed of existing tools. Much like the recently released bpipe project, the aim is to incorporate existing tools into workflows. We eventually want to build a web-based GUI (in development right now) for creating workflows dynamically (currently this is done at the command line), with features such as versioning of tools, progress tracking, and recovery and restarting from errors.
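To make "recovery and restarting from errors" concrete, here is a minimal Python sketch of a stage runner that records finished stages and skips them on the next run. This is not OpenGE or bpipe code; `run_pipeline`, the stage names, and the JSON state file are hypothetical and exist only for illustration.

```python
import json
from pathlib import Path


def run_pipeline(stages, state_file):
    """Run (name, func) stages in order, skipping stages already recorded as done.

    Hypothetical helper for illustration only; not part of OpenGE or bpipe.
    Returns the names of the stages that actually ran in this call.
    """
    state_path = Path(state_file)
    # Load the list of stage names that completed in an earlier run, if any.
    done = json.loads(state_path.read_text()) if state_path.exists() else []
    ran = []
    for name, func in stages:
        if name in done:
            continue  # finished previously, so do not repeat it
        func()  # if this raises, the state file still lists only completed stages
        done.append(name)
        state_path.write_text(json.dumps(done))  # checkpoint after each stage
        ran.append(name)
    return ran
```

If a stage raises, calling `run_pipeline` again with the same state file picks up at the failed stage instead of starting over. A real workflow manager would also need to track tool versions and check whether inputs changed, which this sketch ignores.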

Rewriting everyone's code is inefficient and unnecessary, but OpenGE will also feature modifications of existing code from projects like bamtools. We have implemented a multithreaded BAM compression method, a multithreaded combined merge-sort, and some other low-level tools for operating on BAM files. I know Heng Li has begun introducing some of this into samtools and I believe Nils Homer may be working on this as well. I would love to see how our code stacks up, since a secondary goal of the project is to accelerate the analysis of genomes.
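For readers wondering why BAM compression and sorting parallelize well, here is a rough Python sketch of the general idea. It is not OpenGE's code. BAM files are stored as a series of independently compressed gzip blocks (BGZF), so blocks can be compressed on separate threads and concatenated in order. The sketch writes plain gzip members rather than real BGZF blocks (it omits the BGZF extra header field), and the sort shown is just the generic "sort chunks, then k-way merge" pattern, which is only one possible reading of a combined merge-sort. The names `compress_blocks`, `parallel_merge_sort`, and `BLOCK_SIZE` are made up for this example.

```python
import gzip
import heapq
from concurrent.futures import ThreadPoolExecutor

# Illustrative block size; real BGZF blocks are capped at 64 KiB.
BLOCK_SIZE = 64 * 1024


def compress_blocks(data: bytes, threads: int = 4) -> bytes:
    """Compress fixed-size blocks independently and concatenate the results.

    The output is a multi-member gzip stream, not a valid BGZF/BAM file.
    """
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() returns results in input order, so block order is preserved.
        members = list(pool.map(gzip.compress, blocks))
    return b"".join(members)


def parallel_merge_sort(records, threads: int = 4, key=None):
    """Sort chunks on worker threads, then k-way merge the sorted runs.

    Pure-Python sorting is limited by the GIL, so this only shows the
    structure of the algorithm, not a real speedup.
    """
    records = list(records)
    if not records:
        return []
    chunk = -(-len(records) // threads)  # ceiling division
    chunks = [records[i:i + chunk] for i in range(0, len(records), chunk)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        runs = list(pool.map(lambda c: sorted(c, key=key), chunks))
    # heapq.merge lazily merges already-sorted inputs into one sorted stream.
    return list(heapq.merge(*runs, key=key))
```

The concatenated output of `compress_blocks` can be read back with `gzip.decompress`, which accepts multi-member streams. In a compiled implementation the per-block compression and per-chunk sorting would run truly in parallel; in Python the compression step can still benefit because the underlying zlib calls can release the interpreter lock.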

One nice thing about the project is that we will have funding to continue development of the GUI as well as optimization and addition of other tools to the software. I am hoping that via SEQanswers we can get direction from the community on prioritizing what to do next. Furthermore we hope to write adapters/plugins for existing tools and to encourage others to join in the development, etc.

The project is completely open source and will be hosted on GitHub. We are hoping that the combination of fast code and an easy-to-use package manager/workflow GUI will put all this analysis in the hands of everyone...

Sorry for the long post, but hopefully that sort of answers your question?

The main code is based on bamtools, so those who are familiar with samtools or bamtools should find it easy to start using right away. Please feel free to submit comments/suggestions via the GitHub site (under Issues) or discuss here in this forum. We would be grateful to hear of bugs or simply feedback on speed increases with different input data.

This seems very exciting and promising, but could you tell me how this project differs from Galaxy? Galaxy seems to have a significant advantage in terms of release time, users, development, etc.

Tx,
Bob

It has been a year since I posted that message, and since then we decided against developing a GUI. We have functionality that allows you to plug external tools into OpenGE (bpipe style), and this is great for simple command-line based workflows. It is definitely not intended to be a replacement for Galaxy, which is one of the best workflow managers around.

The focus of OpenGE was to speed up and optimize certain steps of the genome analysis pipeline. I think we achieved that. The code is not 100% bug-free, but it has been a continuous work in progress.

As always, feedback and input are welcome.

-=Eric

In what language are the programs written, what compiler is needed, and what operating system do they require? Do the programs work from the command line, and can they be joined in batch files? Is there a simple example of such a program, and what is the typical size of a data file it handles?