sqlite datatype

Hi JJ and mailing list,

I think it was John who mailed earlier about implementing an SQLite datatype for proteomics. I like the idea, since I use SQLite all the time in my own command-line tool for filtering, protein grouping, and relating quant output to ID data:
https://github.com/glormph/msstitch
I basically wrote this to combine Percolator, mzIdentML, and OpenMS intermediate formats. It's currently a mixture of different databases and tables for different operations, but maybe it would be a good idea to have one large proteomics experiment SQLite database. Note that this is nothing I want to push onto people as if it were the next PSI standard; I see it more as a very convenient and fast intermediate datatype.
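
To make the "one large proteomics experiment database" idea concrete, here is a minimal sketch using Python's standard sqlite3 module. The table and column names (proteins, psms, quant, and so on) are purely hypothetical illustrations; they are not taken from msstitch or from any existing standard.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative only.
SCHEMA = """
CREATE TABLE proteins (
    protein_id INTEGER PRIMARY KEY,
    accession  TEXT UNIQUE NOT NULL
);
CREATE TABLE psms (
    psm_id     INTEGER PRIMARY KEY,
    scan_nr    INTEGER NOT NULL,
    sequence   TEXT NOT NULL,
    qvalue     REAL,
    protein_id INTEGER REFERENCES proteins(protein_id)
);
CREATE TABLE quant (
    psm_id    INTEGER REFERENCES psms(psm_id),
    channel   TEXT NOT NULL,
    intensity REAL
);
"""


def build_example_db(path=":memory:"):
    """Create the hypothetical schema and load a few made-up rows."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO proteins VALUES (1, 'P12345')")
    conn.executemany(
        "INSERT INTO psms VALUES (?, ?, ?, ?, ?)",
        [(1, 100, "PEPTIDEK", 0.001, 1), (2, 101, "ANOTHERK", 0.2, 1)],
    )
    conn.executemany(
        "INSERT INTO quant VALUES (?, ?, ?)",
        [(1, "tmt126", 1500.0), (2, "tmt126", 900.0)],
    )
    conn.commit()
    return conn


def quant_for_confident_psms(conn, max_qvalue=0.01):
    """Relate quant output to ID data: filter PSMs, then join in quant rows."""
    return conn.execute(
        """
        SELECT pr.accession, p.sequence, q.channel, q.intensity
        FROM psms AS p
        JOIN proteins AS pr ON pr.protein_id = p.protein_id
        JOIN quant AS q ON q.psm_id = p.psm_id
        WHERE p.qvalue <= ?
        ORDER BY p.psm_id, q.channel
        """,
        (max_qvalue,),
    ).fetchall()
```

With everything in one file, filtering and relating quant to IDs becomes a single query, e.g. quant_for_confident_psms(build_example_db()).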

I believe that the Proteome Discoverer MSF files are also based on SQLite. I'm not sure it's a good fit for a server environment, since it can run into locking/access problems when too many processes access it at once, but it's a very nice fit for a bioinformatics CLI tool. Are you guys currently working on any standardized SQLite thing?

_______________________________________________
Please keep all replies on the list by using "reply all"
in your mail client. To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
https://lists.galaxyproject.org/

Re: sqlite datatype

On 2/12/15 1:50 AM, Jorrit Boekel wrote:

> Hi JJ and mailing list,
>
> I think it was John who mailed earlier about implementing an SQLite datatype for proteomics. I sort of like this, since I find myself using SQLite all the time for my own command line tool to do filtering, protein grouping, and relating quant output to ID data.
> https://github.com/glormph/msstitch — I basically wrote this to combine percolator, mzidentml, and openMS intermediate formats. It’s currently a bit of a mixture of different databases and tables for different operations, but maybe it would be a good idea to have one large proteomics experiment SQLite database. Note that this is nothing I want to push onto people like it would be the next PSI standard, more like a very convenient and fast intermediate datatype.
>
> I believe that the proteome discoverer msf files are also based on sqlite. While I’m not sure if it’s a good fit for a server environment since it can get locking/access problems when accessed by too many processes, it’s a very nice fit for a bioinformatic cli tool. Are you guys currently working on any standardized SQLite thing?
>
> cheers,
> —
> Jorrit Boekel
> Proteomics systems developer
> BILS / Lehtiö lab
> Scilifelab Stockholm, Sweden
>
>
>

Jorrit,

We would like to define a Galaxy datatype that subclasses the Galaxy SQLite datatype and defines a schema for proteomics.

Our current use case is as a Galaxy dataprovider for a visualization plugin.
Ira and I discussed this last summer, and he quickly put together a demo visualization plugin.

MSI is currently prototyping a Galaxy visualization plugin using the Lorikeet spectral viewer.

For feasibility testing, I just used the schema that fell out of the mzR Bioconductor package (details below),
but we would now like to design a schema that would be responsive, but also generic enough to cover a variety of needs.
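
As a rough illustration of what "defines a schema" could mean in practice, the sketch below checks whether an SQLite file contains a set of expected tables and columns, using only Python's standard sqlite3 module. It does not use or show the Galaxy datatype API, and the table and column names (spectra, peaks, scan_nr, ...) are hypothetical placeholders, not the mzR schema or any agreed design.

```python
import sqlite3

# Hypothetical required tables and columns; placeholders, not a real schema.
REQUIRED = {
    "spectra": {"scan_nr", "precursor_mz"},
    "peaks": {"scan_nr", "mz", "intensity"},
}


def matches_schema(path, required=REQUIRED):
    """Return True if the SQLite file at `path` has every required table/column.

    Note: sqlite3.connect() creates an empty file if `path` does not exist,
    so callers should check for existence first if that matters.
    """
    conn = sqlite3.connect(path)
    try:
        tables = {
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table'"
            )
        }
        for table, columns in required.items():
            if table not in tables:
                return False
            # PRAGMA table_info yields (cid, name, type, notnull, dflt, pk).
            present = {
                row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')
            }
            if not columns <= present:
                return False
        return True
    except sqlite3.DatabaseError:
        # Raised, for example, when the file is not an SQLite database.
        return False
    finally:
        conn.close()
```

A check along these lines would let extra tables and columns pass, which leaves room for tool-specific additions on top of a shared core.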

I'll take a look at what you've done for guidance.

Thanks,

JJ

I've been using this data as a test case for the Visualization plugin:

--
James E. Johnson
Minnesota Supercomputing Institute
University of Minnesota