On Tue, 1 Aug 2000 hilmar.lapp@pharma.novartis.com wrote:
> > The Ensembl Genscan parser Ewan sent yesterday seems to be a good
> > starting point. However, I'd prefer to have a gene structure represented
> > optionally independent of the/an underlying sequence (object), that is,
> > as a feature which may or may not have a sequence attached. In addition,
> > a parser should not need to rely on being provided with the source
> > sequence, and the resulting gene structure representation can be attached
> > to the pertaining source sequence by the client.
> >
> > I'd propose the following:
> > Bio::SeqFeature::GeneStructure is-a Bio::SeqFeature::Generic (or just a
> > Bio::SeqFeatureI ?)
> > and offers specific support for gene structure related things, like
> [...]
>
> Aha. Now you want the appropriate Ensembl gene objects, not the genscan
> parser. Look at
>
> Bio::EnsEMBL::Gene
>               ::Transcript
>               ::Translation
>
> Look at
>
> http://www.ensembl.org/Docs/Pdoc/ensembl/modules/Bio/EnsEMBL/modules.html
>
> Again, I would be happy if these moved "across" to bioperl.
>
> You will want to add additional stuff to the Gene object to handle
> promoters (or perhaps the transcript object). Don't forget about
> alternative splicing.
>
> Well, that's not really what I was aiming at. I thought about a
> representation of the _data_ which make up a gene structure, as, e.g.
> people find it or programs predict it. IMHO all that _interpretation_
> of the data (features in this case) belongs to separate classes,

This is a fine distinction to make; I certainly would not object to your
making these feature classes, but if they are "just" to do with gene
prediction - or - even more specifically - just with genscan, put them in
a namespace that indicates this, i.e.

  Bio::SeqFeature::Gene    # bad

  # good namespaces.
  Bio::Tools::GenePrediction::Gene
                            ::Promotor
or
  Bio::Tools::Genscan::Gene

> either derived ones, or within another hierarchy (you could think of a
> GeneTranscriber who knows about alternative splicing). So, the modules
> I proposed shouldn't do much with actual sequences apart from maybe
> very basic things. They're just features, which in the first place is
> all you need to represent e.g. GenScan results. And they should be
> rich enough to allow other modules to make real stuff like protein
> sequences out of it. So, lightweight, but heavy enough.

I worry about us rewriting things (e.g. exons) but I do feel confident that
splitting out Genscan "output" from Genscan "interpretation" is a good
thing.

Go for it.
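
To make the "output" versus "interpretation" split concrete, here is a minimal, language-neutral sketch in Python. It is not BioPerl or Ensembl code: the names Exon, GeneStructure, attach_seq and spliced_sequence are all hypothetical, and it assumes forward-strand, 1-based inclusive coordinates only.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Exon:
    # Hypothetical feature: plain 1-based, inclusive coordinates.
    start: int
    end: int


@dataclass
class GeneStructure:
    # "Output" side: only the features a predictor reports.
    exons: List[Exon] = field(default_factory=list)
    # The source sequence is optional and attached by the client, if at all.
    seq: Optional[str] = None

    def attach_seq(self, seq: str) -> None:
        self.seq = seq


def spliced_sequence(gene: GeneStructure) -> str:
    # "Interpretation" side: kept outside the feature class.
    # Assumes forward strand; reverse-strand handling is left out.
    if gene.seq is None:
        raise ValueError("no sequence attached to this gene structure")
    ordered = sorted(gene.exons, key=lambda e: e.start)
    return "".join(gene.seq[e.start - 1:e.end] for e in ordered)


# A parser could build this without ever seeing the sequence.
gene = GeneStructure(exons=[Exon(7, 9), Exon(1, 3)])
gene.attach_seq("ATGCCCTAA")
result = spliced_sequence(gene)
```

The feature object stays usable on its own, and anything that needs real sequence (splicing, translation) sits in a separate function or class that can refuse to run when no sequence is attached.
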
>
> Am I missing something?
>
> Hilmar
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@bioperl.org
> http://bioperl.org/mailman/listinfo/bioperl-l

-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<birney@ebi.ac.uk>.
-----------------------------------------------------------------