Abhi/Josh - Please give me a shout if LocatableSeq throws any exceptions or if
you get strange coordinates. We're hoping it's well fixed, but...
thanks - Mark
----- Original Message -----
From: "Abhishek Pratap" <abhishek.vit at gmail.com>
To: "Joshua Udall" <jaudall at gmail.com>
Cc: "Chris Fields" <cjfields at illinois.edu>; <bioperl-l at lists.open-bio.org>;
"Smithies, Russell" <Russell.Smithies at agresearch.co.nz>
Sent: Wednesday, January 07, 2009 9:50 PM
Subject: Re: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
> Thanks Joshua.
> I will use it and get back to you if we have any questions here.
>
> Best,
> -Abhi
>
> On Wed, Jan 7, 2009 at 12:57 AM, Joshua Udall <jaudall at gmail.com> wrote:
>
>> Done. Let me know if you have any questions. Here are the comments I
>> included with the submission (plus a few additions):
>>
>> Attached is code to facilitate ace file IO - particularly of large ace
>> files. The code reads ace contig entries one at a time, instead of all
>> at once, in the following manner:
>> $contig = $stream->next_contig
>>
>> It will write ace files to a text file using:
>> $stream->write_contig($contig)
>>
>> General Usage:
>> my $contig_io = Bio::Assembly::ContigIO->new(
>>     -file   => $ace_filename,
>>     -format => 'ace',
>> );
>> while ( defined( my $contig = $contig_io->next_contig() ) ) {
>>     # do something here.
>> }
>>
>> The general usage above should be familiar to those using bioperl. It is
>> obviously different from AssemblyIO, which also uses a '->next' stream
>> and an ace.pm file (in the IO dir). I found that very confusing because I
>> rarely have multiple assemblies that I need to parse, so it seems like
>> overkill.
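
A minimal pure-Perl sketch of the one-contig-at-a-time idea described above. It is not the ContigIO code attached to the bug report. It assumes only that each contig record in an ACE file begins with a header line starting with "CO ". The helper name ace_contig_iterator and the simplified sample data are hypothetical, and the records it returns are raw text chunks rather than Bio::Assembly::Contig objects.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): returns a closure that yields the
# raw text of one contig record per call, and undef at end of file, so only
# one contig is held in memory at a time.
# Assumption: every contig record starts with a "CO " header line.
sub ace_contig_iterator {
    my ($fh) = @_;
    my $pending;    # "CO" line already read that belongs to the next contig

    return sub {
        my @lines;
        if ( defined $pending ) {
            push @lines, $pending;
            undef $pending;
        }
        while ( defined( my $line = <$fh> ) ) {
            if ( $line =~ /^CO\s/ ) {
                if (@lines) {          # the next contig begins: stop here
                    $pending = $line;
                    last;
                }
                push @lines, $line;    # first header seen in this call
            }
            elsif (@lines) {
                push @lines, $line;    # body of the current contig
            }
            # Lines before the first "CO" (such as the "AS" header) are
            # skipped; lines after the last contig stay in the last record.
        }
        return @lines ? join( '', @lines ) : undef;
    };
}

# Usage with a tiny, simplified ACE-like fragment read from an in-memory
# filehandle (made-up data, for illustration only).
my $ace = <<'END_ACE';
AS 2 2

CO Contig1 4 1 1 U
ACGT

CO Contig2 3 1 1 U
TTG
END_ACE

open my $fh, '<', \$ace or die "Cannot open in-memory file: $!";
my $next_contig = ace_contig_iterator($fh);
my @names;
while ( defined( my $record = $next_contig->() ) ) {
    my ($name) = $record =~ /^CO\s+(\S+)/;    # contig name from the header
    push @names, $name;
    print "$name\n";
}
close $fh;
```

Because each record is handed back before the next one is read, memory use in this sketch is bounded by the largest single contig rather than by the whole file.
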
>>
>> The main files are ContigIO.pm and the ace.pm in the ContigIO dir. I've
>> attached other files that are in the bundle too. We did this some time ago
>> and though the files have the same author info at the top, we've made a few
>> changes to them.
>>
>> Several months ago, I found that the recently discussed LocatableSeq bug
>> was causing problems for me with this code. Not imagining that I could have
>> actually found a bioperl bug myself, I made my own simple workaround by
>> adjusting the 'end' value. If the LocatableSeq bug has been fixed, this
>> module should work fine. I'm simply commenting that it is untested with
>> 1.6.
>>
>> I've also attached the files submitted to bugzilla to this message as per
>> Abhishek's request. Good luck.
>>
>> Josh
>>
>> On Tue, Jan 6, 2009 at 3:22 PM, Chris Fields <cjfields at illinois.edu> wrote:
>>
>>> Could you archive the files and attach them to a bug report (you can mark
>>> it as an enhancement request). We can take a look.
>>>
>>> http://bugzilla.open-bio.org/
>>>
>>> chris
>>>
>>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>>
>>>> Chris et al. -
>>>>
>>>> A student and I have written code to do this - write ace files as well as
>>>> parse them one entry at a time. In trying to use the Assembly::IO as it
>>>> was in 1.5, we ran into problems with large ace files containing many
>>>> entries because of file handle limit issues with the inherited
>>>> implementation DB_File. Our implementation simply reads one contig at a
>>>> time instead of first trying to slurp the whole ace into memory. I'm happy
>>>> to add it to Bioperl, but I am not sure how to do it. If I sent *.pm files
>>>> to someone, could they help me get it into bioperl? It may not be perfect
>>>> either, but it should be a good start.
>>>>
>>>> Josh
>>>>
>>>> On Tue, Jan 6, 2009 at 1:52 PM, Chris Fields <cjfields at illinois.edu> wrote:
>>>>
>>>>> Not at this time (write_assembly is not implemented). If you come up with
>>>>> code to do so let us know (patches are always welcome).
>>>>>
>>>>> chris
>>>>>
>>>>> On Jan 6, 2009, at 2:43 PM, Abhishek Pratap wrote:
>>>>>
>>>>>> Thanks that helped.
>>>>>>
>>>>>> Any method to write Ace files ?
>>>>>>
>>>>>> Thanks,
>>>>>> -Abhi
>>>>>>
>>>>>> On Tue, Jan 6, 2009 at 3:36 PM, Smithies, Russell
>>>>>> <Russell.Smithies at agresearch.co.nz> wrote:
>>>>>>
>>>>>>> Here's how I've been doing it:
>>>>>>>
>>>>>>> my $infile = "454Contigs.ace";
>>>>>>> my $parser = Bio::Assembly::IO->new(-file => $infile, -format => "ace")
>>>>>>>     or die $!;
>>>>>>> my $assembly = $parser->next_assembly;
>>>>>>>
>>>>>>> # to work with a named contig
>>>>>>> my @wanted_id = ("Contig100");
>>>>>>> my ($contig) = $assembly->select_contigs(@wanted_id) or die $!;
>>>>>>>
>>>>>>> # get the consensus
>>>>>>> my $consensus = $contig->get_consensus_sequence();
>>>>>>>
>>>>>>> # get the consensus qualities
>>>>>>> my @quality_values = @{$contig->get_consensus_quality()->qual()};
>>>>>>>
>>>>>>> hope this helps,
>>>>>>>
>>>>>>> Russell
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: bioperl-l-bounces at lists.open-bio.org
>>>>>>>> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of
>>>>>>>> Abhishek Pratap
>>>>>>>> Sent: Tuesday, 6 January 2009 6:43 p.m.
>>>>>>>> To: bioperl-l at lists.open-bio.org
>>>>>>>> Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
>>>>>>>>
>>>>>>>> Hi All
>>>>>>>>
>>>>>>>> I am looking for some code to parse the ACE file format. I have big
>>>>>>>> ACE files which I would like to trim based on a user-defined contig
>>>>>>>> name and a specific region, and then write the output to a fresh ACE
>>>>>>>> file.
>>>>>>>>
>>>>>>>> For now I am trying to tweak Bio::Assembly::IO, but it is kind of
>>>>>>>> slow. Any other alternatives or suggestions?
>>>>>>>>
>>>>>>>> Thanks All,
>>>>>>>> -Abhi
>>>>>>>>
>>>>>>>> --
>>>>>>>> -----------------------------
>>>>>>>> Abhishek Pratap
>>>>>>>> Bioinformatics Software Engineer
>>>>>>>> Institute for Genome Sciences
>>>>>>>> School of Medicine, Univ of Maryland
>>>>>>>> 801, W. Baltimore Street, Baltimore, MD 21209
>>>>>>>> Ph: (+1)-410-706-2296
>>>>>>>> www.igs.umaryland.edu/
>>>>>>>>
>>>>>>>> Chair
>>>>>>>> RSG-Worldwide
>>>>>>>> ISCB-Student Council
>>>>>>>> http://iscbsc.org/rsg
>>>>>>>>
>>>>>>>> www.bioinfosolutions.com
>>>>>>>> _______________________________________________
>>>>>>>> Bioperl-l mailing list
>>>>>>>> Bioperl-l at lists.open-bio.org
>>>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>> --
>>>> Joshua Udall
>>>> Assistant Professor
>>>> 295 WIDB
>>>> Plant and Wildlife Science Dept.
>>>> Brigham Young University
>>>> Provo, UT 84602
>>>> 801-422-9307
>>>> Fax: 801-422-0008
>>>> USA