The protein ID (CAA40621.1 in your example) is stored in
$dblink->optional_id; $dblink->primary_id only holds the nucleotide
accession.
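For example, the loop in your script could print both IDs along these lines. This is only a sketch, assuming the same BioPerl setup as your script; format_dblink and print_xrefs are names made up here, not BioPerl functions:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SeqIO;

# Hypothetical helper: render one DBLink as "DB:primary (optional)".
# For an EMBL xref the optional_id is the protein ID.
sub format_dblink {
    my ($dblink) = @_;
    my $text     = $dblink->database . ":" . $dblink->primary_id;
    my $optional = $dblink->optional_id;
    $text .= " ($optional)" if defined $optional && length $optional;
    return $text;
}

# Hypothetical driver: one output line per record, listing every
# EMBL and GO cross-reference (not just the first one).
sub print_xrefs {
    my ($file) = @_;
    my $in = Bio::SeqIO->new(-file => $file, -format => 'swiss');
    while (my $seq = $in->next_seq) {
        my @links = grep { $_->database eq 'EMBL' || $_->database eq 'GO' }
                    $seq->annotation->get_all_Annotations('dblink');
        print join("\t", "Accession: " . $seq->accession_number,
                         map { format_dblink($_) } @links), "\n";
    }
}

print_xrefs($ARGV[0]) if @ARGV;
```

Run it with the SwissProt file name as the only argument. Each cross-reference gets its own tab-separated field, so records with several GO lines show all of them.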
This is the code in Bio::SeqIO::swiss that parses each DR line into a
DBLink cross-reference:
    elsif (/^DR\s+(\S+)\;\s+(\S+)\;\s+([^;]+)[\;\.](.*)$/) {
        my $dblinkobj = Bio::Annotation::DBLink->new();
        $dblinkobj->database($1);
        $dblinkobj->primary_id($2);
        $dblinkobj->optional_id($3);
        my $comment = $4;
        if (length($comment) > 0) {
            # edit comment to get rid of leading space and trailing dot
            if ( $comment =~ /^\s*(\S+)\./ ) {
                $dblinkobj->comment($1);
            } else {
                $dblinkobj->comment($comment);
            }
        }
        $annotation->add_Annotation('dblink', $dblinkobj);
    }
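
To see which part of a DR line ends up in which field, the same regular expression can be tried on its own, without BioPerl. A small sketch; parse_dr_line is a made-up helper for illustration and returns a plain hash instead of a DBLink object:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): split one DR line into the
# fields that the parser excerpt above stores on the DBLink object.
sub parse_dr_line {
    my ($line) = @_;
    return unless $line =~ /^DR\s+(\S+)\;\s+(\S+)\;\s+([^;]+)[\;\.](.*)$/;
    my %xref = (database => $1, primary_id => $2, optional_id => $3);
    my $comment = $4;
    if (length $comment) {
        # same cleanup as the excerpt: drop leading space and trailing dot
        $xref{comment} = $comment =~ /^\s*(\S+)\./ ? $1 : $comment;
    }
    return \%xref;
}

# The DR line quoted in this thread.
my $xref = parse_dr_line('DR   EMBL; X57346; CAA40621.1; -.');
printf "%s\t%s\t%s\n", @{$xref}{qw(database primary_id optional_id)};
```

For the EMBL line from your mail this prints EMBL, X57346 and CAA40621.1, i.e. the third capture is the protein ID that ends up in optional_id.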
-jason
On Oct 27, 2004, at 11:47 AM, Anand Venkatraman wrote:
> Hi,
>
> Thanks a lot for the response.
>
> Some clarifications from my side:
>
> [1] Yes, by the EMBL tag, I actually meant the DbXref to EMBL for the
> specific SwissProt accession number. Sorry for the confusion. Let's
> say we have this line from a SwissProt record:
>
> DR   EMBL; X57346; CAA40621.1; -.
>
> By the method outlined in my code, I am able to pull up only the EMBL
> nucleotide accession number (X57346), but I am unable to get to the
> Protein Accession Number (CAA40621.1).
>
> [2] Problems with GO cross-references:
>
> I can send you a small portion of the SwissProt file -- do you want me
> to send it as an attachment or within the text of the message? Can we
> send file attachments to the mailing list?
>
> Thanks a lot.
>
> Anand
>
> Hilmar Lapp <hlapp at gmx.net> wrote:
>
> On Tuesday, October 26, 2004, at 09:44 PM, Anand Venkatraman wrote:
>> Hi,
>>
>> I am using Bioperl to parse SwissProt Records.
>>
>> The bioperl version is 1.4.
>>
>> I am having 2 problems:
>>
>> Problem 1: I am unable to get all the accession
>> numbers from the line starting with AC on the
>> SwissProt Record.
>
> Other accessions than the first are available via
> $seq->get_secondary_accessions().
>
>> Problem 2: I am also trying to get the associated
>> EMBL and GO cross-references for a given Swissprot
>> entry. The problem I am having is that
>> [a]: I am only getting the Nucleotide Id and not the
>> Protein Id from the EMBL tag and
>
> What do you mean by EMBL tag? Dbxrefs to EMBL?
>
>> [b]: In some cases, I am unable to get the GO ids.
>
> This should not happen. Can you send the accession numbers for those
> sequences, or better yet, the swissprot-formatted file with those (or a
> selection thereof) that fail?
>
> -hilmar
>
>> For
>> example, from the code below, I am only getting the GO
>> id for some records, and missing it for some. Also, if
>> a particular record has 3 or 4 lines of GO, the code
>> just captures the 1st occurrence of the GO Id (if and
>> when it does so).
>>
>> This is the code
>> -------------------------------------------------------
>> #!/usr/bin/perl -w
>> use strict;
>> use Bio::SeqIO;
>>
>> my $sp_file = shift @ARGV or die $!;
>> my $seqio_object = Bio::SeqIO->new(-file   => $sp_file,
>>                                    -format => "swiss");
>>
>> while (my $seq_object = $seqio_object->next_seq) {
>>     if ($seq_object->species->binomial =~ m/Homo sapiens/) {
>>         print "Accession: ", $seq_object->accession_number(), "\t";
>>         my $annotation = $seq_object->annotation();
>>
>>         foreach my $dblink ( $annotation->get_all_Annotations('dblink') ) {
>>
>>             if ( ( $dblink->database eq "EMBL" ) || ( $dblink->database eq "GO" ) ) {
>>                 print "\t", $dblink->database, ":", $dblink->primary_id, "\t";
>>             }
>>         }
>>     }
>>     print "\n";
>> }
>> -------------------------------------------------------
>>
>> Any suggestions,
>>
>> Thanks in advance for the help.
>>
>> Anand
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at portal.open-bio.org
>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> -------------------------------------------------------------
> Hilmar Lapp email: lapp at gnf.org
> GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
> -------------------------------------------------------------
--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu