Re: convert protein fasta stream into harsh table



On 2006-02-28, zhong.huang@xxxxxxxxx <zhong.huang@xxxxxxxxx> wrote:

> Can anyone suggest me a simple way to convert multiple sequences fasta
> (in Bio::SeqIO object) into harsh table (sequence annotation as key,
> sequence as value)?

It's a hash, not harsh (at least in English). I suppose it could be
a harsh hash.

> I want to have the harsh table %seqharsh to hold sequences like this:
>
> # my %seqharsh = ('seq1', MAAASAVSVL......',
> # 'seq2', MAPPSVFAEVPQ......,);

It's not such great style to write it this way, even though
we know what you mean. Try this instead:

# my %seqharsh = ('seq1' => 'MAAASAVSVL......',
# 'seq2' => 'MAPPSVFAEVPQ......',);

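Under use strict the quotes are required on the right-hand side
(the values in your version are missing some of theirs).
The => behaves like a comma in this context, except that it also
quotes a bareword on its left, and it looks nicer for humans.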
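
To make the point about => and the comma concrete, here is a minimal sketch that builds the same hash both ways and compares the results. The hash names %with_arrow and %with_comma are made up for illustration, and the sequence strings are just the truncated placeholders from your post, not real data.

```perl
use strict;
use warnings;

# Fat-comma form: a bareword on the left of => is quoted automatically,
# but the values on the right still need their quotes under use strict.
my %with_arrow = (
    seq1 => 'MAAASAVSVL......',
    seq2 => 'MAPPSVFAEVPQ......',
);

# Plain-comma form: every string has to be quoted explicitly.
my %with_comma = (
    'seq1', 'MAAASAVSVL......',
    'seq2', 'MAPPSVFAEVPQ......',
);

# Both lists build the same hash; only the readability differs.
for my $id ( sort keys %with_arrow ) {
    my $verdict = $with_arrow{$id} eq $with_comma{$id} ? 'same' : 'different';
    print "$id: $verdict\n";
}
```

The => form also makes it easier to spot a missing value, since each key/value pair sits visibly together.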

> My code is like this:
>
> my $seqio = new Bio::SeqIO(-format => $format,
>                            -file => $file);
>
> while ( my $seq = $seqio->next_seq ) {
>     if( $seq->alphabet ne 'protein' ) {
>         confess("Skipping non protein sequence...");
>         next;
>     }
>
>     #write code here to assign each entry into harsh %seqharsh
>     my $seqharsh{$seq->primary_id} = $seq->seq();

This line will not do what you want. In fact it will not even
compile: my declares whole variables, not a single hash element.
Remove the my and see if you get what you want.
If running under use strict, as you should, you will want to declare
my %seqharsh (or my %seqhash) before the while loop begins.
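
Putting those two fixes together, here is a rough sketch of how the loop might look. It is not tested against your data. The sub name seqs_to_hash, the 'fasta' format string and the command-line file argument are my own assumptions, and I have kept primary_id from your code (if the keys should be the identifiers from the FASTA header lines, display_id may be the accessor you want). I also swapped confess for warn, because Carp's confess dies with a backtrace, so the next after it would never be reached.

```perl
use strict;
use warnings;

# Build id => sequence pairs from any object that provides next_seq(),
# such as a Bio::SeqIO stream. (seqs_to_hash is a made-up name.)
sub seqs_to_hash {
    my ($seqio) = @_;

    my %seqhash;    # declared once, before the loop

    while ( my $seq = $seqio->next_seq ) {
        if ( $seq->alphabet ne 'protein' ) {
            # warn instead of confess, so the loop really does continue
            warn "Skipping non-protein sequence ", $seq->primary_id, "\n";
            next;
        }

        # no "my" here: we are assigning to an element of %seqhash
        $seqhash{ $seq->primary_id } = $seq->seq;
    }

    return %seqhash;
}

# Only runs when a file name is given on the command line (needs BioPerl).
# The 'fasta' format is an assumption; your script used $format.
if (@ARGV) {
    require Bio::SeqIO;
    my $seqio = Bio::SeqIO->new( -format => 'fasta', -file => $ARGV[0] );
    my %seqhash = seqs_to_hash($seqio);
    print "$_\t$seqhash{$_}\n" for sort keys %seqhash;
}
```

Note that if two records share the same id, the later one silently overwrites the earlier one in the hash.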

If you have further troubles, you should be sure to post a
short but complete script that contains your difficulty, as
well as the expected and actual results, as described in the
posting guidelines for this newsgroup.

--keith


--
kkeller-usenet@xxxxxxxxxxxxxxxxxxxxxxxxxx
(try just my userid to email me)
AOLSFAQ=http://wombat.san-francisco.ca.us/cgi-bin/fom
see X- headers for PGP signature information
