Re: Calculating values of a 2d array by comparison of 2 strings



"Pepetideo" <pepetideo@xxxxxxxxxxx> wrote:
anno4000@xxxxxxxxxxxxxxxxxxxxxx wrote:
Pepetideo <pepetideo@xxxxxxxxxxx> wrote in comp.lang.perl.misc:
Hi. I am having a hard time dealing with a problem, and I was wondering
if anyone has a suggestion as to how I should do this.

I have 2 strings that I want to compare. These strings are composed of
amino acid residues (long sequences of any of 20 characters). The
strings are aligned to each other, and I want to compare each position of
the two strings and fill up a 21*21 matrix so that, e.g.:

AAAAAA....
ATLL_Y....

1 pos: A-A -> counts 1 and adds to position [0][0] on the matrix
2 pos: A-T -> counts 1 and adds to position [0][1] on the matrix
3 pos: A-L -> counts 1 and adds to position [0][2] on the matrix
4 pos: A-L -> counts 1 and adds to position [0][2] on the matrix
5 pos: A-_ -> counts 1 and adds to position [0][21] on the matrix
6 pos: A-Y -> counts 1 and adds to position [0][10] on the matrix
...

A friend suggested using a switch, but this ends up needing
441 cases of assignments in order to fill up every position of the matrix.

Does anyone know of a better way of dealing with this? I appreciate any
suggestions. Thanks

Probably. Use a two-dimensional hash to count the combinations
directly, without mapping the letters to array indices. Here is how:

my $str1 = 'AAAAAA';
my $str2 = 'ATLL_Y';

my %count;

my @str1 = split //, $str1;
my @str2 = split //, $str2;

for my $ch1 ( @str1 ) {
    my $ch2 = shift @str2;
    ++ $count{ $ch1}->{ $ch2};
}

# example:
print "The count for 'A' and 'L' is $count{ A}->{ L}\n";

If you do need the array form you can generate it from the count hash.
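
A minimal sketch of that conversion, assuming a fixed residue order. The
order in @alphabet below is made up for illustration (20 amino acids plus
'_' for a gap, which lands at index 20 of a 21*21 matrix), and %count is
filled in by hand with the counts from the example strings. Two nested
loops over the alphabet replace the individual assignments. The // operator
needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Assumed residue order (hypothetical): 20 amino acids plus '_' for a gap.
my @alphabet = split //, 'ATLCDEFGHIYKMNPQRSVW_';

# Example counts, as the counting loop would produce for 'AAAAAA' vs 'ATLL_Y'.
my %count = ( A => { A => 1, T => 1, L => 2, '_' => 1, Y => 1 } );

my @matrix;
for my $i ( 0 .. $#alphabet ) {
    for my $j ( 0 .. $#alphabet ) {
        # Pairs that never occurred are missing from the hash; store 0 for them.
        $matrix[$i][$j] = $count{ $alphabet[$i] }{ $alphabet[$j] } // 0;
    }
}

print "matrix[0][2] = $matrix[0][2]\n";    # the A-L cell, prints 2
```

Because every sequence pair goes through the same @alphabet, the matrix
positions stay the same from one pair to the next.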


Thank you very much for the suggestion... I think this is a good idea...
My doubt is that I am doing this for many sequences and the array
positions have to be the same... In order to convert the count hash to
the matrix, would I not need to create 441 assignment statements like:

$matrix[0][2] = $count{A}->{L};


Maybe not... Here's a variation on Anno's method. I assume you have a list
of characters; from your example, we know that A is 0, T is 1, etc. Make a
string of the characters where each character's position in the string is
the same as its index in the matrix. From the string of characters, make a
hash that maps each character to its index. Then count in @matrix.

I think this is the same as Anno's idea of generating the matrix from the
count hash, but it shortcuts the creation of the hash using the characters
from your amino acid residues.

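# '-' is a placeholder for the residues not shown in the example;
# replace each one with the real character for that position.
my $chars = 'ATL-------Y----------_';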

my $i = 0;
my %chNdx = map { $_ => $i++ } split //, $chars;

my $str1 = 'AAAAAA';
my $str2 = 'ATLL_Y';

my @matrix;

my @ndx1 = map { $chNdx{$_} } (split //, $str1);
my @ndx2 = map { $chNdx{$_} } (split //, $str2);

for my $i (@ndx1) {
    my $j = shift @ndx2;
    ++ $matrix[$i][$j];
}

Mark
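
For counting over many sequence pairs, the index-hash approach above can be
wrapped in a subroutine. This is only a sketch under stated assumptions: the
residue order in $chars is made up for illustration (20 amino acids plus '_'
for a gap, so the gap is index 20 of a 21*21 matrix), and count_pairs is a
hypothetical helper name, not something from the posts above. It walks the
strings by position instead of shifting, and dies on a length mismatch or an
unknown character.

```perl
use strict;
use warnings;

# Assumed residue order (hypothetical): 20 amino acids plus '_' for a gap.
my $chars = 'ATLCDEFGHIYKMNPQRSVW_';
my $size  = length $chars;

# Map each character to its row/column index in the matrix.
my $n = 0;
my %chNdx = map { $_ => $n++ } split //, $chars;

# Start with a matrix of zeros so that unseen pairs read as 0.
my @matrix = map { [ (0) x $size ] } 1 .. $size;

# Add the pair counts for one aligned pair of sequences to the matrix.
sub count_pairs {
    my ( $matrix, $seq1, $seq2 ) = @_;
    die "sequences differ in length\n"
        unless length($seq1) == length($seq2);

    for my $pos ( 0 .. length($seq1) - 1 ) {
        my $i = $chNdx{ substr( $seq1, $pos, 1 ) };
        my $j = $chNdx{ substr( $seq2, $pos, 1 ) };
        die "unknown character at position $pos\n"
            unless defined $i && defined $j;
        ++$matrix->[$i][$j];
    }
    return;
}

count_pairs( \@matrix, 'AAAAAA', 'ATLL_Y' );
count_pairs( \@matrix, 'ATL',    'ATL' );

print "A-A: $matrix[0][0]\n";     # prints 2 (one from each pair)
print "A-L: $matrix[0][2]\n";     # prints 2
print "A-_: $matrix[0][20]\n";    # prints 1
```

Since the same %chNdx is used for every call, a given pair of residues
always lands in the same cell, whichever sequences it came from.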
