Re: How to put brackets in a string given substrings
- From: ewijaya@xxxxxxxxxxxxxx (Edward WIJAYA)
- Date: Fri, 28 Oct 2005 18:18:05 +0800
Dear John,
Thanks so much for your life-saving response. There is one minor issue I still couldn't solve.
The region bounded by a substring in the array may occur more than once in the string. (See examples 7 and 8 in my code below.)
To disambiguate, I can supply each substring in the array together with its index.
I tried to modify your code below to handle this, but I still cannot get it to work. I think I'm almost there but not quite.
Could you advise how I should go about it? Thanks so much in advance. I really hope to hear from you again.
__BEGIN__
my $t1 = 'CCCATCTGTCCTTATTTGCTG';
my @ar1 = qw(ATCTG-3 ATTTG-13);
my $t2 = 'ACCCATCTGTCCTTGGCCAT';
my @ar2 = qw(CCATC-2);
my $t3 = 'CCACCAGCACCTGTC';
my @ar3 = qw(CCACC-0 CCAGC-3 GCACC-6);
my $t4 = 'CCCAACACCTGCTGCCT';
my @ar4 = qw(CCAAC-1 ACACC-4);
my $t5 = 'CTGGGTATGGGT';
my @ar5 = qw(GTATG-4 TGGGT-1);
my $t6 = 'AGGAACTTGCCTGTACCACAGGAAG';
my @ar6 = qw( CAGGA-18 AGGAA-19 );
# The above examples should yield the same results as before.
# These two examples below are the 'ambiguous' cases.
my $t7 = 'CAGGACTTGCCTGTACCACAGGAAG';
my @ar7 = qw( CAGGA-18 );
my $t8 = 'CAGGATTTGAGGAAGTACCACAGGAAG';
my @ar8 = qw( CAGGA-18 AGGAA-19 );
# Answer 7   -- CAGGACTTGCCTGTACCA[CAGGA]AG
# Instead of -- [CAGGA]CTTGCCTGTACCACAGGAAG
# Answer 8   -- CAGGATTTGAGGAAGTACCA[CAGGAA]G
# Instead of -- [CAGGA]TTTG[AGGAA]GTACCACAGGAAG
print put_bracket_jk_idx($t8,\@ar8),"\n";
sub put_bracket_jk_idx {
    my ( $str, $ar ) = @_;
    for my $subs ( @$ar ) {
        my ($sb,$id) = split("-",$subs);
        print "$sb $id\n";
        if ( substr( $str, $id ) =~ /$subs/i ) {
            $id += $-[ 0 ];
            substr( $str, $id, length $subs ) =~ tr/A-Z/a-z/;
        }
    }
    $str =~ s/([a-z]+)/[\U$1\E]/g;
    return $str;
}
print "\n";
__END__
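
A note on the code above, with one possible way to handle the indexed entries. In put_bracket_jk_idx, both the regex and the length call use $subs (the whole "CAGGA-18" entry) rather than $sb, so the pattern can never match a plain sequence. The sketch below is only an illustration under two assumptions: the number after the dash is a 0-based offset, and it is treated as a lower bound for the search rather than an exact position (in example 8, CAGGA actually starts at offset 20, not 18). The name put_bracket_from_offset and the coverage-array approach are hypothetical and are not taken from John's code.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): bracket every region of $str
# covered by "SUBSTRING-OFFSET" entries. Each substring is searched for at
# or after its stated 0-based offset (assumption: offset is a lower bound).
sub put_bracket_from_offset {
    my ( $str, $entries ) = @_;
    my @covered = (0) x length $str;    # 1 where a character is inside a match

    for my $entry (@$entries) {
        my ( $sub, $offset ) = split /-/, $entry;
        my $pos = index( $str, $sub, $offset );    # first hit at or after $offset
        next if $pos < 0;                          # not found: skip this entry
        $covered[$_] = 1 for $pos .. $pos + length($sub) - 1;
    }

    # Walk the string once, opening and closing brackets at coverage boundaries.
    # Overlapping or touching matches end up inside a single pair of brackets.
    my ( $out, $inside ) = ( '', 0 );
    for my $i ( 0 .. length($str) - 1 ) {
        if ( $covered[$i] && !$inside ) { $out .= '['; $inside = 1 }
        if ( !$covered[$i] && $inside ) { $out .= ']'; $inside = 0 }
        $out .= substr( $str, $i, 1 );
    }
    $out .= ']' if $inside;
    return $out;
}

print put_bracket_from_offset( 'CAGGACTTGCCTGTACCACAGGAAG', ['CAGGA-18'] ), "\n";
# prints CAGGACTTGCCTGTACCA[CAGGA]AG
print put_bracket_from_offset( 'CAGGATTTGAGGAAGTACCACAGGAAG', [ 'CAGGA-18', 'AGGAA-19' ] ), "\n";
# prints CAGGATTTGAGGAAGTACCA[CAGGAA]G
```

Because the input string is never modified, no case-insensitive matching is needed here. If the offsets are meant to be exact positions instead, the index() search could be replaced by a direct comparison at that offset.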
-- Regards, Edward WIJAYA SINGAPORE
On Fri, 28 Oct 2005 18:42:11 +0800, John W. Krahn <krahnj@xxxxxxxxx> wrote:
Wijaya Edward wrote:--
I have the following problem.
I am trying to put brackets in a string given a set of its substrings.
The bracketed regions are "bounded" by the given substrings.
For example, given the input "String" and its "Substrings":
String:
1. CCCATCTGTCCTTATTTGCTG
2. ACCCATCTGTCCTTGGCCAT
3. CCACCAGCACCTGTC
4. CCCAACACCTGCTGCCT
5. CTGGGTATGGGT
6. AGGAACTTGCCTGTACCACAGGAAG

Substrings:
1. ATCTG ATTTG
2. CCATC
3. CCACC CCAGC GCAAC
4. CCAAC ACACC
5. GTATG TGGGT
6. CAGGA AGGAA

The desired answers are:
1. CCC[ATCTG]TCCTT[ATTTG]CTG
2. AC[CCATC]TGTCCTTGGCCAT
3. [CCACCAGCACC]TGTC *
4. C[CCAACACC]TGCTGCCT *
5. CTGG[GTATGGGT] **
6. AGGAACTTGCCTGTACCA[CAGGAA]G **

Please note that in examples 3 and 4 the substrings are "overlapping".
Note also that in examples 5 and 6 some substrings occur twice,
so the answers for examples 5 and 6 are NOT:
5. C[TGGGTATGGGT] ---- this is wrong
6. [AGGAA]CTTGCCTGTACCA[CAGGAA]G ---- this is wrong
Regards, Edward WIJAYA SINGAPORE