'medium' reg exp greediness?
- From: alancam73@xxxxxxxxx (Alan Campbell)
- Date: Sat, 29 Apr 2006 04:48:59 -0700 (PDT)
Hello folks,
I'm trying to write a regular expression with 'medium' greediness. Here's what I mean: I need to match every DW_TAG_TI_reserved die and delete it. In the sample below, that is the die with id='0x8903' together with the die nested inside it (id='0x8951').
<die id='0x157'>
<tag>DW_TAG_TI_assign_register</tag>
<attribute>
<type>DW_AT_location</type>
<value>
<block>DW_OP_reg0</block>
</value>
</attribute>
</die>
<die id='0x8903'>
<tag>DW_TAG_TI_reserved_1</tag>
<attribute>
<type>DW_AT_name</type>
<value>
<string>C:\DOCUME~1\A0741153\LOCALS~1\Temp\TI1564:L2:2:1088629783</string>
</value>
</attribute>
<die id='0x8951'>
<tag>DW_TAG_TI_reserved_2</tag>
<attribute>
<type>DW_AT_low_pc</type>
<value>
<addr>0x1b84</addr>
</value>
</attribute>
</die>
</die>
<die id='0x130'>
<tag>DW_TAG_variable</tag>
<attribute>
<type>DW_AT_name</type>
<value>
<string>TSK_thingyIneedToKeepThis</string>
</value>
</attribute>
</die>
I did the following for killing DW_TAG_TI_assign_register.
$all_lines =~ s/<die id\S*>\s*<tag>DW_TAG_TI_assign_register.*?<\/die>//sg;
That worked fine. But the DW_TAG_TI_reserved dies are nested, so I need 'medium' greediness. A greedy .* (where . also matches newlines because of /s) goes too far: it grabs everything up to the last </die> in the input, which kills stuff I need to keep. But the lazy .*? stops too early: it doesn't handle the nesting and only removes text up to the first </die>, leaving the tail of the outer die behind.
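To make the two failure modes concrete, here is a minimal sketch that runs both substitutions on a trimmed-down input. The sample data is a hypothetical reduction of the listing above (attributes removed), and the variable names $lazy and $greedy are chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical, trimmed-down input: one nested reserved die,
# followed by a die that must be kept.
my $xml = <<'END';
<die id='0x8903'>
<tag>DW_TAG_TI_reserved_1</tag>
<die id='0x8951'>
<tag>DW_TAG_TI_reserved_2</tag>
</die>
</die>
<die id='0x130'>
<tag>DW_TAG_variable</tag>
</die>
END

# Lazy quantifier: stops at the first </die>, so the closing tag
# of the outer reserved die is left behind.
(my $lazy = $xml) =~ s/<die id\S*>\s*<tag>DW_TAG_TI_reserved.*?<\/die>//sg;

# Greedy quantifier: runs to the last </die> in the input, so the
# DW_TAG_variable die is deleted as well.
(my $greedy = $xml) =~ s/<die id\S*>\s*<tag>DW_TAG_TI_reserved.*<\/die>//sg;

print "lazy leaves:\n$lazy\n";
print "greedy leaves:\n$greedy\n";
```

Neither quantifier can count opening and closing tags, which is why no choice between .* and .*? gives the wanted result here.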
To further complicate life, I can't guarantee the depth of nesting.
Any ideas on how best to write a regular expression for this? Or do I just need to improve or narrow my search pattern?
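One possible direction, sketched here under assumptions rather than offered as the list's answer, is to stop relying on a single substitution and track the nesting depth explicitly. The helper name strip_dies and its arguments are hypothetical. It assumes that every <die> is properly closed and that the <tag> element comes immediately after the opening <die ...>, as in the sample above.

```perl
use strict;
use warnings;

# strip_dies($text, $tag_prefix): hypothetical helper, not from the original post.
# Removes every <die> element whose first <tag> starts with $tag_prefix,
# together with any <die> elements nested inside it, at any depth.
sub strip_dies {
    my ($text, $tag_prefix) = @_;
    my $out   = '';
    my $depth = 0;    # greater than 0 while inside a die being removed

    # Split on <die ...> and </die>; the capture group keeps the tags as pieces.
    my @pieces = split /(<die\b[^>]*>|<\/die>)/, $text;

    for my $i (0 .. $#pieces) {
        my $piece = $pieces[$i];

        if ($depth > 0) {
            $depth++ if $piece =~ /^<die\b/;
            $depth-- if $piece eq '</die>';
            next;     # drop everything in the removed die, closing tag included
        }

        if ($piece =~ /^<die\b/) {
            # Look at the text right after the opening tag to decide.
            my $following = $i < $#pieces ? $pieces[$i + 1] : '';
            if ($following =~ /^\s*<tag>\Q$tag_prefix\E/) {
                $depth = 1;
                next;
            }
        }

        $out .= $piece;
    }
    return $out;
}

# Usage on a hypothetical, trimmed-down version of the sample data.
my $xml = <<'END';
<die id='0x8903'>
<tag>DW_TAG_TI_reserved_1</tag>
<die id='0x8951'>
<tag>DW_TAG_TI_reserved_2</tag>
</die>
</die>
<die id='0x130'>
<tag>DW_TAG_variable</tag>
</die>
END

my $cleaned = strip_dies($xml, 'DW_TAG_TI_reserved');
print $cleaned;
```

Because the counter goes up on every opening <die> and down on every </die>, the depth of nesting does not need to be known in advance. For input that is less regular than this sample, a real XML parser would likely be more robust than any pattern-based approach.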
Many thanks indeed.
cheers, Alan