Re: Regular Expression for XML Parsing



tushar.saxena@xxxxxxxxx <tushar.saxena@xxxxxxxxx> wrote:

> I have a set of XML files from which I need to extract some data. The
> format of the file is as follows:
>
> <tag1>
> <tag3>DATA1</tag3>
> </tag1>
>
> <tag2>
> <tag3>DATA2</tag3>
> </tag2>


I thought you said you had an XML file.

That is not a valid XML file: it has two top-level elements, and a
well-formed XML document must have exactly one root element.


> I need to extract the DATA part of the xml structure
>
> Note: tag3 can be contained either within tag1 or tag2, but I need to
> extract data only from tag1, i.e. DATA1 should be extracted, but not
> DATA2
>
> If I want to get both DATA1 and DATA2 I can use a simple regex like:


Using a regular expression to "parse" a non-regular language is
fraught with peril, and nearly always a Bad Idea.

Use a module that understands XML for processing XML data.
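
To make the peril concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the <root> wrapper, the commented-out block, and the pattern itself (which is a hypothetical naive regex, not the one from the original post). It shows a pattern that has no notion of parent elements or comments, so it picks up more than was wanted.

```perl
use strict;
use warnings;

# Hypothetical sample data: the fragment from the post, wrapped in an
# assumed <root> element, plus a commented-out block.
my $xml = <<'XML';
<root>
  <tag1><tag3>DATA1</tag3></tag1>
  <!-- <tag1><tag3>OLD</tag3></tag1> -->
  <tag2><tag3>DATA2</tag3></tag2>
</root>
XML

# Hypothetical naive pattern: it matches every tag3, whatever its parent,
# and it cannot tell that one of them sits inside a comment.
my @naive = $xml =~ m{<tag3>(.*?)</tag3>}g;

print join( ', ', @naive ), "\n";
```

This prints "DATA1, OLD, DATA2": the tag2 data and the commented-out data both come along for the ride. An XML parser handles both cases without any extra work.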


> Any help would be appreciated!


Assuming that you have actual valid XML in $xml, then:

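    use strict;
    use warnings;
    use XML::Simple;

    # ForceArray makes tag1 an array ref even when it occurs only once
    my $ref = XMLin( $xml, ForceArray => ['tag1'] );
    foreach my $child ( @{ $ref->{tag1} } ) {
        print "$child->{tag3}\n";
    }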
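
If XML::LibXML is installed, an XPath expression is another way to say "tag3, but only inside tag1". This is a sketch rather than the one true method, and it assumes the fragment from the post is wrapped in a single root element (called <root> here purely for illustration).

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical input: the fragment from the post inside an assumed <root>.
my $xml = <<'XML';
<root>
  <tag1>
    <tag3>DATA1</tag3>
  </tag1>
  <tag2>
    <tag3>DATA2</tag3>
  </tag2>
</root>
XML

my $doc = XML::LibXML->load_xml( string => $xml );

# Select only those tag3 elements whose parent is a tag1 element.
my @data = map { $_->textContent } $doc->findnodes('//tag1/tag3');

print "$_\n" for @data;
```

With the sample above this prints DATA1 only. The selection logic lives in the XPath string, so changing which parent you care about means editing one expression, not the loop.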


--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"