XML::LibXML navigation



Hi,

I have to do some sanity checks on a large XML file of addresses
(snippet below). I have been using XML::LibXML and seem to have started
OK, but I am struggling to navigate around a record.

In the sample data below you'll see some addresses with "DO NOT..."
in them. I can locate them easily enough, but I am struggling to navigate
back up the DOM to access the code so I can record the code with
faulty addresses.

Here is my effort. Can anyone help me either to move back up to the
right element node or to catch the code node before I begin to loop
through the address line(s)?

TIA,
Dp.


======= My Effort ==========
#!/usr/bin/perl

use strict;
use warnings;
use XML::LibXML;

my $file = 'ADDRESS.XML';
open(FH,$file) or die "Can't open file $file: $!\n";

my $parser = XML::LibXML->new;
my $doc = $parser->parse_fh(\*FH);

my @results = $doc->findnodes('//address');

foreach my $i (@results) {
    my @addlines = $i->findnodes('//line');
    foreach my $l (@addlines) {
        if ($l->string_value =~ /\s+NOT\s+/) {
            my $p = $i->nodePath;
            $p .= '/code';
            print $p->nodeValue,"\t";
            print $l->string_value, "\t";
            print $l->string_value, "\n";
        }
    }

}
=============================
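
For comparison, here is a minimal sketch of the second option (catching the code node before looping through the lines). It is one possible approach, not a tested fix for the full file. Three things in it are assumptions rather than part of the effort above: the helper name flagged_addresses is made up, the inline sample is wrapped in a hypothetical <addresses> root element, and the pattern is \bNOT\b because /\s+NOT\s+/ would not match a line such as "****NOT AT THIS ADDRESS", where no whitespace precedes NOT. The sketch relies on two points: nodePath returns a plain string, so nodeValue cannot be called on it, and an XPath starting with '//' searches from the document root even when called on an address node.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: returns [code, line text] pairs for every
# address line whose text matches $pattern.
sub flagged_addresses {
    my ($doc, $pattern) = @_;
    my @hits;
    foreach my $address ($doc->findnodes('//address')) {
        # Read the code once per address, before looping over its lines.
        my $code = $address->findvalue('./code');
        # The leading "./" keeps the search inside this address node;
        # '//line' would search the whole document again.
        foreach my $line ($address->findnodes('./lines/line')) {
            my $text = $line->textContent;
            push @hits, [$code, $text] if $text =~ $pattern;
        }
    }
    return @hits;
}

# Cut-down sample data; the <addresses> root element is an assumption.
my $xml = <<'XML';
<addresses>
  <address number="1016">
    <code>B679OOO00</code>
    <lines>
      <line>DO NOT USE THIS CODE</line>
    </lines>
  </address>
  <address number="1014">
    <code>P982LUS00</code>
    <lines>
      <line>UPPER HOUSE FARM</line>
      <line>BACTON</line>
    </lines>
  </address>
  <address number="1333">
    <code>A234ULE00</code>
    <lines>
      <line>QUEENS HOUSE</line>
      <line>****NOT AT THIS ADDRESS ANY MORE.</line>
    </lines>
  </address>
</addresses>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

foreach my $hit (flagged_addresses($doc, qr/\bNOT\b/)) {
    my ($code, $text) = @$hit;
    $text =~ s/\s+/ /g;    # collapse whitespace from wrapped lines
    print "$code\t$text\n";
}
```

To go the other way and move back up from a matching line, $line->parentNode->parentNode should return the enclosing address element in this layout, and $line->findvalue('ancestor::address[1]/code') should reach the code with a single XPath expression.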

=========== Sample Data ==========
<?xml version = "1.0" encoding= "utf-8"?>
....snip
<address number="1016">
<code>B679OOO00</code>
<record_type>client</record_type>
<address_type>shipping</address_type>
<Postcode></Postcode>
<Country>GBR</Country>
<lines>
<line>DO NOT USE THIS CODE</line>
</lines>
</address>
<address number="1014">
<code>P982LUS00</code>
<record_type>client</record_type>
<address_type>shipping</address_type>
<Postcode>HR2 0AU</Postcode>
<Country>GBR</Country>
<lines>
<line>UPPER HOUSE FARM</line>
<line>BACTON</line>
<line>ESSEX</line>
<line>EX2 0AU</line>
</lines>
</address>
<address number="1333">
<code>A234ULE00</code>
<record_type>client</record_type>
<address_type>shipping</address_type>
<Postcode></Postcode>
<Country>AND</Country>
<lines>
<line>QUEENS HOUSE</line>
<line>1 BUCKINGHAM PALACE</line>
<line>LONDON WC2H</line>
<line>****NOT AT THIS ADDRESS ANY MORE.</line>
<line>***************</line>
</lines>
</address>
<address number="1018">
<code>A&amp;MPUB00</code>
<record_type>client</record_type>
<address_type>shipping</address_type>
<Postcode>PO19 8SQ</Postcode>
<Country>GBR</Country>
<lines>
<line>THE ATRIUM</line>
<line>SOUTHERN GATE</line>
<line>CHICHESTER</line>
<line>SUSSEX</line>
<line>PO19 8SQ</line>
</lines>
</address>
