Parsing XML data as it arrives from LWP call

From: Steve B (sborruso_at_austin.rr.com)
Date: 01/11/05


Date: Tue, 11 Jan 2005 00:57:33 GMT

Greetings,

I am trying to improve the performance of a Perl application that uses LWP
to request XML data from eBay through their prescribed developer API and
then parses it with XML::Node.

I would like to parse the XML as it arrives, rather than waiting for the
whole response to arrive as my code does now.

The existing high-level flow is as follows -

1) Build the request XML input parms
2) Request the XML from eBay
3) Register the XML::Node tag/variable names to save and the subroutines to
   call during the XML::Node parse
4) Parse the returned XML data using the Perl XML::Node package; XML::Node
   calls the registered subroutines based on XML start/end structures

I have another LWP routine that parses for image data as it arrives. It
requests a page and parses all the <img> tags but I'm having difficulty
converting the XML::Node routine to use this same design.

This is my attempt so far to redesign my XML::Node parsing so that the
parse runs while data is being returned -

$posturl = 'https://api.' . $eBayURL . '/ws/api.dll';
   sub LWPcallback {
      $XML_reply = $_[0]->content;
      # ?????????
   }
$objUserAgent = LWP::UserAgent->new; # Create user agent
# Build the request
$objRequest = HTTP::Request->new("POST", $posturl, $objHeader, $request);
$xml_node = XML::Node->new(\&LWPcallback); # Define the parser
# Register XML tag/variables to save, and subroutines to call, during the parse
   $xml_node->register(">eBay>SellerList>Item>Id", "char" => \$api_itemnum);
   $xml_node->register(">eBay>SellerList>Item>SiteId", "char" => \$api_siteid);
   $xml_node->register(">eBay>SellerList>Item", "end" => \&handle_item_end);
   # etc.
# Issue request and parse response
$objResponse = $objUserAgent->request($objRequest,
   sub { $xml_node->parse($_[0]) });

I'm confused about what/when/where the parsing will take place in this
scenario and have the following initial questions -

1) What processing needs to take place in the LWPcallback subroutine?
2) Do I need to re-register the XML::Node tag/variables and subroutines in
   the callback?

This is the working image parse code that I based the above design on -

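use LWP::UserAgent;
use HTML::LinkExtor;
use URI::URL;

$ua = LWP::UserAgent->new;
@imgs = ();
# Called by HTML::LinkExtor for each link-bearing tag it finds
sub LWPcallback {
   my($tag, %attr) = @_;
   return if $tag ne 'img'; # we only look closer at <img ...>
   push(@imgs, values %attr);
}
$LWP_p = HTML::LinkExtor->new(\&LWPcallback);
# Feed each chunk to the parser as it arrives
$res = $ua->request(HTTP::Request->new(GET => $LWP_itemurl),
   sub { $LWP_p->parse($_[0]) });
# Expand relative image URLs to absolute ones
my $base = $res->base;
@imgs = map { $_ = url($_, $base)->abs; } @imgs;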
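
For comparison, here is a minimal sketch of the same chunk-by-chunk pattern
applied to XML. It does not use XML::Node; it assumes the non-blocking
interface of XML::Parser (parse_start, parse_more, parse_done) is an
acceptable substitute. The sample document, @chunks, @item_ids and the
handler bodies are hypothetical and only stand in for the eBay response and
the registered variables.

```perl
use strict;
use warnings;
use XML::Parser;

my @item_ids;    # hypothetical: collected Item Id values
my @path;        # stack of currently open element names
my $text = '';   # character data of the innermost open element

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub { my ($expat, $tag) = @_; push @path, $tag; $text = ''; },
        Char  => sub { my ($expat, $str) = @_; $text .= $str; },
        End   => sub {
            my ($expat, $tag) = @_;
            # Save the text when the element of interest closes
            push @item_ids, $text
                if join('>', @path) eq 'eBay>SellerList>Item>Id';
            pop @path;
            $text = '';
        },
    },
);

# parse_start returns a non-blocking parser that accepts partial input
my $nb = $parser->parse_start;

# Hypothetical chunks, split mid-tag and mid-text as a network read might be
my @chunks = (
    '<eBay><SellerList><Item><I',
    'd>123</Id></Item><Item><Id>4',
    '56</Id></Item></SellerList></eBay>',
);
$nb->parse_more($_) for @chunks;
$nb->parse_done;

print join(',', @item_ids), "\n";
```

With LWP, the loop over @chunks would presumably be replaced by a content
callback such as sub { $nb->parse_more($_[0]) } passed as the second argument
to $ua->request, with parse_done called once the request returns. The
handlers are set up once, before the request is issued.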

My environment -

OS - Red Hat EL 3 (Intel)
Perl Version - v5.8.1 built for i686-linux
Perl Modules -
perl-libwww-perl-5.65-6
>> rpm -qa | grep XML
perl-XML-Dumper-0.4-25
perl-XML-Twig-3.09-3
perl-XML-Encoding-1.01-23
perl-XML-Grove-0.46alpha-25
PyXML-0.7.1-9
perl-XML-Parser-2.31-15

Any assistance or advice is most appreciated.

Thanks,
Steve


