Parsing XML data as it arrives from LWP call
From: Steve B (sborruso_at_austin.rr.com)
Date: 01/11/05
Date: Tue, 11 Jan 2005 00:57:33 GMT
Greetings,
I am trying to improve the performance of a Perl application that uses LWP
to request XML data from eBay through their prescribed developer API and
then parses it with XML::Node.
I would like to parse the XML as it is being returned, rather than waiting
for all of the data to arrive, which is how my code is designed now.
The existing high-level flow is as follows -
1) Build the request XML input parms
2) Request the XML from eBay
3) Register XML::Node XML tag/variable names to save and subroutines to call during the XML::Node parse
4) The registered subroutines are called by XML::Node based on XML start/end structures
5) Parse the returned XML data using the Perl XML::Node package
I have another LWP routine that parses image data as it arrives. It
requests a page and extracts all the <img> tags, but I'm having difficulty
converting the XML::Node routine to use the same design.
This is my attempt so far at redesigning my XML::Node parsing to perform the
parse while data is being returned -
$posturl = 'https://api.' . $eBayURL . '/ws/api.dll';
sub LWPcallback {
    $XML_reply = @_->content;
    ?????????
}
$objUserAgent = LWP::UserAgent->new; # Create user agent
$objRequest = HTTP::Request->new("POST", $posturl, $objHeader, $request); # Build the request
$xml_node = XML::Node->new(\&LWPcallback); # Define the parser
# Register XML tag/variables to save, and subroutines to call, during the parse
$xml_node->register(">eBay>SellerList>Item>Id","char" => \$api_itemnum);
$xml_node->register(">eBay>SellerList>Item>SiteId","char" => \$api_siteid);
$xml_node->register(">eBay>SellerList>Item","end" => \&handle_item_end);
etc.
$objResponse = $objUserAgent->request($objRequest, sub{$xml_node->parse($_[0])}); # Issue request and parse response
I'm confused about what parsing will take place in this scenario, and when
and where it happens. My initial questions are -
1) What processing needs to take place in the LWPcallback subroutine?
2) Do I need to re-register the XML::Node tag/variables and subroutines in
the callback?
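For reference, here is a minimal sketch of what parsing chunk by chunk could look like. It rests on assumptions: it uses XML::Parser's non-blocking interface (parse_start, parse_more, parse_done) directly instead of XML::Node, because I have not confirmed that XML::Node's parse() accepts partial documents. The chunk strings stand in for the data LWP would hand to its content callback, and make_incremental_parser and the sample XML are hypothetical, for illustration only.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical helper: returns a non-blocking parser that pushes the text of
# every <Id> element onto the array referenced by $ids.
sub make_incremental_parser {
    my ($ids) = @_;
    my ($in_id, $text) = (0, '');
    my $parser = XML::Parser->new(
        Handlers => {
            Start => sub {
                my ($expat, $tag) = @_;
                if ($tag eq 'Id') { $in_id = 1; $text = ''; }
            },
            # Char may fire more than once per element, so accumulate.
            Char => sub {
                my ($expat, $str) = @_;
                $text .= $str if $in_id;
            },
            End => sub {
                my ($expat, $tag) = @_;
                if ($tag eq 'Id') { push @$ids, $text; $in_id = 0; }
            },
        },
    );
    return $parser->parse_start;    # an XML::Parser::ExpatNB object
}

my @ids;
my $nb = make_incremental_parser(\@ids);

# Simulated network chunks; note that they split tags and text mid-token.
my @chunks = (
    '<eBay><SellerList><Item><I',
    'd>123</Id></Item><Item><Id>4',
    '56</Id></Item></SellerList></eBay>',
);
$nb->parse_more($_) for @chunks;
$nb->parse_done;

print "Item ids: @ids\n";
```

With LWP, the loop over @chunks would presumably be replaced by passing sub { $nb->parse_more($_[0]) } as the second argument to $objUserAgent->request, with $nb->parse_done called once the request returns. In this sketch the handlers are set up once, before any data arrives, and nothing is re-registered per chunk.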
This is the working image parse code that I based the above design on -
$ua = new LWP::UserAgent;
@imgs = ();
sub LWPcallback {
    my($tag, %attr) = @_;
    return if $tag ne 'img'; # we only look closer at <img ...>
    push(@imgs, values %attr);
}
$LWP_p = HTML::LinkExtor->new(\&LWPcallback);
$res = $ua->request(HTTP::Request->new(GET => $LWP_itemurl), sub{$LWP_p->parse($_[0])});
my $base = $res->base;
@imgs = map { $_ = url($_, $base)->abs; } @imgs;
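As far as I understand it, the image code works because HTML::LinkExtor inherits from HTML::Parser, whose parse() method can be called repeatedly with successive pieces of a document. The sketch below illustrates that behaviour without any network access. The chunk strings and file names are made up, and they stand in for what LWP passes to the content callback.

```perl
use strict;
use warnings;
use HTML::LinkExtor;

my @imgs;

# Same callback shape as above: the tag name followed by its link attributes.
my $extor = HTML::LinkExtor->new(sub {
    my ($tag, %attr) = @_;
    return if $tag ne 'img';    # only <img ...> tags are of interest
    push @imgs, values %attr;
});

# Simulated chunks; the first <img> tag is split across two of them.
my @chunks = (
    '<html><body><img sr',
    'c="a.png"><a href="x.html">x</a>',
    '<img src="b.png"></body></html>',
);
$extor->parse($_) for @chunks;
$extor->eof;    # signal end of document so buffered text is flushed

print "Images: @imgs\n";
```

The parser object keeps its state between calls, so the callback is registered once and fires whenever a complete tag has been seen, no matter how the input was split.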
My environment -
OS - Red Hat EL 3 (Intel)
Perl Version - v5.8.1 built for i686-linux
Perl Modules -
perl-libwww-perl-5.65-6
>> rpm -qa | grep XML
perl-XML-Dumper-0.4-25
perl-XML-Twig-3.09-3
perl-XML-Encoding-1.01-23
perl-XML-Grove-0.46alpha-25
PyXML-0.7.1-9
perl-XML-Parser-2.31-15
Any assistance or advice is most appreciated.
Thanks,
Steve