Re: Will XML::Simple work with keys, strings, integers, and dates?
From: Jim Gibson (jgibson_at_mail.arc.nasa.gov)
Date: 03/23/05
- Previous message: cji_work_at_yahoo.com: "perl thread question"
- In reply to: Wes Barris: "Will XML::Simple work with keys, strings, integers, and dates?"
- Next in thread: John Bokma: "Re: Will XML::Simple work with keys, strings, integers, and dates?"
- Reply: John Bokma: "Re: Will XML::Simple work with keys, strings, integers, and dates?"
Date: Tue, 22 Mar 2005 18:13:45 -0800
In article <th6%d.4321$C7.2935@news-server.bigpond.net.au>, Wes Barris
<noway@nohow.com> wrote:
> Hi,
I haven't seen anybody else post a response, so I will give it a shot.
I have only done a little XML processing in Perl (and some in Java),
so I am not an expert.
>
> I am trying to use XML::Simple to parse an xml file. However, the xml file
> that I am trying to parse is not in the same format as any of the XML::Simple
> examples that I have seen. In all of the examples I have seen, the xml tags
> are specific to their contents. In my xml file, the tag names are generic.
> Here is a short sample of the xml that I am trying to parse:
>
> <dict>
> <key>35</key>
> <dict>
> <key>Track ID</key><integer>35</integer>
> <key>Name</key><string>Earache My Eye (Full Version)</string>
> <key>Artist</key><string>Alice Bowie</string>
[XML lines snipped]
Is that really your XML? You have nested tags with the same name:
<dict>. You also have <key> tags at different levels. That is going to
make parsing more difficult.
> </dict>
> <key>36</key>
> <dict>
> <key>Track ID</key><integer>36</integer>
> <key>Name</key><string>Earache My Eye</string>
> <key>Artist</key><string>Cheech & Chong</string>
[more lines snipped]
> </dict>
>
> I would like to be able to extract things like the "Name", "Artist", and
> "Location" but I don't understand how to associate one of the elements of
> the key array with one of the elements of the resulting string array.
You have some very poorly designed XML there. It would be better if it
were something like
<attribute name="Track ID" value="35"> ...
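To show why that layout is easier, here is a minimal sketch of reading it with XML::Simple. The <track> wrapper element and the sample values are my own assumptions for illustration, not anything from your file, and I am relying on the documented KeyAttr and ForceArray options to fold the list into a hash.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use XML::Simple;

# Hypothetical redesign of the track data: one <attribute> element per
# field, with the field name and value carried as XML attributes.
my $xml = <<'END_XML';
<track>
  <attribute name="Track ID" value="35" />
  <attribute name="Name" value="Earache My Eye (Full Version)" />
  <attribute name="Artist" value="Alice Bowie" />
</track>
END_XML

# KeyAttr folds the <attribute> elements into a hash keyed on "name";
# ForceArray keeps that folding consistent even for a single element.
my $track = XMLin(
    $xml,
    KeyAttr    => { attribute => 'name' },
    ForceArray => [ 'attribute' ],
);

# Each field is now reachable directly by its name.
for my $field ( 'Name', 'Artist' ) {
    print "$field: $track->{attribute}{$field}{value}\n";
}
```

With the name in an attribute, the key and its value live in the same element, so there is nothing to pair up by position.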
If you cannot change the XML definition, then you are probably better
off using a SAX parser. XML::SAX::PurePerl works, but it is slow. For
big files, try XML::Parser and the expat library. In the one case where
I tried both, XML::Parser was about 75 times faster.
In a SAX parser, you define a handler package with callbacks that are
called for each element in the XML. Then, you will be able to associate
the <key> value with the value element that follows it, because you will
get the callbacks sequentially, in document order.
Something like this might get you started:
#!/usr/local/bin/perl
use strict;
use warnings;
use XML::SAX::PurePerl;

# Slurp the sample XML from the __DATA__ section at the end of the file.
my $xmlstring;
{
    local $/;
    $xmlstring = <DATA>;
}

my $handler = My::XML::Handler->new();
my $parser  = XML::SAX::PurePerl->new(Handler => $handler);
$parser->parse_string($xmlstring);

package My::XML::Handler;

sub new
{
    my $class = shift;
    my $self = {
        'key'  => '',   # text of the most recent <key> element
        'data' => ''    # text collected for the current element
    };
    return bless $self, $class;
}

sub start_document { print "start_document\n"; }

# Nothing to do when an element opens; the work happens in end_element.
sub start_element { }

sub end_element
{
    my( $self, $element ) = @_;
    my $name = $element->{Name};
    if( $name eq 'key' ) {
        # Remember the key so it can be paired with the next value.
        $self->{key} = $self->{data};
    }elsif( length $self->{data} ) {
        # Test length, not truth, so a value of "0" is still reported.
        print "<$self->{key}> is '$self->{data}' of type $name\n";
    }
    $self->{data} = '';
}

sub characters
{
    my( $self, $element ) = @_;
    my $chars = $element->{Data};
    # Skip the whitespace-only chunks that sit between elements.
    $self->{data} .= $chars if( $chars =~ /\S/ );
}

sub end_document { print "end_document\n"; }
sub warning { print "warning\n"; }
sub error { print "error\n"; }

1;

# Switch back to main so that __DATA__ is read as main::DATA.
package main;
__DATA__
<doc>
<key>35</key>
<dict>
<key>Track ID</key><integer>35</integer>
<key>Name</key><string>Earache My Eye (Full Version)</string>
<key>Artist</key><string>Alice Bowie</string>
</dict>
<key>36</key>
<dict>
<key>Track ID</key><integer>36</integer>
<key>Name</key><string>Earache My Eye</string>
<key>Artist</key><string>Cheech &amp; Chong</string>
</dict>
</doc>
Which produces:
start_document
<Track ID> is '35' of type integer
<Name> is 'Earache My Eye (Full Version)' of type string
<Artist> is 'Alice Bowie' of type string
<Track ID> is '36' of type integer
<Name> is 'Earache My Eye' of type string
<Artist> is 'Cheech & Chong' of type string
end_document
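If the PurePerl parser is too slow, the same key/value pairing can be done with XML::Parser. What follows is only a rough sketch: the handler names (on_start, on_char, on_end) and the @tracks list are my own choices, and it assumes the flat <dict> layout from your sample, so it does not handle dicts nested inside a track. It collects one hash per track instead of printing as it goes.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use XML::Parser;

my $xml = <<'END_XML';
<doc>
<key>35</key>
<dict>
<key>Track ID</key><integer>35</integer>
<key>Name</key><string>Earache My Eye (Full Version)</string>
<key>Artist</key><string>Alice Bowie</string>
</dict>
<key>36</key>
<dict>
<key>Track ID</key><integer>36</integer>
<key>Name</key><string>Earache My Eye</string>
<key>Artist</key><string>Cheech &amp; Chong</string>
</dict>
</doc>
END_XML

my @tracks;      # one hash per <dict> element
my $key  = '';   # text of the most recent <key>
my $data = '';   # text collected for the current element

sub on_start {
    my( $expat, $name ) = @_;
    push @tracks, {} if $name eq 'dict';   # a new track begins
    $data = '';
}

sub on_char {
    my( $expat, $chars ) = @_;
    $data .= $chars;   # text may arrive in several pieces
}

sub on_end {
    my( $expat, $name ) = @_;
    if( $name eq 'key' ) {
        $key = $data;
    }elsif( $name ne 'dict' and $name ne 'doc' and @tracks ) {
        # Any other element is a value; store it under the last key seen.
        $tracks[-1]{$key} = $data;
    }
    $data = '';
}

my $parser = XML::Parser->new(
    Handlers => { Start => \&on_start, Char => \&on_char, End => \&on_end },
);
$parser->parse($xml);

for my $track (@tracks) {
    print "$track->{Artist}: $track->{Name}\n";
}
```

Once the tracks are in @tracks, looking up "Name", "Artist", or any other key is an ordinary hash access.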