Re: Parsing XML



Hi Rob,

I only have to read values from nodes, but I am amazed by what you
say. The XML file can be as large as 10 MB, though.
I thought that Expat was the fastest parser because I found this
link: http://www.xml.com/lpt/a/37. Maybe that was true in 1999, but it
no longer seems to be.

Note that the parser has to be a modular one that I can call from a
C++ method.

Could regular expressions be more efficient and faster than a classical
parser such as Expat or MSXML?

Beyond that, I have heard that there is a newer generation of parser that
combines the DOM and SAX approaches. It is from Microsoft, I think, but I
have forgotten the name.

Thanks



On 19 Jul, 03:14, rob.di...@xxxxxxx (Rob Dixon) wrote:
Epanda wrote:

I would like to know whether we can parse XML with regular expressions
faster than with a library such as MSXML or Xerces.

I just want to parse an XML file, and I have seen that Perl's XML::Parser,
which is based on Expat, is said to be the fastest in the world. I don't
know Twig.

My XML is typical:
<?xml version='1.0' encoding='ISO-8859-1'?>
<!DOCTYPE CONF_INST SYSTEM "dtd_conf_inst.dtd">

<ROOT_NODE VERS="1.0">
   <NODE1 TAG="VD/N1" SERIAL="3HHE">
           <C>
                   <ID>OM</ID>
                   <VAL>SAT</VAL>
           </C>
           <C>
                   <ID>TPS</ID>
                   <VAL>3E+01</VAL>
           </C>
   </NODE1>
</ROOT_NODE>

but it can be very big.

XML::Twig is built on Expat, and is especially good at processing large files
one element at a time instead of loading the whole file into memory first. For
instance, if your data consists of multiple independent <NODE1> elements,
XML::Twig can be set up to process them individually and so save memory. Take a
look here: http://www.xmltwig.com/xmltwig/
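
To make the element-at-a-time idea concrete, here is a rough sketch. It is not
XML::Twig: it is written in Python, using the standard library's
xml.etree.ElementTree.iterparse (which sits on top of Expat in CPython), purely
to illustrate the same pattern. The function name iter_node1 and the shortened
sample document are made up for this example.

```python
import io
import xml.etree.ElementTree as ET

# Shortened version of the sample document (DOCTYPE line omitted for brevity).
SAMPLE = b"""<?xml version='1.0' encoding='ISO-8859-1'?>
<ROOT_NODE VERS="1.0">
   <NODE1 TAG="VD/N1" SERIAL="3HHE">
      <C><ID>OM</ID><VAL>SAT</VAL></C>
      <C><ID>TPS</ID><VAL>3E+01</VAL></C>
   </NODE1>
</ROOT_NODE>
"""


def iter_node1(source):
    """Yield (TAG attribute, {ID: VAL}) for each NODE1 as soon as it closes.

    Hypothetical helper: 'source' is a file name or a binary file object.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "NODE1":
            values = {c.findtext("ID"): c.findtext("VAL")
                      for c in elem.findall("C")}
            yield elem.get("TAG"), values
            # Discard the finished NODE1's children so they do not pile up.
            elem.clear()


records = list(iter_node1(io.BytesIO(SAMPLE)))
print(records)
```

For a real file you would pass the file name to iter_node1 and consume the
records in a loop rather than collecting them in a list, otherwise the memory
saving is lost again.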

But if you are hoping to write something that is faster than MSXML or Xerces you
may be unsuccessful. Perl also has XML::LibXML and XML::Xerces modules if
you want to try those.

What do you need to do with the data? It may be possible with regular
expressions if the data is consistently formatted.
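
As a rough illustration of what "consistently formatted" would have to mean,
here is a minimal sketch, again in Python rather than Perl. It assumes that
every <ID> is immediately followed by its <VAL>, that neither tag carries
attributes, and that the file contains no comments, CDATA sections or entity
references inside those values. The names PAIR_RE and extract_pairs are
hypothetical.

```python
import re

# Matches <ID>...</ID> directly followed (whitespace allowed) by <VAL>...</VAL>.
# Assumption: values never contain '<', i.e. no nested markup or CDATA.
PAIR_RE = re.compile(r"<ID>([^<]*)</ID>\s*<VAL>([^<]*)</VAL>")


def extract_pairs(text):
    """Return a list of (ID, VAL) string tuples found in the text."""
    return PAIR_RE.findall(text)


sample = """
<NODE1 TAG="VD/N1" SERIAL="3HHE">
        <C>
                <ID>OM</ID>
                <VAL>SAT</VAL>
        </C>
        <C>
                <ID>TPS</ID>
                <VAL>3E+01</VAL>
        </C>
</NODE1>
"""

pairs = extract_pairs(sample)
print(pairs)
```

This ignores which NODE1 each pair belongs to and will silently skip anything
that deviates from the assumed layout, which is the usual price of using a
regular expression instead of a parser.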

Rob
