Re: My Regexp XML Parser -> Structured Perl Data, Cut & Paste Version, No Module's (Vol I)
- From: "Bart Van der Donck" <bart@xxxxxxxxxx>
- Date: 23 Dec 2005 15:31:03 -0800
robic0 wrote:
> On Tue, 20 Dec 2005 23:59:06 -0800, robic0 wrote:
>
> >This post is in response to someone who asked for help trying to
> >parse xml into a data structure.
>
> This will fix the final issues with "ForceArray".
> Comments have an issue with enclosed "<" or ">" in this
> version, other than that they will process normally.
> It's a regex issue (shortcoming in my opinion) that can't
> match a "not" string. Where I need <!--(all but "<!--")-->.
> Where (.*)(?!<!--) won't work in an expression. But I'll
> work around that.
>
> Version .901 from 12-22-05 is the one you want.
> This is close to the last post as far as this newsgroup.
> Sorry, but I had to get it stable. I've run this on every
> big and weird xml file I could get my hands on. I'm
> satisfied with it.
[ code snipped ]
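On the comment problem quoted above: a pattern like (.*)(?!<!--) tests the lookahead only once, after .* has already consumed everything. The usual workaround is to test a negative lookahead before every single character. Below is a minimal sketch of that idea in Python's re module (the same construct works in Perl regexes). The pattern name, the sample string, and the choice to stop at "--" (which XML forbids inside a comment body) are my assumptions, not taken from the snipped code.

```python
import re

# Tempered dot: the lookahead (?!--) is tested before each character, so the
# comment body may contain "<" and ">" but stops at the first "--".
COMMENT_RE = re.compile(r"<!--((?:(?!--).)*)-->", re.DOTALL)

# Illustrative sample, not from the original post.
sample = "<root><!-- a < b and c > d --><mytag>content</mytag><!--second--></root>"

comments = COMMENT_RE.findall(sample)   # bodies of all comments
stripped = COMMENT_RE.sub("", sample)   # document with comments removed

print(comments)
print(stripped)
```

This only shows the matching technique; it does not handle comments that appear inside CDATA sections.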
It's very hard to run your code. You are messing up the line ends in
your post. I 've uploaded a corrected version to
www.dotinternet.be/temp/code.txt.
Your software produces errors when the XML uses namespaces:
<?xml version="1.0" encoding="UTF-8"?>
<root xmlns:html="http://www.w3.org/TR/REC-html-4.0">
<mytag>content</mytag>
<html:br/>
</root>
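A regex-based parser has to allow a colon in element names for input like the above to get through. As a rough sketch (Python here, though the pattern carries over to Perl), a tag pattern can take an optional "prefix:" in front of the local name. NAME, TAG_RE and tag_names are hypothetical names for illustration, and the name pattern covers ASCII only, which is narrower than what the XML spec allows.

```python
import re

# Hypothetical name pattern: ASCII letters, digits, "_", "." and "-" only.
NAME = r"[A-Za-z_][\w.\-]*"

# Groups: closing slash, optional namespace prefix, local name, empty-tag slash.
TAG_RE = re.compile(rf"<(/?)(?:({NAME}):)?({NAME})[^<>]*?(/?)>")

def tag_names(xml_text):
    """Return (prefix, local_name) pairs for start and empty-element tags."""
    pairs = []
    for closing, prefix, local, _empty in TAG_RE.findall(xml_text):
        if not closing:          # skip end tags such as </root>
            pairs.append((prefix, local))
    return pairs

sample = """<?xml version="1.0" encoding="UTF-8"?>
<root xmlns:html="http://www.w3.org/TR/REC-html-4.0">
<mytag>content</mytag>
<html:br/>
</root>"""

print(tag_names(sample))
```

The sketch only separates prefix from local name; resolving the prefix against the xmlns:html declaration would be a further step.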
Your software produces errors when the XML contains a DOCTYPE:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<root>
<mytag>content</mytag>
</root>
Your software produces errors when attribute values are enclosed in
single quotes (') instead of double quotes ("):
<?xml version='1.0' encoding='UTF-8'?>
<root>
<mytag myargument='argvalue'>content</mytag>
</root>
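One way to accept both quote styles in a regex is to capture the opening quote and require the same character to close the value with a backreference. The following is a small sketch in Python (the backreference works the same way in Perl); ATTR_RE and parse_attributes are names made up for this example, and the attribute-name pattern is ASCII only.

```python
import re

# (["']) captures whichever quote opens the value; \2 requires the same
# character to close it, so the other quote type may appear inside the value.
ATTR_RE = re.compile(r"""([A-Za-z_][\w:.\-]*)\s*=\s*(["'])(.*?)\2""", re.DOTALL)

def parse_attributes(tag_text):
    """Return a dict mapping attribute names to values for one start tag."""
    return {name: value for name, _quote, value in ATTR_RE.findall(tag_text)}

single = parse_attributes("<mytag myargument='argvalue'>")
mixed = parse_attributes('<mytag myargument="it\'s fine" other=\'say "hi"\'>')

print(single)
print(mixed)
```

Entity references such as &quot; inside values are left undecoded here.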
XML is case-sensitive, but your program doesn't seem to care and accepts
start and end tags that differ in case:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<mYTag myargument="argvalue">content</mytag>
</root>
I'm using Microsoft Windows XP's XML parser to check that the XML is well-formed.
Your program has many shortcomings.
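For anyone who wants to repeat the well-formedness check without the Windows parser, here is a hedged sketch that uses Python's standard xml.etree.ElementTree (built on expat) as a stand-in. The helper name is_well_formed and the dictionary keys are my own; the four documents are the samples from this post.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the expat-based parser accepts the document."""
    try:
        ET.fromstring(xml_text.encode("utf-8"))
        return True
    except ET.ParseError:
        return False

samples = {
    "namespace": """<?xml version="1.0" encoding="UTF-8"?>
<root xmlns:html="http://www.w3.org/TR/REC-html-4.0">
<mytag>content</mytag>
<html:br/>
</root>""",
    "doctype": """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<root>
<mytag>content</mytag>
</root>""",
    "single_quotes": """<?xml version='1.0' encoding='UTF-8'?>
<root>
<mytag myargument='argvalue'>content</mytag>
</root>""",
    "case_mismatch": """<?xml version="1.0" encoding="UTF-8"?>
<root>
<mYTag myargument="argvalue">content</mytag>
</root>""",
}

results = {name: is_well_formed(text) for name, text in samples.items()}
print(results)
```

The first three documents should be accepted and the last one rejected, because mYTag and mytag are different names. The check covers well-formedness only, not validity against the DTD.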
--
Bart