Re: HTML Parser

From: Kees Vermeulen (info_at_kever.com)
Date: 10/19/04


Date: Tue, 19 Oct 2004 10:46:54 +0200

Avatar,

Thanks for your extensive reply; however, some questions remain.

What I'm trying to build is a kind of document-merge application.
Users can create email-templates using Microsoft's HTML editor components
and (therefore) these templates are stored as HTML files. Field elements are
indicated using the following HTML tag:

<span datafld="A Fieldname">Some Text</span>

However, because my users are not computer experts, I am thinking about
changing this to a simple '<<A FieldName>>' (no tags, just plain text).
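
A minimal sketch of how such plain-text placeholders could be merged, written
in Python purely for illustration (the application itself is a Delphi one).
The function name merge_placeholders and the record dictionary are
hypothetical. It assumes the HTML editor may store the markers in escaped form
as &lt;&lt;A FieldName&gt;&gt;, so both spellings are accepted:

```python
import html
import re

# Hypothetical placeholder syntax: <<Field Name>>. An HTML editor is likely to
# store it escaped as &lt;&lt;Field Name&gt;&gt;, so match both spellings.
PLACEHOLDER = re.compile(r"(?:<<|&lt;&lt;)\s*(.+?)\s*(?:>>|&gt;&gt;)")


def merge_placeholders(template, record):
    """Replace each placeholder with the HTML-escaped value from record."""

    def substitute(match):
        field = html.unescape(match.group(1))
        if field not in record:
            # Leave unknown fields visible so the template author can spot them.
            return match.group(0)
        return html.escape(str(record[field]))

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    template = "<p>Dear &lt;&lt;First Name&gt;&gt;,</p>"
    print(merge_placeholders(template, {"First Name": "Kees"}))
```

Escaping the inserted value matters here: a field containing '&' or '<' would
otherwise corrupt the surrounding HTML.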

Now I want to merge this HTML document with data stored in either an XML
document or a database. For this I load the HTML into a TWebBrowser control and
then walk through the document. Every field '<span>' element is replaced by
data taken from the database.
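
For comparison, the same replacement can be sketched without any browser
control by streaming the HTML through a tolerant parser. The sketch below uses
Python's standard html.parser and is only an illustration of the idea, not the
TWebBrowser approach described above; SpanMerger and merge_spans are
hypothetical names. It assumes that a field is any <span> carrying a datafld
attribute whose value is a key in the record, and that the whole element is
replaced by the escaped value:

```python
from html import escape
from html.parser import HTMLParser


class SpanMerger(HTMLParser):
    """Copy HTML through, replacing each <span datafld="..."> with a value."""

    def __init__(self, record):
        # Keep entity and character references as written in the template.
        super().__init__(convert_charrefs=False)
        self.record = record
        self.out = []
        self.depth = 0  # > 0 while skipping the content of a field span

    def handle_starttag(self, tag, attrs):
        if self.depth:
            if tag == "span":
                self.depth += 1  # nested span inside the field being dropped
            return
        field = dict(attrs).get("datafld")
        if tag == "span" and field in self.record:
            self.out.append(escape(str(self.record[field])))
            self.depth = 1
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        if not self.depth:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.depth:
            if tag == "span":
                self.depth -= 1
            return
        self.out.append(f"</{tag}>")  # note: end tags come out lowercased

    def handle_data(self, data):
        if not self.depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.depth:
            self.out.append(f"&#{name};")

    def handle_comment(self, data):
        if not self.depth:
            self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def merge_spans(template, record):
    merger = SpanMerger(record)
    merger.feed(template)
    merger.close()
    return "".join(merger.out)
```

Spans whose field name is not in the record are copied through untouched, so a
misspelled field stays visible in the output instead of silently disappearing.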

I agree with you that XML/XSL would be great for document merging but I
don't see how users having no knowledge of computers can create XSL
documents with ease. That's why I chose HTML.

Using TWebBrowser has some disadvantages:
- The HTML document can only be accessed after the control has processed it,
and the control must be a visual one;
- Sometimes replacing a tag with new data results in an empty document.

I hope, after reading this info, you have additional tips for me on how to
implement such an application.

Regards,

Kees Vermeulen

"Avatar Zondertau" <avatarzondertau@hotmail.com> wrote in message
news:4174bda0@newsgroups.borland.com...
>> I am looking for an HTML parser which can also replace certain tags
>> with other data. I tried using Microsoft's HTML but I am having some
>> problems with it.
>
> You should use XML and XSLT instead. Make sure your HTML document is
> well formed:
>
> - You should close every tag you open, so for example <br> will become
> <br />
>
> - You should use lowercase for HTML tags
>
> - You should put script blocks between <![CDATA[ and ]]>
>
> - You should replace entity references other than &amp;, &lt; and &gt;
> with character codes
>
> This shouldn't be too much work if the original HTML is formatted
> nicely.
>
> Now you can use XSLT to replace the tags you want to replace, leaving
> the other ones alone by just copying them. This approach is very
> flexible, because you can create any formatting you like just by
> modifying the XSLT file. Also you can now use Microsoft's XML parser,
> which is IMHO pretty good. To use it, choose Project > Import Type Library
> and select one of the "Microsoft XML" entries.
>
> Information about XSLT can be found here:
> http://msdn.microsoft.com/library/default.asp?url=/library/en-us/xmlsdk/html/xmrefxsltreference.asp
>
> Info on Microsoft's XML DOM parser can be found here:
> http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnxml/html/beginner.asp
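
The well-formedness rules quoted above can be checked mechanically before a
template is handed to an XML or XSLT processor. The following is a rough
sketch in Python using the standard xml.etree.ElementTree module, not the
Microsoft XML parser mentioned in the quote; is_well_formed is a hypothetical
helper name:

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    samples = [
        "<p>line one<br>line two</p>",     # unclosed <br>
        "<p>line one<br />line two</p>",   # closed, as the quote recommends
        "<p>a&nbsp;b</p>",                 # HTML-only entity reference
        "<p>a&#160;b</p>",                 # same character as a character code
    ]
    for sample in samples:
        print(is_well_formed(sample), sample)
```

A template that fails this check would need to be cleaned up before the
XSLT route is an option.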


