Re: Character Entity References



Mark A. Boyd wrote:
Michael Fesser <netizen@xxxxxx> posted in comp.lang.php:

.oO(George Maicovschi)

The problem started with escaping the input data using htmlentities(),
and from my point of view, escaping data before it goes to the DB is a
rather good thing, not a bad one.

Escaping yes, but not in this way. Data in a DB should never be stored
in an output-specific or media-dependent encoding, but in a raw format.
Pure data, nothing else. Just think about things like

* output to something other than HTML, for example a PDF or a plain text
newsletter
* a fulltext search

Both tasks will be almost impossible or at least much more complicated
with HTML data in the DB, but pretty easy to do with raw data.
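
A minimal sketch of the point above, written in Python rather than PHP purely for illustration. The value `raw` and the helpers `to_html`, `to_plain_text` and `matches` are hypothetical names, not anything from this thread; `html.escape` stands in for the kind of escaping one would do in PHP at output time. The stored value stays raw, and escaping happens only when the target medium is HTML.

```python
import html

# Hypothetical raw value, exactly as it would sit in the DB column.
raw = 'Tom & Jerry <the "classic" cartoon>'

def to_html(value: str) -> str:
    """Escape at output time, and only for the HTML medium."""
    return html.escape(value, quote=True)

def to_plain_text(value: str) -> str:
    """Plain-text output (e.g. a newsletter) needs no HTML escaping."""
    return value

def matches(value: str, term: str) -> bool:
    """Naive case-insensitive substring search against a stored value."""
    return term.lower() in value.lower()
```

With this arrangement `matches(raw, "tom & jerry")` succeeds, while the same search against `to_html(raw)` fails, because the ampersand has become `&amp;`. That is the fulltext-search problem in miniature.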

You, Jerry, and others espouse this idea and I certainly understand the merits. But it leaves me with a question.

How do you deal with display data that may be required in HTML and/or PDF, i.e. italic word(s) within the data?

My current solution is storing the <em> tags in the DB, but I don't really like it for the very reasons you stated.



That's a bit more difficult, because you're talking about font information vs. character encoding.

The problem here is also related to searching - for instance, if you have:

'John and Mary hosted a <em>New Year's Eve party</em> for their friends.'

Now - is someone ever going to want to search on "John and Mary hosted a New Year's Eve party"? If so, they'll never find it in the database because of the embedded <em>.
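
To make the search problem concrete, here is a small sketch in Python (chosen only for illustration; the names `stored`, `phrase`, `TAG` and `strip_tags` are hypothetical). It assumes a naive regular expression is good enough to strip simple tags like <em>, which would not hold for arbitrary HTML.

```python
import re

# The example sentence as stored, with embedded markup.
stored = "John and Mary hosted a <em>New Year's Eve party</em> for their friends."
phrase = "John and Mary hosted a New Year's Eve party"

# Naive pattern for simple opening/closing tags; not a real HTML parser.
TAG = re.compile(r"</?[a-zA-Z][^>]*>")

def strip_tags(text: str) -> str:
    """Remove simple tags so the text can be searched as plain prose."""
    return TAG.sub("", text)

found_raw = phrase in stored                   # the <em> splits the phrase
found_stripped = phrase in strip_tags(stored)  # markup removed before matching
```

Here `found_raw` is False and `found_stripped` is True. Stripping at search time works in application code, but a plain substring search in the database itself does not get that chance.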

However, in this case there is no really good answer. You can put the <em></em> in the text as above. You could have multiple rows, each with its own font information in a separate column (no embedded font info, but you still can't easily search for phrases). Or you can keep the font markup in the text column as above and add a second column holding "sanitized" text for searching (worst case, IMHO).

Depending on the overall needs, I'll generally pick one of the first two. But neither is ideal.
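
The second option (separate rows, each with its own font information) might look roughly like the sketch below. It is in Python for illustration only, and `Segment`, `segments`, `render_html` and `render_plain` are hypothetical names; a real schema would also need an ordering column and a key tying the rows to one piece of text.

```python
import html
from typing import NamedTuple

class Segment(NamedTuple):
    """One hypothetical row: a run of raw text plus its style."""
    text: str
    style: str  # "" for normal text, "em" for emphasis

segments = [
    Segment("John and Mary hosted a ", ""),
    Segment("New Year's Eve party", "em"),
    Segment(" for their friends.", ""),
]

def render_html(segs) -> str:
    """Escape each run for HTML, then wrap styled runs in their tag."""
    parts = []
    for seg in segs:
        escaped = html.escape(seg.text, quote=False)
        parts.append(f"<{seg.style}>{escaped}</{seg.style}>" if seg.style else escaped)
    return "".join(parts)

def render_plain(segs) -> str:
    """Join the raw runs; this is also what a phrase search would need."""
    return "".join(seg.text for seg in segs)
```

The stored text carries no markup, so HTML and PDF output can each apply the style in their own way. The drawback mentioned above still shows: a phrase that spans two rows can only be found after the rows are joined, as `render_plain` does.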

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@xxxxxxxxxxxxx
==================
