Re: [PHP] Re: languages and PHP




On Sep 28, 2007, at 11:34 AM, tedd wrote:

At 2:01 PM -0500 9/27/07, Edward Vermillion wrote:
So back to my original question, what breaks if you're *expecting* UTF-8 and you don't *get* UTF-8?

Ed

Isn't UTF-8 the big fish here?

Sure there's UTF-16 and larger, but everything else is a subset of UTF-8, is it not?

So, what's the problem if you get a character defined by ISO -- it's still within the UTF-8 super-group, right?

The only problem I see here is IF the user has the char set to display the glyph correctly -- OR am I off on something else that you guys aren't even discussing?


Probably very relevant to the original question, but...

My question was more mental prodding than anything else. The OP had a function to convert incoming text into UTF-8 before they did anything with it. A couple of folks said that was unnecessary: if you set your form to UTF-8, your incoming data will already be in UTF-8.

I was just trying to make the point that if your code expects incoming data to be in a certain state, you should make sure it is in that state before you act on it, since you can't guarantee its source. Checking that the incoming data is in its expected state is not a waste of time (or unnecessary, or whatever term of derision they picked) but is actually good coding practice.
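
To make that concrete, here is a minimal sketch of such a check. It is written in Python purely for illustration, and the function name is_valid_utf8 is made up for this example. In PHP, the mbstring function mb_check_encoding() plays a similar role. The sketch also shows one answer to "what breaks": the same word encoded as ISO-8859-1 is not well-formed UTF-8, so code that assumes UTF-8 cannot decode it.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw is well-formed UTF-8, False otherwise."""
    try:
        # A strict decode raises on any malformed byte sequence.
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The same word, "cafe" with an acute accent on the e, in two encodings.
as_utf8 = "caf\u00e9".encode("utf-8")         # e-acute becomes two bytes: C3 A9
as_latin1 = "caf\u00e9".encode("iso-8859-1")  # e-acute becomes one byte: E9

for label, data in (("utf-8", as_utf8), ("iso-8859-1", as_latin1)):
    verdict = "accept" if is_valid_utf8(data) else "reject or convert"
    print(label, data, verdict)
```

Whether rejected input is refused outright or converted from a known source encoding is a separate decision; the point is only that the check happens before the data is used.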

I pretty much gave up on the thread when I got a reply along the lines of "if it breaks something it's their problem, not mine".

Ed


