Re: Converting "’" to an Apostrophe?
- From: RedGrittyBrick <RedGrittyBrick@xxxxxxxxxxxxx>
- Date: Thu, 28 Feb 2008 10:48:43 +0000
RedGrittyBrick wrote:
maria wrote:
On Wed, 27 Feb 2008 22:45:02 -0500, "John W. Kennedy" <jwkenne@xxxxxxxxxxxxx> wrote:
maria wrote:
I am using a CGI program to read XML files and extract their various items. Somehow, my program converts the apostrophe "’" to ... "\â\€\™". How do I program my CGI program to convert "’" to an apostrophe, "'"? Is there a little CGI code that will convert all these different strings (including dagger, ellipsis, euro symbol, double quote, etc.) to their ASCII equivalents?
Thank you very much.
maria
You have a serious misunderstanding that is much too complicated to explain here. Learn about Unicode.
The whole modern world is filled with people who feel compelled to
respond to other people's messages when they have absolutely nothing
to say.
Oh dear. Replying to perceived rudeness with more rudeness just puts off potential helpers.
John's reply *did* contain something useful to you.
AIUI John is pointing out that "\â\€\™" is your Unicode apostrophe encoded in UTF-8 but displayed using an incorrect encoding such as Latin-1.
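Unicode code point U+2019 is represented in UTF-8 as the byte sequence e2 80 99 (shown here in hexadecimal). That same byte sequence, when interpreted as Latin-1 (strictly, as Windows-1252, since bytes 80 and 99 are control characters in true Latin-1), is the three characters â€™ (a circumflex, euro sign, trademark sign).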
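The byte-level claim above can be checked with a few lines of Perl. This is only a minimal sketch using the core Encode module; the variable names are mine, and it assumes the wrong encoding in play is Windows-1252 (cp1252), which is a guess about your setup.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# U+2019 RIGHT SINGLE QUOTATION MARK as a Perl character string.
my $quote = "\x{2019}";

# Encode it as UTF-8 and show the resulting bytes in hexadecimal.
my $bytes = encode('UTF-8', $quote);
my $hex   = join ' ', map { sprintf '%02x', ord } split //, $bytes;
print "UTF-8 bytes:       $hex\n";           # e2 80 99

# Misread those same bytes as Windows-1252 (assumed wrong encoding).
my $mojibake   = decode('cp1252', $bytes);
my $codepoints = join ' ', map { sprintf 'U+%04X', ord } split //, $mojibake;
print "Read as cp1252:    $codepoints\n";    # U+00E2 U+20AC U+2122

# Read them correctly as UTF-8 and the single character comes back.
my $fixed = decode('UTF-8', $bytes);
printf "Read as UTF-8:     U+%04X\n", ord $fixed;   # U+2019
```

The output is printed as hex and code-point numbers on purpose, so the result does not depend on what encoding your terminal uses.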
You can learn more about Perl's handling of Unicode by typing the command `perldoc perlunicode`.
It's a while since I've read the posting guidelines for this newsgroup but I'm pretty sure they suggest you include a short example program that demonstrates your problem. That would make it easier for people to help you identify what you are doing wrong.
#!perl
#
# Demonstrate handling of Unicode characters in a UTF8 encoded file
#
# RGB 2008-02-28
#
use strict;
use warnings;
use CGI qw/:standard/;
use CGI::Carp qw(warningsToBrowser fatalsToBrowser);
#
# First we write some Unicode to a file using UTF-8 encoding.
#
my $tempfile = "unicode.txt";
open (my $out, '>:utf8', $tempfile)
    or die "can't open $tempfile because $!\n";
print $out "Here is a Unicode RIGHT SINGLE QUOTATION MARK ->\x{2019}<-\n";
close $out;
#
# Now we read our UTF-8 encoded text file and use it in a web-page.
#
open (my $in, '<:utf8', $tempfile)
    or die "can't open $tempfile because $!\n";
my $line = <$in>;
close $in;
# Encode output as UTF-8 too, otherwise Perl warns "Wide character in print".
binmode(STDOUT, ':utf8');
print header(-charset=>'utf-8'), # NOTE - Default is NOT utf-8
    start_html(), h1("Unicode example"), p($line), hr(), end_html();
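If you really do need plain ASCII output, as the original question asks, one possible approach is to decode the input as UTF-8 first and then substitute the typographic characters. The following is only a sketch: `to_ascii` is a hypothetical helper of mine, not part of any module, and the replacement strings (for example '+' for the dagger and 'EUR' for the euro sign) are arbitrary choices you would want to adjust.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: replace a few typographic characters with
# rough ASCII stand-ins. The mapping is illustrative, not complete.
sub to_ascii {
    my ($text) = @_;
    my %map = (
        "\x{2018}" => "'",     # LEFT SINGLE QUOTATION MARK
        "\x{2019}" => "'",     # RIGHT SINGLE QUOTATION MARK
        "\x{201C}" => '"',     # LEFT DOUBLE QUOTATION MARK
        "\x{201D}" => '"',     # RIGHT DOUBLE QUOTATION MARK
        "\x{2026}" => '...',   # HORIZONTAL ELLIPSIS
        "\x{2020}" => '+',     # DAGGER (arbitrary stand-in)
        "\x{20AC}" => 'EUR',   # EURO SIGN (arbitrary stand-in)
    );
    $text =~ s/([\x{2018}\x{2019}\x{201C}\x{201D}\x{2026}\x{2020}\x{20AC}])/$map{$1}/g;
    return $text;
}

# Raw UTF-8 bytes, as they might arrive from an XML file read without
# any decoding layer: It’s “quoted”…
my $bytes = "It\xE2\x80\x99s \xE2\x80\x9Cquoted\xE2\x80\x9D\xE2\x80\xA6";

# Decode the bytes to characters first, then substitute.
my $text = decode('UTF-8', $bytes);
print to_ascii($text), "\n";   # It's "quoted"...
```

The important step is the decode: the substitution works on characters, so it only matches once the three-byte sequences have been turned back into single code points.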