Re: replace chars



From: "Dr.Ruud" <rvtol+news@xxxxxxxxxxxx>

"Octavian Rasnita" wrote:
I have also seen that length($string) returns the number of bytes of
$string, and not the number of chars (if the string contains UTF-8
chars).

This tells me that you are taking input from an octet buffer that comes
from outside.

Yes, I am getting it from a SQLite database.

use Encode;

my $octets = <>;
my $string;
eval {
    $string = Encode::decode("utf8", $octets, Encode::FB_CROAK);
    1;
} or do {
    # malformed input: $@ holds the error message
};
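
A minimal sketch of the byte count versus character count difference discussed above. The sample octets are an assumption chosen for illustration (the UTF-8 encoding of U+0219 followed by an ASCII "i"), not data from the thread. The byte length is taken before decoding because Encode may modify the source scalar in place when a CHECK value such as FB_CROAK is passed.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical sample input: "\xC8\x99" is the UTF-8 encoding of U+0219,
# followed by a plain ASCII "i". Three octets, two characters.
my $octets = "\xC8\x99i";

# Count bytes first: decode() with a CHECK value may empty $octets.
my $octet_count = length $octets;

# Turn the octets into a Perl character string; dies on malformed input.
my $string = decode('UTF-8', $octets, Encode::FB_CROAK);

# On a decoded string, length() counts characters, not bytes.
my $char_count = length $string;

print "octets: $octet_count, chars: $char_count\n";   # octets: 3, chars: 2
```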


Ok, I can get the size of the string using this code, but please tell me how to get the UTF-8 chars from this string.

After decoding the octets, if I do

my @chars = split //, $string;

then it also returns the octets separately and not the UTF-8 chars.

Thanks.

Octavian
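
For reference, a small sketch of splitting a decoded string into characters. The sample octets are hypothetical, and the thread does not confirm why the split above appeared to return octets. Two common causes (stated here as assumptions) are that the value from the database was never actually decoded, or that the characters were printed to a handle with no encoding layer.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical sample input: UTF-8 octets for U+0219 followed by "i".
my $octets = "\xC8\x99i";

# LEAVE_SRC keeps $octets intact; FB_CROAK dies on malformed input.
my $string = decode('UTF-8', $octets, Encode::FB_CROAK | Encode::LEAVE_SRC);

# On a decoded string, split // yields one element per character.
my @chars = split //, $string;
my @codepoints = map { sprintf('U+%04X', ord($_)) } @chars;
print scalar(@chars), " chars: @codepoints\n";   # 2 chars: U+0219 U+0069

# Encode on output so each character is written as UTF-8 again.
binmode STDOUT, ':encoding(UTF-8)';
print "$_\n" for @chars;
```

Running the same split on $octets instead of $string gives three one-byte elements, which matches the symptom described in the question.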
