Re: attempting to print unicode characters.



Ben Bacarisse wrote:

Ray <bear@xxxxxxxxx> writes:

Hi. I'm trying to print Unicode characters to standard output, and
failing.....
But I haven't been able to get wide-character output from a C
program.

Which is not what you want. You may use the wide output functions, but
the output you want is multi-byte, not wide. As has already been
mentioned, setlocale is the key here. Once the C run-time knows the
final output encoding required, you can print from wide strings or
multi-byte strings and both will work.
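
To make the wide versus multi-byte distinction concrete, here is a minimal sketch (not code from the thread; the helper name wide_to_multibyte is made up for illustration). It converts a wide string into the multi-byte encoding of whatever locale is currently in effect, which is the form that actually reaches the terminal.

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: convert the wide string ws into the multi-byte
   encoding of the current locale (LC_CTYPE), storing at most n bytes,
   terminating NUL included, in buf.
   Returns the number of bytes stored (not counting the NUL), or
   (size_t)-1 if a character cannot be represented in this locale or
   the result does not fit. */
size_t wide_to_multibyte(const wchar_t *ws, char *buf, size_t n)
{
    mbstate_t state;
    const wchar_t *src = ws;
    size_t need;

    /* First pass: a null destination only measures the required length. */
    memset(&state, 0, sizeof state);
    need = wcsrtombs(NULL, &src, 0, &state);
    if (need == (size_t)-1 || need >= n)
        return (size_t)-1;

    /* Second pass: the whole string and its NUL are known to fit. */
    memset(&state, 0, sizeof state);
    src = ws;
    return wcsrtombs(buf, &src, n, &state);
}
```

In a UTF-8 locale, converting L"\u00e4" this way should produce the two bytes 0xC3 0xA4; in the default "C" locale the same call may simply fail, which is one way to see that the locale, not the wide string, decides what gets written.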

okay, fourth attempt:

#include <stdio.h>
#include <wchar.h>
#include <stdlib.h>
#include <assert.h>
#include <locale.h>

int main(){
wchar_t vowel;
char utf[8];

/* set output stream to wide-character mode, or halt. */
assert(fwide(stdout, 1) > 0);
assert (setlocale(LC_ALL, "en_US.utf8") != NULL);
wprintf(L"ä \n");
}

This prints a lower-case 'a'. That's better, but still wrong.
Does "en_US.utf8" suppress accents?
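
Before blaming the locale, it may be worth checking what the compiler actually stored for L"ä", since that depends on how it decodes the source file. A small diagnostic sketch follows; format_code_points is a made-up name, and the "U+" labels assume wchar_t holds Unicode code points, as it does with glibc.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical diagnostic: write the elements of ws into buf as
   space-separated "U+XXXX" values, so you can see what the compiler
   stored for a wide literal.
   Returns 0 on success, -1 if buf (of size n) is too small. */
int format_code_points(const wchar_t *ws, char *buf, size_t n)
{
    size_t used = 0;

    if (n == 0)
        return -1;
    buf[0] = '\0';
    for (; *ws != L'\0'; ws++) {
        int w = snprintf(buf + used, n - used, "%sU+%04lX",
                         used ? " " : "", (unsigned long)*ws);
        if (w < 0 || (size_t)w >= n - used)
            return -1;          /* output error or buffer exhausted */
        used += (size_t)w;
    }
    return 0;
}
```

For L"ä \n" one would hope to see U+00E4 U+0020 U+000A. Anything else points at a source-encoding problem rather than a locale problem, and writing the literal as L"\u00e4 \n" sidesteps it.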

fifth attempt:

#include <stdio.h>
#include <wchar.h>
#include <stdlib.h>
#include <assert.h>
#include <locale.h>

int main(){
wchar_t vowel;
char utf[8];

/* set output stream to wide-character mode, or halt. */
assert(fwide(stdout, 1) > 0);
assert (setlocale(LC_ALL, "POSIX") != NULL);
wprintf(L"ä \n");
}

This does not change anything; it still prints a lower-case 'a'.

Hmm, whatever 'locale' the darn terminal is using allows ä to show up,
so against the advice of another poster I'll try the empty string with
setlocale().

sixth attempt:

#include <stdio.h>
#include <wchar.h>
#include <stdlib.h>
#include <assert.h>
#include <locale.h>

int main(){
wchar_t vowel;
char utf[8];

/* set output stream to wide-character mode, or halt. */
assert(fwide(stdout, 1) > 0);
assert (setlocale(LC_ALL, "") != NULL);
wprintf(L"ä \n");
}

Changes nothing. It *still* prints a lower-case 'a'.
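
The attempts above all call fwide() before setlocale(). A possible explanation, assuming glibc behaviour and not verified on this system, is that the stream's wide-to-multi-byte conversion is chosen at the moment the stream becomes wide-oriented, using the locale in effect then (still "C"), so a later setlocale() does not reach stdout. A sketch with the order reversed; prepare_wide_stream is a hypothetical name.

```c
#include <locale.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: select locale_name FIRST, then give fp wide
   orientation. Returns 0 on success, -1 if the locale is unavailable
   or the stream could not be made wide-oriented.
   Unlike calls placed inside assert(), these checks are not compiled
   out when NDEBUG is defined. */
int prepare_wide_stream(FILE *fp, const char *locale_name)
{
    if (setlocale(LC_ALL, locale_name) == NULL)
        return -1;
    if (fwide(fp, 1) <= 0)
        return -1;
    return 0;
}
```

In main() this would be called as prepare_wide_stream(stdout, "") before any output, followed by wprintf(L"\u00e4 \n").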

locale -a on my system returns
C
en_US.utf8
POSIX

The first is the default locale for C programs and is restricted to 7-bit
characters, according to the setlocale manpage.

The second is what my terminal programs are set to, and they show most
Unicode characters fine.

But none of them work. /usr/share/i18n/SUPPORTED lists 417 more. I decided
I would try a German locale, since it was explicitly recommended
upthread.

seventh attempt:

#include <stdio.h>
#include <wchar.h>
#include <stdlib.h>
#include <assert.h>
#include <locale.h>

int main(){
wchar_t vowel;
char utf[8];

/* set output stream to wide-character mode, or halt. */
assert(fwide(stdout, 1) > 0);
assert (setlocale(LC_ALL, "de_DE.UTF-8") != NULL);
wprintf(L"ä \n");
}

This time the call to setlocale returned NULL so the assert failed.
I suppose that means I need to download the corresponding locale data
before I can do that?
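
A NULL return from setlocale() means the C library could not find that locale, which fits the locale -a output above listing only C, en_US.utf8 and POSIX. Rather than asserting on a single name, a program could try several candidates in order. A minimal sketch; first_available_locale is a made-up name.

```c
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: try each candidate locale name in order and
   return the first one setlocale() accepts, or NULL if none of them
   is available on this system. */
const char *first_available_locale(const char *const *names, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (setlocale(LC_ALL, names[i]) != NULL)
            return names[i];
    }
    return NULL;
}
```

With the candidates { "de_DE.UTF-8", "en_US.utf8", "" }, a system like the one described here would be expected to skip the first name and settle on the second.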

Bear,
still having no luck....

