Re: RTF and UTF-8 files in Perl




"V S Rawat" <VSRawat@xxxxxxxxxxxx> wrote in message
news:xn0e93zst5r39c000@xxxxxxxxxxx
> 1. How do I open a RTF file as input in Perl and read formatted
> ASCII text from it?
>

You don't. There's no such thing as formatted ASCII text. You could look
into a module such as RTF::Tokenizer if you want to parse apart the RTF file
and extract the text from it. If you want to know what formatting has been
applied you'll also need to check the formatting commands as you go.
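
A minimal sketch of that approach, assuming RTF::Tokenizer is installed from CPAN. The helper name rtf_text_tokens and the sample RTF string are made up for illustration. It only collects the tokens the module reports as text and ignores the formatting commands:

```perl
use strict;
use warnings;
use RTF::Tokenizer;

# Hypothetical helper: concatenates every 'text' token found in an RTF string.
sub rtf_text_tokens {
    my ($rtf) = @_;
    my $tokenizer = RTF::Tokenizer->new( string => $rtf );
    my $text = '';
    while (1) {
        # get_token() returns a (type, argument, parameter) list
        my ($type, $argument, $parameter) = $tokenizer->get_token();
        last if $type eq 'eof';
        # 'control' and 'group' tokens carry the formatting; skipped here
        $text .= $argument if $type eq 'text';
    }
    return $text;
}

# Made-up sample document, only for demonstration
print rtf_text_tokens('{\rtf1\ansi Hello {\b bold} world}'), "\n";
```

To read from a file instead, the constructor also accepts file => 'name.rtf'. On a real document this naive loop will also pick up text inside groups such as the font table, so you would probably need to watch the 'group' and 'control' tokens and skip those groups.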

> 2. How do I open a UTF-8 (Unicode) file as output in Perl and
> write Unicode text to it?
>

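Assuming $mydata holds decoded character data, open the file with an
encoding layer and print to it. The layer does the encoding for you, so
don't also call Encode::encode() on the data, or it will be encoded twice:

[untested]

open(my $out, '>:encoding(UTF-8)', 'somefile.dat') or die "Could not open file: $!";
print $out $mydata;
close($out) or die "Could not close file: $!";

If $mydata already holds UTF-8 encoded bytes, open the file with '>:raw'
instead and print it unchanged.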
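
One way to check what actually lands on disk is to write through an encoding layer and then read the file back as raw bytes. The sketch below uses only core modules. The helper names write_utf8_file and read_raw_bytes and the sample string are made up for illustration. Note that the character data is printed as-is, because the ':encoding(UTF-8)' layer does the encoding by itself:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: writes a string of Perl characters to $path as UTF-8.
sub write_utf8_file {
    my ($path, $text) = @_;
    open(my $out, '>:encoding(UTF-8)', $path)
        or die "Could not open $path: $!";
    print {$out} $text;    # no encode() call here; the layer encodes
    close($out) or die "Could not close $path: $!";
    return;
}

# Hypothetical helper: returns the file's contents as undecoded bytes.
sub read_raw_bytes {
    my ($path) = @_;
    open(my $in, '<:raw', $path) or die "Could not open $path: $!";
    local $/;              # slurp the whole file in one read
    my $bytes = <$in>;
    close($in) or die "Could not close $path: $!";
    return $bytes;
}

# Made-up sample: e-acute (U+00E9) and a smiley (U+263A)
my $text = "caf\x{e9} \x{263a}";
write_utf8_file('somefile.dat', $text);

# Compare the bytes on disk with an explicit one-time encoding of the string
my $same = read_raw_bytes('somefile.dat') eq encode('UTF-8', $text);
print "bytes on disk match a single UTF-8 encoding: ", ($same ? "yes" : "no"), "\n";
```

If the data were passed through encode() before being printed to such a handle, each byte of the encoded string would be treated as a character and encoded again, which is the usual cause of doubled-up accented characters in the output.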

Matt

