Encoding problem with automation of Word by Perl



In the code snippet below @foutput is an array of 'paragraphs': which are
arrays of 'text items': which are arrays with two elements, the first being
a text string and the second another text string which holds formatting
information (currently 'b' for bold 's' for superscript and '' for normal).

The code works in than it opens a Word document and produces formatted text
therein. The problem is that non-ascii Unicode characters do not transfer
cleanly. I expect that Perl and Word are making different assumptions about
what encoding is in use (Word seems to be recieving utf-8 but interpreting
it as 'code page 1252') but I don't know how change it.

Here is the code snippet (use strict and use warnings are in operation at
the top of the file) :

elsif ($opt{w}) {
use Win32::OLE;
my $word = CreateObject Win32::OLE 'Word.Application' or die $!;
$word->{'Visible'} = 1;
my $document = $word->Documents->Add;
my $selection = $word->Selection;
my $i = 0;
foreach my $para (@foutput) {
$i++; last if $i == 5; # just a few for debugging
foreach (@{$para}) {
if (@{$_}[1] eq "") {
$selection->TypeText(@{$_}[0]);
}
elsif (@{$_}[1] eq "b") {
$selection->Font->{Bold} = 1;
$selection->TypeText(@{$_}[0]);
$selection->Font->{Bold} = 0;
}
elsif (@{$_}[1] eq "s") {
$selection->Font->{Superscript} = 1;
$selection->TypeText(@{$_}[0]);
$selection->Font->{Superscript} = 0;
}
else {
die "Unknown formatting: " . @{$_}[1];
}
}
$selection -> TypeParagraph;
}


.



Relevant Pages

  • Re: Hot to parse RTF String
    ... Define styles in the Word document template containing the formatting ... Use VBA in Word to retrieve the strings from the database. ... automatically apply the associated style to each string as it brings it in. ... Word document, but parsed, that is formatted. ...
    (microsoft.public.word.docmanagement)
  • Re: New Python 3.0 string formatting - really necessary?
    ... When working on resource-oriented, generalized display applications ... ways - and on the web, display means "convert to a string for the ... the template knows how I ... I can solve this using the new string formatting. ...
    (comp.lang.python)
  • Re: Pulling from Access Query, Cannot Change Date/Time Formats
    ... With them you can specify DataFormatString to apply string ... With string formatting you get complete control over how date or ... Sub doColor ... Dim i9, i10 As Integer ...
    (microsoft.public.dotnet.framework.aspnet)
  • Re: Track Changes VBA Granular information needed
    ... If the second string occurs somewhere in the first ... Bold, Italic". ... If you're keeping separate track of bold formatting and italic ... extract more specific changes. ...
    (microsoft.public.word.vba.beginners)
  • Re: Calculating Time Sheet
    ... show the time duration, but only if less than 24 hours. ... summing them and formatting the result as time will ... Public Function TimeElapsedAs String ... In query for instance you could have a computed column which calls the ...
    (microsoft.public.access.gettingstarted)