Re: Word to text translation



On 2008-04-12, kenoli <kenoli.p@xxxxxxxxx> wrote:
Have you ever seen the gack that Word puts in its html files? They
are really xml files with all kinds of special definitions. I have
found a web site that will remove it all, one file at a time, which is
useful for cleaning up a file now and then. What I am trying to do is
find something that will let me batch upload files and let a php
script do the work. I have more material than I can handle one file
at a time.

Thanks,

--Kenoli

First of all, there are no fully reliable programs for converting MS Word
to text. The binary format is built on proprietary formatting commands
that can change, and have changed, from version to version. By and large,
the converters are hit and miss.

You can search Google for converters and try them, but the results will
probably be less than you might like. antiword comes to mind.
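
If a command-line converter such as antiword turns out to be good enough,
the batch part is easy to script. Here is a rough sketch in Python rather
than PHP. It assumes antiword is installed and on your PATH, and that it
writes the plain text to standard output when given a .doc file. The
function name batch_convert and the directory layout are made up for
illustration.

```python
import subprocess
from pathlib import Path


def batch_convert(src_dir, dst_dir, command=("antiword",)):
    """Run a converter on every .doc file in src_dir and save its output.

    `command` is the converter invocation (assumed: antiword on PATH);
    the input path is appended to it. Returns a list of
    (filename, error message) pairs for the files that failed.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    failures = []
    for doc in sorted(src.glob("*.doc")):
        result = subprocess.run([*command, str(doc)], capture_output=True)
        if result.returncode != 0:
            # Keep going: one unreadable file should not stop the batch.
            message = result.stderr.decode(errors="replace").strip()
            failures.append((doc.name, message))
            continue
        # Write the converter's output next to the others as <name>.txt.
        (dst / (doc.stem + ".txt")).write_bytes(result.stdout)
    return failures
```

Calling batch_convert("uploads", "text") would leave one .txt file per
.doc file in the text directory and hand back the names of any files the
converter choked on, so you can deal with those by hand.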

However, if you have access to MSword you can try the following:

Check the Save As dialog and see if it has a save-as-text option; it had
this option at one time. If it does, write a quick and dirty macro to
save a list of files as text files.

Check the File menu for the various export options. Get the document into
another format, then convert that format: maybe Word => RTF => text, or
Word => strict HTML (this option should be there) => text.
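
For the HTML => text leg, something along these lines might do. It is
only a sketch, in Python rather than PHP, using the standard library's
html.parser. The names TextExtractor and html_to_text are invented here,
and the choice of which tags count as block-level is a guess you would
probably need to tune against real Word output.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, dropping tags, comments, styles and scripts."""

    SKIP = {"head", "title", "style", "script"}   # content is discarded
    BLOCK = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Return the text of an HTML string, one non-empty line per block."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    raw = "".join(parser.parts)
    # Collapse runs of whitespace inside each line and drop blank lines.
    lines = (" ".join(line.split()) for line in raw.splitlines())
    return "\n".join(line for line in lines if line)
```

Unknown tags, including Word's own namespaced ones, are simply ignored and
only their text is kept, so the markup Kenoli is complaining about falls
away without needing a list of what to strip.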

You might also try looking at hotscripts.com

ken
