Re: Word to text translation
- From: Preventer of Work <not_this@xxxxxxxxxx>
- Date: Sat, 12 Apr 2008 04:39:12 GMT
kenoli wrote:
Have you ever seen the gack that Word puts in its html files? They
are really xml files with all kinds of special definitions. I have
found a web site that will remove it all, one file at a time, which is
useful for cleaning up a file now and then. What I am trying to do is
find something that will let me batch upload files and let a php
script do the work. I have more material than I can handle one file
at a time.
Thanks,
--Kenoli
On Apr 11, 7:19 pm, Preventer of Work <not_t...@xxxxxxxxxx> wrote:
kenoli wrote:
Does anyone know a class or other script for translating the contents
of a MSWord document into a text file with simple formatting, e.g.
paragraph breaks, not totally mangling lists, etc. so it can be stored
in a text field in a mysql database.
Don't know of anything that does that directly.
The point of this is storing data from documents so that selections
can be cut and pasted into another database where it will be utilized
as text content in a database driven web site.
I realize that one way to do this is to simply link to the actual
MSWord file located in a directory. Putting it into a database field,
however, would be useful as I don't care about the formatting, aside
from keeping it readable. Having it in this form makes it possible to
easily copy and paste stuff from fields in the one database to fields
in the database driving the web site.
Thanks,
--Kenoli
You could export them from Word as html files - it is at least text, and
there are parsers for html.
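A rough sketch of that HTML route, in Python rather than PHP and using only the standard library's html.parser. It is not a tested tool for real Word output. The names WordHtmlToText, html_to_text and convert_directory are made up for illustration, and the cp1252 default encoding is an assumption about how Word saved the files. It drops everything in head/style/script, keeps paragraph breaks, and marks list items with "- ".

```python
import re
from html.parser import HTMLParser
from pathlib import Path


class WordHtmlToText(HTMLParser):
    """Collect visible text; hypothetical helper, not part of any library."""

    BLOCK_TAGS = {"p", "div", "br", "tr", "li",
                  "h1", "h2", "h3", "h4", "h5", "h6"}
    SKIP_TAGS = {"head", "style", "script"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0  # > 0 while inside head/style/script

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self.skip_depth += 1
        elif tag == "li":
            self.parts.append("\n- ")
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")
        # Word-specific tags such as <o:p> fall through and are ignored.

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self.skip_depth:
            # Source line wraps inside a paragraph become single spaces.
            self.parts.append(re.sub(r"\s+", " ", data))


def html_to_text(html):
    parser = WordHtmlToText()
    parser.feed(html)
    parser.close()
    blocks = [b.strip() for b in "".join(parser.parts).split("\n")]
    blocks = [b for b in blocks if b]
    out = ""
    for prev, block in zip([None] + blocks, blocks):
        if prev is None:
            out = block
        elif prev.startswith("- ") and block.startswith("- "):
            out += "\n" + block      # keep list items together
        else:
            out += "\n\n" + block    # blank line between paragraphs
    return out


def convert_directory(src_dir, dst_dir, encoding="cp1252"):
    """Convert every .htm/.html file in src_dir to a .txt file in dst_dir."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(Path(src_dir).iterdir()):
        if src.suffix.lower() not in {".htm", ".html"}:
            continue
        html = src.read_text(encoding=encoding, errors="replace")
        out = dst / (src.stem + ".txt")
        out.write_text(html_to_text(html), encoding="utf-8")
        written.append(out)
    return written
```

Calling convert_directory("exported_html", "plain_text") would then handle a whole folder in one go (both folder names are placeholders). The resulting text could be loaded into the database field afterwards.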
The MS Visual Studio languages come with Word APIs. You can search, extract text, stuff like that (I've not used them, but do know they exist). You could write a program that pulls out all the text from as many files as you want at one time.
You can also do that with OpenOffice.org on any platform. You can have a program tell it to open and import Word files, then pull content out - same as VS/Word operations.
http://api.openoffice.org/
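One way the office-suite route might look as a batch job, sketched in Python. This does not use the API linked above. It shells out to the command-line converter instead, and it assumes a LibreOffice-style "soffice" binary on the PATH that accepts --headless, --convert-to and --outdir. Whether a given OpenOffice.org install supports those options is something to check first. The function names are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(doc_paths, out_dir, soffice="soffice"):
    """Build the argument list for one headless batch conversion to text."""
    return [soffice, "--headless", "--convert-to", "txt:Text",
            "--outdir", str(out_dir), *[str(p) for p in doc_paths]]


def convert_word_files(src_dir, out_dir, soffice="soffice"):
    """Convert every .doc/.docx in src_dir; returns the expected .txt paths."""
    docs = sorted(p for p in Path(src_dir).iterdir()
                  if p.suffix.lower() in {".doc", ".docx"})
    if not docs:
        return []
    # Assumption: the converter is installed and reachable under this name.
    if shutil.which(soffice) is None:
        raise FileNotFoundError(f"{soffice} not found on PATH")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_convert_command(docs, out_dir, soffice), check=True)
    return [Path(out_dir) / (p.stem + ".txt") for p in docs]
```

The command is built as a list rather than a shell string, so file names with spaces need no quoting.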
I know this isn't what you wanted, but maybe someone else will remember seeing something based on these. Such a tool should be handy to lots of people.