Re: funny blobs
- From: "slebetman@xxxxxxxxx" <slebetman@xxxxxxxxx>
- Date: Sun, 29 Jul 2007 22:03:31 -0700
On Jul 30, 11:45 am, "3rdshiftcoder" <g...@xxxxxxxx> wrote:
> hi -
> i think from code snippets on the web i put a ms word doc in
> a db table. if i encode i get printable ascii. if i decode i get
> funny symbols.
That's because the "funny symbols" you see are the MS Word file itself:
.doc is a binary format, not text. Try opening your binWord.doc file in
Notepad and you'll likely see the same funny symbols.
> how do you read the data in the db table if it comes out binary
> decoded?
You have already succeeded: getting the "funny symbols" back means the
data is being read correctly. Your problem is not how to read the data
but how to parse an MS Word file.
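One way to convince yourself that the read really is working is to check that the decoded bytes match the original and start with the signature of a legacy binary .doc (an OLE2 compound file). Here is a minimal sketch in Python, used only for illustration. It assumes the "encode" step is base64, which the original post does not say, and it uses made-up stand-in bytes rather than a real document; the helper name looks_like_legacy_doc is hypothetical.

```python
import base64

# Signature that opens every OLE2 compound file, the container used by
# legacy binary .doc files.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_legacy_doc(blob: bytes) -> bool:
    """Return True if blob starts with the OLE2 compound-file signature."""
    return blob.startswith(OLE2_MAGIC)


# Stand-in for the real file contents (assumption): the signature
# followed by filler bytes, not an actual Word document.
original = OLE2_MAGIC + bytes(range(256))

# Assumed encoding step: base64 turns the bytes into printable ASCII.
encoded = base64.b64encode(original)

# Decoding gives back the raw bytes, i.e. the "funny symbols".
decoded = base64.b64decode(encoded)

print(decoded == original)             # round trip preserved every byte
print(looks_like_legacy_doc(decoded))  # bytes begin with the OLE2 signature
```

If both checks pass on your real data, storage and retrieval are fine, and what remains is the parsing problem.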
Now, this is no small matter, since even Microsoft Word 6.0 will have
problems parsing a Microsoft Word 2001 file (yes, there is such an
application as Word 2001; hint: it's not for Windows). There are
plenty of efforts out there to reverse engineer the MS Word .doc
format, ranging from OpenOffice's attempt at feature and bug
compatibility to the good-enough approach of Google Docs (formerly
Writely). Unfortunately, I don't know of any pure-Tcl implementation
of a Word .doc parser.
Why are you using Word files in the first place? Can't you use a format
that is simpler to parse, such as plain .txt or HTML (both of which
Word can export)?