Re: funny blobs
- From: "3rdshiftcoder" <go@xxxxxxxx>
- Date: Mon, 30 Jul 2007 01:28:01 -0400
"slebetman@xxxxxxxxx" <slebetman@xxxxxxxxx> wrote in message
news:1185771811.718066.106070@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
You're already successful since you can already get the "funny
symbols". Your problem is not how to read since you can already do
that. Your problem is how to parse an MS Word file.
Now, this is no small matter since even Microsoft Word 6.0 will have
problems parsing a Microsoft Word 2001 file (yes, there is such an
application as Word 2001, hint: it's not for Windows). There are
plenty of efforts out there to reverse engineer the MS Word .doc
format, everything from OpenOffice's attempt at feature and bug
compatibility to Google Docs' (formerly Writely) good-enough approach.
Unfortunately I don't know of any pure-tcl implementation of a
Word .doc parser.
Why are you using Word files in the first place? Can't you use a
simpler-to-parse format like plain .txt or HTML (both of which can be
exported from Word documents)?
hi slebetman -
i kind of achieved what i wanted tonight. i know what you mean by parsing.
i wanted to use a word file because i knew it was binary. i wanted to know
whether i could stick a binary file (word) into a table in the sqlite db,
then extract it with sql and look at the file contents. for a while, when i
was fooling around, i could only see the encoded or decoded printout of the
file, so i wondered: did i actually do it? i couldn't see the text originally
typed into the word file, just gibberish (to me anyway). finally, when i
output the binary data correctly to a file, i could see that it worked.
you could kind of see that i had read it, but i couldn't.
i really don't need to parse it. that is really good advice, though, on
choosing the right format to parse.
thanks very much.
have a good week :-)
jim
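
For anyone trying the same round trip, here is a minimal sketch of the idea described above: read a file as raw bytes, store it in a BLOB column, pull it back out with SQL, and write it to a new file in binary mode. It uses Python's standard sqlite3 module purely for illustration and is not the code from this thread. The table name `files`, the helper names, and the dummy byte payload standing in for a Word file are all assumptions.

```python
import os
import sqlite3
import tempfile


def store_file(conn, name, path):
    """Read a file in binary mode and insert its bytes as a BLOB."""
    with open(path, "rb") as f:
        data = f.read()
    # A Python bytes object is bound as a BLOB, so nothing gets re-encoded.
    conn.execute("INSERT INTO files (name, data) VALUES (?, ?)", (name, data))
    conn.commit()


def extract_file(conn, name, out_path):
    """Fetch the BLOB for `name` and write it back out in binary mode."""
    row = conn.execute(
        "SELECT data FROM files WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    with open(out_path, "wb") as f:
        f.write(row[0])


def round_trip_demo():
    """Store a dummy binary file, extract it, and compare byte for byte."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per stored file.
    conn.execute("CREATE TABLE files (name TEXT PRIMARY KEY, data BLOB)")

    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "in.doc")
    dst = os.path.join(workdir, "out.doc")

    # Stand-in for a Word document: every possible byte value, repeated.
    original = bytes(range(256)) * 4
    with open(src, "wb") as f:
        f.write(original)

    store_file(conn, "in.doc", src)
    extract_file(conn, "in.doc", dst)

    with open(dst, "rb") as f:
        restored = f.read()
    conn.close()
    return original == restored


if __name__ == "__main__":
    print("round trip intact:", round_trip_demo())
```

Printing the blob to a terminal will still look like gibberish. Comparing the extracted file with the original (or opening it in Word) is the more reliable check that the bytes survived.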