Re: Identifying File type by reading files

From: Andrew Dalke (adalke_at_mindspring.com)
Date: 12/26/03


Date: Fri, 26 Dec 2003 19:46:08 GMT

hokiegal99:
> what should I look for in a file to determine whether or not it is a
> MS Word file or an Excel file or a PDF file, etc., etc.? Below is a
> list of some of the strings I use to ID files, but I can't help but
> wonder that there must be a more precise way of doing this. I know of
> the Unix 'file' command. It is not very useful for me as it doesn't
> distinguish between MS Office documents... all .xls, .docs, .ppts are
> MS documents to it.

That likely means you have an incomplete 'magic' file. This is the
file used by the 'file' command to figure out the file type. Take a
look at http://www.unixhideout.com/freebsd/share/misc/magic for
a more complete (I think) version.

That's dated 1995 and is close to the one on my Mac. It doesn't support
the newer MS Word and Excel formats. I'm having trouble
finding the most recent, definitive version. One link pointed me
to ftp://ftp.astron.com/pub/file/ but I haven't investigated it further.

There's also a pymagic, http://thomas.mangin.me.uk/software/python.html
which may help for a pure Python implementation of 'file'.
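
To make the idea concrete, here is a rough sketch of what a 'magic' lookup
does: compare the leading bytes of a file against known signatures. The
signature table, the function names, and the stream-name heuristic are my own
illustration, not taken from 'file' or pymagic. Legacy Word, Excel and
PowerPoint files all share the same OLE2 container header, which is why a
check on the leading bytes alone reports them all as generic MS documents;
the sketch looks for a well-known stream name to tell them apart.

```python
# Sketch only: a tiny illustrative subset of signatures, not a replacement
# for a full 'magic' database.
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"

SIGNATURES = [
    (b"%PDF-", "PDF document"),
    (OLE2_MAGIC, "OLE2 compound document (legacy MS Office)"),
    (b"PK\x03\x04", "ZIP archive"),
]

# Stream names found inside OLE2 containers; the container directory stores
# them as UTF-16LE. Older Excel files may use a different stream name.
OLE2_STREAMS = [
    ("WordDocument", "MS Word document"),
    ("Workbook", "MS Excel workbook"),
    ("PowerPoint Document", "MS PowerPoint presentation"),
]


def identify_bytes(data):
    """Return a best-guess description for the given file contents."""
    for magic, description in SIGNATURES:
        if data.startswith(magic):
            if magic == OLE2_MAGIC:
                # Heuristic (assumption): scan the raw bytes for a stream
                # name instead of parsing the container directory properly.
                for name, office_type in OLE2_STREAMS:
                    if name.encode("utf-16-le") in data:
                        return office_type
            return description
    return "unknown"


def identify_file(path):
    """Read a file in binary mode and guess its type from its contents."""
    with open(path, "rb") as handle:
        return identify_bytes(handle.read())
```

Calling identify_file("report.doc") (a hypothetical path) would return one of
the descriptions above, or "unknown". Scanning the whole file for a stream
name is crude and can misfire on documents with embedded objects; a real tool
would parse the OLE2 directory instead.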

                    Andrew
                    dalke@dalkescientific.com


