Re: PDF Parser



Hi Mark,

You can do this with PDFtoolkit VCL; the Standard edition supports it as well.
http://www.gnostice.com/pdftoolkit.asp

PDFtoolkit can also read/parse PDF version 1.5 and 1.6 (Acrobat 6 and 7) files.

Here's some code:

Read document properties
------------------------

gtPDFDocument.LoadFromFile('Input.pdf');
with gtPDFDocument.DocInfo do
begin
  // Store into local variables
  lTitle := Title;
  lAuthor := Author;
  lSubject := Subject;
  lKeywords := Keywords;
  lCreator := Creator;
end;

Write document properties
-------------------------

gtPDFDocument.LoadFromFile('Input.pdf');
with gtPDFDocument.DocInfo do
begin
  // Write into document properties
  Title := 'Document Title';
  Author := 'Document Author';
  Subject := 'Document Subject';
  Keywords := 'PDF, Keywords';
  Creator := 'The Creator';
end;
gtPDFDocument.SaveToFile('Output.pdf');

Let us know if you have further questions.

--
Girish Patil
Gnostice Information Technologies www.gnostice.com
---------------------------------------------------------------------
Gnostice eDocEngine (http://www.gnostice.com/edoc_engine.asp) -
Electronic document creation, Report Export, PDF eForms creation...

Gnostice PDFtoolkit (http://www.gnostice.com/pdftoolkit.asp) -
View, Print, Convert, Modify, Enhance PDF docs, process PDF eForms...
---------------------------------------------------------------------

"Mark Williams" <mark@{removethis}skwirel.com> wrote in message
news:43319f51$1@xxxxxxxxxxxxxxxxxxxxxxxxx
> Hi,
>
> I've been trying to write a function that extracts document properties from a
> PDF, e.g. author, title, page count, etc.
>
> I take a few steps forward followed by many steps back and I've now become too
> frustrated with it.
>
> Has anyone managed to do this reliably, or does anyone know of any third-party
> components that can do this? I don't want a full-blown PDF component set, just a
> lightweight component or code sample that does just this.
>
> Thanks in advance.
>
> Mark Williams
>

