Re: [PHP] Extracting text from PDF files



On 10/9/07, Jay Blanchard <jblanchard@xxxxxxxxxx> wrote:

[snip]
I need to extract the text from a PDF file for
storage in the database.
[/snip]

It depends. If the PDF is just a scanned image, there is no text in it to extract, so you cannot do it with PHP.

See the second user note at http://www.php.net/pdf


[snip]
Madison, WI 53703
[/snip]
P.S. Do you know of the Madison Scouts?


We use pdf2text.exe to extract text from PDFs. I don't know of any way to do
it using just PHP.

David
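
For anyone wanting to call an external converter such as the pdf2text.exe mentioned above from a PHP script, here is a rough sketch. It is not David's actual setup: the function names are made up for illustration, and the argument order (input PDF first, then output text file) is an assumption, so check your converter's own documentation before relying on it.

```php
<?php
// Sketch only. The converter name comes from the post above; the
// "converter <in.pdf> <out.txt>" calling convention is an ASSUMPTION.

/**
 * Build the shell command for a hypothetical "converter in.pdf out.txt" call.
 */
function buildExtractCommand(string $converter, string $pdfPath, string $txtPath): string
{
    // escapeshellarg() quotes each argument so that spaces or shell
    // metacharacters in a path cannot change the command.
    return escapeshellarg($converter) . ' '
        . escapeshellarg($pdfPath) . ' '
        . escapeshellarg($txtPath);
}

/**
 * Run the converter and return the extracted text, or null on failure.
 */
function extractPdfText(string $converter, string $pdfPath): ?string
{
    if (!is_readable($pdfPath)) {
        return null;
    }

    // Temporary file that the converter is asked to write its output to.
    $txtPath = tempnam(sys_get_temp_dir(), 'pdf');
    if ($txtPath === false) {
        return null;
    }

    exec(buildExtractCommand($converter, $pdfPath, $txtPath), $output, $exitCode);
    $text = ($exitCode === 0) ? file_get_contents($txtPath) : false;
    unlink($txtPath);

    // Empty output is treated as a failure; a PDF that is only a scanned
    // image would be expected to end up here.
    if ($text === false || trim($text) === '') {
        return null;
    }
    return $text;
}
```

Called as extractPdfText('pdf2text.exe', $uploadedFile), this returns a string you can store in the database, or null when the converter fails or finds no text.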

