Re: Fw: PDF library for reading PDF files

From: Cameron Laird (claird_at_lairds.com)
Date: 01/20/04


Date: Tue, 20 Jan 2004 15:32:48 -0000

In article <400CF2E3.29506EAE@netsurf.de>,
Andreas Lobinger <andreas.lobinger@netsurf.de> wrote:
>Aloha,
>
>Peter Galfi wrote:
                        .
                        .
                        .
>> having to implement all the decompressions, etc. The "information" I am
>> trying to extract from the PDF file is the text, specifically in a way to
>> keep the original paragraphs of the text. I have seen so far one shareware
                        .
                        .
                        .
>As others wrote here, the simplest solution is to use an external
>pdf-to-text program and postprocess the data. Read comp.text.pdf
>
>There is no simple and consistent way to extract text from a .pdf
>because there are many ways to set text. The optical impression
                        .
                        .
                        .
I want to emphasize that final sentence. If you insist on pursuing
this, though, refer to <URL:
http://phaseit.net/claird/comp.text.pdf/PDF_converters.html#pdf2txt >.
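
For the record, the "external converter plus postprocessing" route
suggested above can be sketched roughly as follows. This is only a
sketch under assumptions: it presumes the pdftotext command-line tool
(shipped with Xpdf and Poppler) is installed and on the PATH, the
function names pdf_to_text and split_paragraphs are made up for
illustration, and the blank-line heuristic for finding paragraphs will
fail on many PDFs, for exactly the reason given in the quoted text.

```python
import re
import subprocess
import sys


def pdf_to_text(pdf_path):
    """Run the external pdftotext tool and return its output as a string.

    Assumes pdftotext is installed; raises CalledProcessError if it fails.
    """
    # "-" as the output file name tells pdftotext to write to stdout.
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],
        check=True,
        capture_output=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def split_paragraphs(text):
    """Heuristic postprocessing: blank lines and page breaks end a paragraph.

    Lines inside a paragraph are joined with single spaces. This is a guess
    about layout, not something the PDF itself guarantees.
    """
    # pdftotext separates pages with form feeds; treat them as breaks too.
    text = text.replace("\f", "\n\n")
    paragraphs = []
    for block in re.split(r"\n\s*\n", text):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        if lines:
            paragraphs.append(" ".join(lines))
    return paragraphs


if __name__ == "__main__":
    # Usage: python pdf_paragraphs.py some_file.pdf
    for paragraph in split_paragraphs(pdf_to_text(sys.argv[1])):
        print(paragraph)
        print()
```

Keeping the conversion and the postprocessing in separate functions
means the paragraph heuristic can be tuned, or swapped out, without
touching the call to the converter.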

-- 
Cameron Laird <claird@phaseit.net>
Business:  http://www.Phaseit.net

