[reportlab-users] How to read and edit an existing PDF file

Tim Roberts timr at probo.com
Fri Jan 27 13:26:33 EST 2006


On Thu, 26 Jan 2006 14:21:54 -0500, Gregory Piñero
<gregpinero at gmail.com> wrote:

>Hi guys,
>
>cool library.  I didn't see in the documentation how I can read in a
>PDF file and just change a few words in it.  Is there a simple way to
>do that?
>


If the PDF is not encrypted, then this is not so hard.  The only issue
is that most PDFs are compressed, so that all you see in the PDF is a
big block of binary garbage (usually Flate-compressed streams, not
readable text).  Fortunately, that's easy to work around.
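
To see why the words cannot be found in a compressed file, here is a
minimal Python sketch.  It assumes the stream uses the common
FlateDecode filter, which is plain zlib.  The sample content stream is
made up for illustration and is not taken from a real file.

```python
import zlib

# Hypothetical page content stream: draw "Copyright 1976" at (72, 720).
content = b"BT /F1 12 Tf 72 720 Td (Copyright 1976) Tj ET"

# A FlateDecode stream stores zlib-compressed bytes like these on disk.
compressed = zlib.compress(content)

# Check whether the year is still findable as plain bytes
# (after compression it normally is not).
print(b"1976" in compressed)

# Inflating the stream brings the readable operators back.
restored = zlib.decompress(compressed)
print(restored.decode("ascii"))
```

This is only the decompression step in isolation.  Pulling the streams
out of a real PDF takes more bookkeeping, which is what pdftk handles.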

Go fetch pdftk, a toolkit that no PDF jockey should be without.  Use
"pdftk incoming.pdf output outgo.pdf uncompress".  Now the page content
in outgo.pdf will be ordinary text: PostScript-like drawing operators
rather than compressed streams.  Bring that up into your favorite
editor, and you should be able to find the words you want to change.
You can then recompress it, if you want to, with
"pdftk outgo.pdf output final.pdf compress".
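
If you would rather drive that round trip from Python, a minimal
sketch follows.  It assumes pdftk is installed and on your PATH.  The
helper names pdftk_argv and run_pdftk are made up for this example, and
the file names are placeholders.

```python
import subprocess

def pdftk_argv(src, dst, mode):
    """Build the pdftk command line; mode is "uncompress" or "compress"."""
    if mode not in ("uncompress", "compress"):
        raise ValueError("mode must be 'uncompress' or 'compress'")
    return ["pdftk", src, "output", dst, mode]

def run_pdftk(src, dst, mode):
    # Raises CalledProcessError if pdftk exits with a non-zero status.
    subprocess.run(pdftk_argv(src, dst, mode), check=True)

# Typical round trip (nothing is run until you call these):
#   run_pdftk("incoming.pdf", "outgo.pdf", "uncompress")
#   ... edit outgo.pdf ...
#   run_pdftk("outgo.pdf", "final.pdf", "compress")
```

Passing the arguments as a list avoids shell quoting problems with file
names that contain spaces.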

Note that the page description places each string (or substring, or
letter, in some cases) individually; nothing gets reflowed.  This kind
of tweaking is only possible if your replacement is about the same size
as the original.  If you want to replace "1976" with "2012", that will
work fine.  If you want to replace "Clinton" with "President George
Walker Bush", that won't work.  The new text will overlap whatever
follows it.
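
The same-size rule can be enforced in code.  Below is a minimal sketch,
assuming the text you want sits in the uncompressed file as one
contiguous string; replace_same_length is a hypothetical helper, and
the sample stream is invented.  A same-length swap also leaves every
byte offset in the file unchanged, which is convenient because a PDF's
cross-reference table records object positions as byte offsets.

```python
def replace_same_length(data, old, new):
    """Replace old with new in raw PDF bytes, refusing a length change."""
    if len(old) != len(new):
        raise ValueError("replacement must be the same length as the original")
    if old not in data:
        raise ValueError("original text not found; it may be split into pieces")
    return data.replace(old, new)

# Invented content stream for illustration.
stream = b"BT /F1 12 Tf 72 720 Td (Copyright 1976) Tj ET"
edited = replace_same_length(stream, b"(Copyright 1976)", b"(Copyright 2012)")
print(edited)
```

On a real file you would read and write in binary mode ("rb" and "wb")
so the rest of the PDF is not altered.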

The web press business uses what they call "pre-flight" tools, which do
some incredible PDF manipulations.  They're almost like a PDF-based
word processor.  However, they are fabulously expensive.

-- 
Tim Roberts, timr at probo.com
Providenza & Boekelheide, Inc.


