[reportlab-users] How can I edit PDF metadatas

Patrick Maupin pmaupin at gmail.com
Mon Nov 30 13:11:14 EST 2009


I've added some new functionality, and an example that is more in line
with the original request (to alter the title metadata in a
preexisting PDF). The required code is basically:

from pdfrw import PdfReader, PdfWriter

# inpfn and outfn are the input and output PDF file names
trailer = PdfReader(inpfn)
trailer.Info.Title = 'My New Title Goes Here'
writer = PdfWriter()
writer.trailer = trailer
writer.write(outfn)

The full example is at:

http://code.google.com/p/pdfrw/source/browse/trunk/examples/alter.py

Regards,
Pat

On Mon, Nov 30, 2009 at 11:33 AM, Patrick Maupin <pmaupin at gmail.com> wrote:

> I second the thought that pdftk is easy to use (I use it all the
> time), but if your source files are not encrypted and don't have PDF
> 1.5 compressed object streams, my new pdfrw library might be easier to
> use than pypdf.  For example, the way you can add metadata to the PDF
> trailer dictionary is:
>
> writer.trailer.Info = IndirectPdfDict(
>    Title = 'your title goes here',
>    Author = 'your name goes here',
>    Subject = 'what is it all about?',
>    Creator = 'some script goes here',
> )
>
> I have posted a complete working example:
>
> http://code.google.com/p/pdfrw/source/browse/trunk/examples/metadata.py
>
> Best regards,
> Pat
>
>
> On Mon, Nov 30, 2009 at 5:49 AM, Christian Jacobsen
> <cljacobsen at gmail.com> wrote:
>> You can also use pypdf. http://pybrary.net/pyPdf/
>>
>> This won't let you edit the metadata per se, but will let you read one
>> or more pdf file(s) and spit them back out, possibly with new
>> metadata. It is somewhat low level though. I find it useful for
>> automatically putting PDFs together for various things. I have
>> included an example below. I often also generate pages with reportlab
>> (i.e. separators between the PDFs that I am concatenating) which I
>> insert into the stream. I use the StringIO module to capture the page
>> from reportlab and then feed it to pypdf's PdfFileReader. I have also
>> been known to stamp a running page number onto the pages by merging a
>> reportlab generated page with just a page number with the input page.
>>
>> pdftk does all of this of course and is probably easier to use.
>>
>>  Christian
>>
>>
>> import pyPdf
>> from pyPdf import PdfFileWriter, PdfFileReader
>> from pyPdf.generic import NameObject, createStringObject
>>
>> OUTPUT = 'output.pdf'
>> INPUTS  = ['test1.pdf', 'test2.pdf', 'test3.pdf']
>>
>> # The writer that collects the pages and carries the metadata
>> output = PdfFileWriter()
>>
>> # There is no interface through pyPdf with which to set this other than getting
>> # your hands dirty like so:
>> infoDict = output._info.getObject()
>> infoDict.update({
>>    NameObject('/Title'): createStringObject(u'title'),
>>    NameObject('/Author'): createStringObject(u'author'),
>>    NameObject('/Subject'): createStringObject(u'subject'),
>>    NameObject('/Creator'): createStringObject(u'a script')
>>    })
>>
>> # PdfFileReader expects an open binary stream rather than a file name
>> inputs = [PdfFileReader(open(i, 'rb')) for i in INPUTS]
>> for input in inputs:
>>    for page in range(input.getNumPages()):
>>        output.addPage(input.getPage(page))
>>
>> outputStream = open(OUTPUT, 'wb')
>> output.write(outputStream)
>> outputStream.close()
>>
>>
>> 2009/11/26 Andy Robinson <andy at reportlab.com>:
>>> 2009/11/26 Dani Reguera <drbakhache at gmail.com>:
>>>> Can I open the file with reportlab and then set its title?
>>>
>>> reportlab creates files, but doesn't edit them.
>>>
>>> pdftk is writing the final file. I just googled it, and it has options
>>> to set the metadata on the command line.  See 'update_info' on this
>>> page...
>>>    http://www.accesspdf.com/pdftk/
>>>
>>> --
>>> Andy Robinson
>>> CEO/Chief Architect
>>> ReportLab Europe Ltd.
>>> _______________________________________________
>>> reportlab-users mailing list
>>> reportlab-users at lists2.reportlab.com
>>> http://two.pairlist.net/mailman/listinfo/reportlab-users
>>>
>
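
For anyone who prefers the pdftk route that Andy points to, here is a
minimal sketch of driving 'update_info' from Python. It is only a sketch
under stated assumptions: the helper names (build_info_text, update_info)
and the info.txt file name are made up for illustration, pdftk is assumed
to be on the PATH, and the InfoBegin/InfoKey/InfoValue record layout is
assumed to match what your pdftk prints for "pdftk in.pdf dump_data"
(older releases may not print the InfoBegin line, so check yours first).

```python
import subprocess


def build_info_text(fields):
    """Render metadata as the text records read by pdftk's update_info.

    Assumption: the layout mirrors the output of `pdftk in.pdf dump_data`.
    """
    lines = []
    for key, value in fields.items():
        lines.append("InfoBegin")
        lines.append("InfoKey: %s" % key)
        lines.append("InfoValue: %s" % value)
    return "\n".join(lines) + "\n"


def update_info(in_pdf, out_pdf, fields, info_path="info.txt"):
    """Write the info file, then ask pdftk to apply it to in_pdf.

    Hypothetical helper: requires the pdftk executable on the PATH.
    """
    with open(info_path, "w") as fh:
        fh.write(build_info_text(fields))
    subprocess.check_call(
        ["pdftk", in_pdf, "update_info", info_path, "output", out_pdf]
    )
```

Something like update_info('in.pdf', 'out.pdf', {'Title': 'My New Title
Goes Here'}) would then write a copy of in.pdf with the new title, leaving
the original file untouched.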


