[reportlab-users] RE: Creating Large PDF Files

Engel, Gregory reportlab-users@reportlab.com
Thu, 6 Mar 2003 08:10:29 -0700


Unfortunately, the data can change significantly from page to page.
What the application does is take raw data that is normally spooled to
huge greenbar printers (no paper trays here) and attempt to make a PDF.
Most often, this data contains discrete reports that are 1-10 MB in
size.  Occasionally, a single report can be 500+ MB and contain 90,000
pages.  Our clients are Fortune 100 companies with millions of
customers.  These reports are related to services provided to these
customers.  While I had originally proposed chopping the data up into
"chapters", the customers were not interested.  What they want is the
ability to view on screen what they normally have shipped to them in
paper form on several pallets each month.  Saving trees, yes.  But they
were not able to say who would actually need to view this much data at
once.  Since they are paying the bills, the point was not pressed.

I don't believe we have much say in the box used to make this happen.
We pressed hard for a dual CPU Sun box with 3 GB of RAM.  If purchasing
had their way, we would be developing this on Palm m100's.

I may not understand the idea of PDF "forms", so I shall look into that
more closely and see if there is some way to squeeze more data into
memory and push the envelope out a bit.

Many thanks for your efforts toward a solution.  I shall continue
working here as well.  If a solution can be found, I shall post it.


-----Original Message-----

(Sorry, hit the wrong key a moment ago and sent a blank reply).

Currently we do always process in memory.  PDF files are full of cross
references and it is impossible to write the file without knowing
everything about it; it's not like HTML or text where you can write the
beginning, the middle and then the end :-(.  So, we decided early
on to create the whole thing in memory.  Another approach would have
been to write each page to disk in a special format and assemble them
at the end, but we have not implemented that.  We also decided
that it should be possible to render, say, a 1000-page book
on a well-specced PC, and that was good enough.
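
The "write each page to disk and assemble at the end" idea could look
roughly like the sketch below.  It is not ReportLab code and not
something ReportLab implements; the function names spool_pages and
assemble_pdf, the spool layout, and the fixed Courier font are all
assumptions made for illustration.  Only a list of (offset, length)
pairs stays in memory while pages are produced; the cross-reference
table is written last, once every object's byte offset is known.

```python
import io
import tempfile


def spool_pages(pages, spool):
    """Append each page's content stream to the spool file.

    Returns a list of (offset, length) pairs, one per page.
    """
    index = []
    for content in pages:
        data = content.encode("latin-1")
        index.append((spool.tell(), len(data)))
        spool.write(data)
    return index


def assemble_pdf(spool, index, out):
    """Copy spooled pages into a minimal PDF, then write the xref table."""
    offsets = {}  # object number -> byte offset in `out`

    def write_obj(num, body):
        offsets[num] = out.tell()
        out.write(b"%d 0 obj\n" % num + body + b"\nendobj\n")

    out.write(b"%PDF-1.4\n")
    n = len(index)
    # Numbering (hypothetical scheme): 1 catalog, 2 page tree, 3 font,
    # then one page object and one content object per page.
    write_obj(3, b"<< /Type /Font /Subtype /Type1 /BaseFont /Courier >>")
    kids = []
    for i, (start, length) in enumerate(index):
        page_num, content_num = 4 + 2 * i, 5 + 2 * i
        spool.seek(start)
        stream = spool.read(length)
        write_obj(content_num,
                  b"<< /Length %d >>\nstream\n" % length
                  + stream + b"\nendstream")
        write_obj(page_num,
                  b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
                  b"/Resources << /Font << /F1 3 0 R >> >> "
                  b"/Contents %d 0 R >>" % content_num)
        kids.append(b"%d 0 R" % page_num)
    # The page tree and catalog need the full page list, so they come last.
    write_obj(2, b"<< /Type /Pages /Count %d /Kids [" % n
              + b" ".join(kids) + b"] >>")
    write_obj(1, b"<< /Type /Catalog /Pages 2 0 R >>")

    # Cross-reference table: one 20-byte entry per object, including
    # the mandatory free entry for object 0.
    xref_pos = out.tell()
    total = 4 + 2 * n
    out.write(b"xref\n0 %d\n" % total)
    out.write(b"0000000000 65535 f \n")
    for num in range(1, total):
        out.write(b"%010d 00000 n \n" % offsets[num])
    out.write(b"trailer\n<< /Size %d /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n"
              % (total, xref_pos))


# Usage: three tiny pages, spooled to a temporary file, then assembled.
demo_pages = ["BT /F1 12 Tf 72 720 Td (Page %d) Tj ET" % i for i in (1, 2, 3)]
with tempfile.TemporaryFile() as spool_file:
    page_index = spool_pages(demo_pages, spool_file)
    demo_out = io.BytesIO()
    assemble_pdf(spool_file, page_index, demo_out)
```

In this sketch the per-page bookkeeping (offsets and the /Kids list)
still grows with the page count, but it is a few bytes per page rather
than the page content itself.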

The first thing to ask is, are you making efficient use of forms
for any content which is common across many pages?
I've seen repetitive apps (ten thousand customer statements)
which could be done in 5 MB one way or 500 MB another way.

The next thing is, is it cheaper to buy a small fistful of
memory chips than to write code to

The third is, if your app is that big, would it not be easier
for everyone to split it into 'chapters'?  It's generally bad to
hit print on a document more than 500 pages long as you are almost
certain to have to change paper trays in mid-printing.

These are all 'excuses' and not solutions but worth asking your
manager about.  If it's really critical, get back to me and I may
be able to outline a solution, but it will need some work...

- Andy