[reportlab-users] Patch to platypus/tables.py

Robin Becker robin at reportlab.com
Fri Feb 5 08:22:59 EST 2010


I apologize to everyone who received this 2 MB PDF. My colleague and I evidently
pressed the wrong button on the mailman control panel, and instead of being
forwarded to me the message was accepted for the list.

Tomasz, next time (if there is one) please send the test script rather than the
PDF. That way we can actually measure the performance gain.

Tomasz, can you send me a suitable test script and data? There was a lot of
discussion about this and other table optimisations a while back. Were you using
the LongTable class, etc.?
--
Robin Becker


On 03/02/2010 15:57, Tomasz Świderski wrote:

> Hello,
>
> I would like to submit a patch to platypus/tables.py. The current code (2.4)
> is very slow with long tables. My table is about 480 pages long, so there are
> a lot of splits during PDF generation. During each split two new Table
> instances are created: one with enough data to fit into the current frame,
> and a second with the rest of the data. That is fine for short tables, but in
> my case PDF generation takes about 720 seconds on my laptop. So I checked the
> code of tables.py and found a lot of unnecessary work in the Table __init__
> method.
>
> 1) Each time a new Table instance is created, the data passed to __init__ is
> normalized (converted to plain strings). That makes sense when the user is
> creating a new Table, but during a table split it is unnecessary, since the
> new Tables are created from the old Table's data (which is already
> normalized). To fix this, I added a new argument to the __init__ method
> called normalizedData. It defaults to False, which means that the data is not
> yet normalized. If normalizedData is True (during the split), the data
> conversion is skipped. This simple change gave me a 12% performance boost.
>
> 2) Secondly, during Table creation the __init__ method creates a CellStyle
> object for each cell. This is unnecessary during a split, since the new
> Tables can use the old Table's CellStyles. So I added a cellStyles argument
> to __init__ in order to reuse the old Table's cell styles during a split.
> After this change my PDF generation time dropped from 720s to 100s, roughly
> a 7x speedup :D
>
> Of course the performance gain depends strongly on the number of cells and
> splits. In my case I have 9 cols * 38 rows * 480 pages = 164160 cells. On
> each split all of the remaining data had to be normalized and a CellStyle
> object created for each cell (the number of cells decreases with each split).
> With very small Tables there will be no performance gain at all (low number
> of cells and splits).
>
> I'm attaching my PDF and the changes in diff format. They should be easy to
> apply with the patch command.
>
> Let me know if you like this patch and whether you will include it in the
> next reportlab release :)
>
> Best regards,
> Tomasz Świderski
> contact at tomaszswiderski.com
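
To illustrate the two changes described in the quoted message, here is a
minimal Python sketch. It is a toy model under stated assumptions, not
ReportLab's actual Table code: ToyTable, count_work and the two counters are
hypothetical names, and only the argument names normalizedData and cellStyles
come from the message. It counts how many cells are normalized and how many
style objects are created while a table is split page by page, with and
without reuse.

```python
class ToyTable:
    """Simplified stand-in for a splittable table (hypothetical, not ReportLab)."""

    # Class-level counters so the amount of work can be compared.
    normalize_calls = 0
    style_objects = 0

    def __init__(self, data, normalizedData=False, cellStyles=None):
        if normalizedData:
            # Data comes from an existing table: skip the conversion.
            self.data = data
        else:
            self.data = [[self._normalize(cell) for cell in row] for row in data]
        if cellStyles is None:
            # Fresh table: build one style object per cell.
            self.cellStyles = [[self._new_style() for _ in row] for row in self.data]
        else:
            # Split table: reuse the parent's style objects.
            self.cellStyles = cellStyles

    @staticmethod
    def _normalize(cell):
        ToyTable.normalize_calls += 1
        return cell if isinstance(cell, str) else str(cell)

    @staticmethod
    def _new_style():
        ToyTable.style_objects += 1
        return {"fontsize": 10}  # placeholder for a real style object

    def split(self, rows_per_page, reuse=True):
        """Return (head, tail): one page of rows and the remainder."""
        head_rows = self.data[:rows_per_page]
        tail_rows = self.data[rows_per_page:]
        if reuse:
            head = ToyTable(head_rows, normalizedData=True,
                            cellStyles=self.cellStyles[:rows_per_page])
            tail = ToyTable(tail_rows, normalizedData=True,
                            cellStyles=self.cellStyles[rows_per_page:])
        else:
            # Old behaviour: both halves redo normalization and style creation.
            head = ToyTable(head_rows)
            tail = ToyTable(tail_rows)
        return head, tail


def count_work(pages, rows_per_page, cols, reuse):
    """Split a table page by page; return (normalize_calls, style_objects)."""
    ToyTable.normalize_calls = 0
    ToyTable.style_objects = 0
    data = [[r * cols + c for c in range(cols)]
            for r in range(pages * rows_per_page)]
    table = ToyTable(data)
    while len(table.data) > rows_per_page:
        _head, table = table.split(rows_per_page, reuse=reuse)
    return ToyTable.normalize_calls, ToyTable.style_objects
```

In this model the work with reuse equals the number of cells, while without
reuse every split touches all remaining cells again, so the total grows
roughly with the square of the number of pages. Calling
count_work(pages=480, rows_per_page=38, cols=9, reuse=False) mirrors the table
size from the message, but the counts say nothing about real timings.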


