[reportlab-team] [reportlab-users] reportlab and CMYK images, take 2

Oliver Bleutgen reportlab-users@reportlab.com
Sat, 04 Oct 2003 17:10:17 +0200


Thanks for taking the time to answer.

Andy Robinson wrote:

>>Is it desirable and would it make sense to beef up reportlab's image 
>>capabilities? Are there any thoughts into what direction one should go 
>>with this?
> 
> 
> None of us really know anything about color spaces and images
> but I would welcome a contribution here. When I first tried
> to get images working again, I just figured out how to encode
> the RGB pixels in a legal manner in the PDF spec, and never
> bothered with other formats.
> 
> It would be great to support CMYK in images and possibly useful for 
> us too.  Please feel free to send in patches.
> 
> I think there must be some degree of normalisation or inspection
> of images.  A GIF uses a palette and it would be complex to
> encode the same palette in PDF, so a one-liner to normalise them
> as RGB seemed sensible.  So at the API level I suggest you add
> some keyword parameters to indicate desired or given color space
> to drawImage/drawInlineImage.  i.e. if you tell it you have
> a CMYK image and you want a CMYK image, there is no conversion;
> if you tell it you want CMYK and you give it something else,
> it either converts or complains noisily (I don't mind which).
> We should also check in a sample CMYK image or two into
> the distribution.

Ok, I've done some research and testing, and I think I'd like to do the
following:

- rely on PIL as much as possible. That means, for instance, not using
pdfutils.readJPEGInfo().
When testing the current code, I found that some CMYK jpgs
come out inverted in the pdf. Funnily enough, this has been a well-known
Acrobat quirk for a long time and is even documented by the JPEG group.
The PIL guys have a workaround for it, but readJPEGInfo() isn't capable
of extracting the needed information.

- use PIL to analyze a given image to find out what format it has (i.e.
jpg, tif, gif, etc.), and code specialized "converter" methods, where 
possible, which return, uhm, something.
At first I think I'll settle for the tuple (imagedata, imgwidth, 
imgheight), like jpg_imagedata() and PIL_imagedata() are returning now.
But I feel that should be more generalized, i.e. the pdf object header
should be built later in the process.
In that context, what is PDFImage.format() about?

- In addition to the specialized "converter" methods, offer a generic 
converter method, which uses PIL to convert the image to "raw" format, 
but preserves the separation (i.e. never converts from CMYK->RGB or
RGB->CMYK). This conflicts with what you wrote above, but I don't think 
it's wise to offer that conversion.
RGB->CMYK is dependent on the output medium, and Acrobat Reader is
capable of displaying CMYK images, so I don't see the need for conversion.

- Mid term goal: Unify image XObjects and inline images.
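
To illustrate the first point above: the information readJPEGInfo() cannot
extract is presumably the Adobe APP14 marker, which is what usually flags
JPEGs whose CMYK values are stored inverted. The following is only a sketch
using the standard library. scan_jpeg() and pdf_decode_array() are
hypothetical names, not reportlab or PIL functions, and treating every
Adobe-marked 4-component JPEG as inverted is an assumption.

```python
import struct

# Start-of-frame markers: baseline, extended sequential, progressive.
SOF_MARKERS = (0xC0, 0xC1, 0xC2)


def scan_jpeg(data):
    """Return (width, height, n_components, has_adobe_marker) for JPEG bytes."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    pos = 2
    has_adobe = False
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before the real marker
            pos += 1
            continue
        # The segment length counts its own two bytes but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xEE and payload[:5] == b"Adobe":
            has_adobe = True  # APP14 segment written by Adobe software
        elif marker in SOF_MARKERS:
            _precision, height, width, ncomp = struct.unpack(">BHHB", payload[:6])
            return width, height, ncomp, has_adobe
        pos += 2 + length
    raise ValueError("no SOF marker found")


def pdf_decode_array(ncomp, has_adobe):
    """Suggest a /Decode array for the PDF image dictionary, or None.

    Assumption: 4-component JPEGs carrying the Adobe marker are inverted,
    so each channel is mapped from [0 1] to [1 0].
    """
    if ncomp == 4 and has_adobe:
        return [1, 0] * 4
    return None
```

The JPEG bytes themselves would stay untouched; only the image dictionary
would get the extra /Decode entry when the function returns a list.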

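The generic converter from the third point could look roughly like this.
It is a sketch under assumptions: plan_conversion() and raw_image_data()
are made-up names, img is anything with PIL-style mode, size, convert()
and tobytes() (older PIL releases spell the last one tostring()), and the
set of supported modes is deliberately small.

```python
# Modes that can be written out as-is, with the matching PDF colour space.
_COLORSPACES = {
    "L": "DeviceGray",
    "RGB": "DeviceRGB",
    "CMYK": "DeviceCMYK",
}

# Modes that get normalised first. None of these crosses RGB <-> CMYK.
_NORMALISE = {
    "1": "L",    # bilevel -> greyscale
    "P": "RGB",  # palette -> RGB, as the current code already does for GIFs
}


def plan_conversion(mode):
    """Return (target_mode, pdf_colorspace) for a PIL mode string."""
    target = _NORMALISE.get(mode, mode)
    if target not in _COLORSPACES:
        raise ValueError("unsupported image mode: %r" % (mode,))
    return target, _COLORSPACES[target]


def raw_image_data(img):
    """Return (imagedata, width, height, colorspace) without RGB<->CMYK conversion."""
    target, colorspace = plan_conversion(img.mode)
    if img.mode != target:
        img = img.convert(target)
    width, height = img.size
    return img.tobytes(), width, height, colorspace
```

A CMYK image passes through unchanged and is labelled DeviceCMYK; an RGB
image is never turned into CMYK, and anything unknown raises instead of
being converted silently.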

cheers,
oliver