[Sanselan] - Help Parsing EXIF Info embedded in PNG
Stephen Nesbitt
2010-03-03 18:38:44 UTC
All:

I am trying to figure out how to parse EXIF info embedded in a PNG.

I have a PNG for which the Sanselan.getMetadata call returns a single entry.
Executing getKeyword on this one entry returns "Raw profile type exif" while
getText() returns what appears to be a string of hex numbers. I also have
reason to believe that this metadata is stored in a zTXt chunk.

I have not been able to decode the text block - it doesn't seem to follow the
EXIF standards that I have found and searching for a known hex string (in this
case looking for the word Canon in hex) doesn't provide any matches.

So here are a few questions I hope someone can answer.
* In the case of a zTXt chunk, does the getText() call return the raw compressed
contents, or are the contents decompressed? In other words, do I need to
uncompress the results of the getText() call, and if so, how?

* Is there a way that I can have Sanselan 1) parse this chunk and 2) output
all EXIF fields without having to perform a find on each potential EXIF field?

If Sanselan can't do this, is there another Java library out there that can?

Thanks in advance for the help

-steve
Christopher Schultz
2010-03-03 19:25:12 UTC
Stephen,
Post by Stephen Nesbitt
I am trying to figure out how to parse EXIF info embedded in a PNG.
I'm no PNG/EXIF expert, but Wikipedia's EXIF page says that PNG doesn't
support EXIF:

"
The specification uses the existing JPEG, TIFF Rev. 6.0, and RIFF WAV
file formats, with the addition of specific metadata tags. It is not
supported in JPEG 2000, PNG, or GIF.
"
(http://en.wikipedia.org/wiki/Exif)

Maybe that's the problem?

-chris
Stephen Nesbitt
2010-03-03 20:22:00 UTC
Hi Chris:

I honestly don't think so. First of all, if I use exiftool on one of my PNGs I
do get a dump of EXIF info. And this corresponds to the documentation for
exiftool (http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/PNG.html)
which indicates that embedding EXIF is indeed not a PNG standard *but*
embedding arbitrary tag names is allowable and that one of the arbitrary but
standard tag names is "Raw profile type exif" and that it contains "raw" EXIF
info.

So I think the key questions are how I parse the data and, in particular, whether
the data returned by the zTXt getText() method is compressed or uncompressed.

-steve
Charles Matthew Chen
2010-03-04 02:35:38 UTC
First, calling getText() on Sanselan's zTXt chunk returns the
uncompressed contents:

https://svn.apache.org/repos/asf/commons/proper/sanselan/trunk/src/main/java/org/apache/sanselan/formats/png/chunks/PNGChunkzTXt.java
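
For readers who want to see what "uncompressed contents" means at the byte level, here is a minimal sketch of how a zTXt chunk payload is laid out per the PNG specification: a Latin-1 keyword, a NUL separator, a one-byte compression method (0 means zlib), then the zlib stream. It uses only java.util.zip and is not Sanselan's own code; the class name ZtxtSketch and both method names are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper class; not part of Sanselan.
public class ZtxtSketch {

    /** Splits a zTXt chunk payload into { keyword, inflated text }. */
    static String[] decodeZtxt(byte[] payload) {
        // The keyword runs up to the first NUL byte.
        int nul = 0;
        while (nul < payload.length && payload[nul] != 0) {
            nul++;
        }
        if (nul + 2 > payload.length) {
            throw new IllegalArgumentException("truncated zTXt payload");
        }
        String keyword = new String(payload, 0, nul, StandardCharsets.ISO_8859_1);
        // The byte after the NUL is the compression method; only 0 (zlib) is defined.
        if (payload[nul + 1] != 0) {
            throw new IllegalArgumentException("unknown compression method");
        }
        Inflater inflater = new Inflater();
        inflater.setInput(payload, nul + 2, payload.length - nul - 2);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated zlib stream");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt zlib stream", e);
        } finally {
            inflater.end();
        }
        String text = new String(out.toByteArray(), StandardCharsets.ISO_8859_1);
        return new String[] { keyword, text };
    }

    /** Builds a zTXt payload; included only so the sketch can be round-tripped. */
    static byte[] encodeZtxt(String keyword, String text) {
        Deflater deflater = new Deflater();
        deflater.setInput(text.getBytes(StandardCharsets.ISO_8859_1));
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] key = keyword.getBytes(StandardCharsets.ISO_8859_1);
        out.write(key, 0, key.length);
        out.write(0); // NUL separator after the keyword
        out.write(0); // compression method 0 = zlib
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }
}
```

If getText() already returns the inflated text, as stated above, none of this is needed when going through Sanselan; the sketch only shows what that step does.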

Next, I'm not sure exactly what is meant by "raw" Exif data, but
Exif is a binary format. It'd be strange to store it in a zTXt chunk.

Phil's exiftool documentation is in many ways the de facto standard
for image metadata practices, especially for these non-standardized
dark corners. In the absence of a standard, I'm not sure what you're
looking for or asking to be implemented.

What is the source of the images with this data? What software is
writing these PNGs?

Matthew


Stephen Nesbitt
2010-03-04 02:49:16 UTC
Post by Charles Matthew Chen
First, calling getText() on Sanselan's zTXt chunk returns the
Good - that is very nice to know. Am I correct in assuming that the text
returned is a set of hex values?
Post by Charles Matthew Chen
Next, I'm not sure exactly what is meant by "raw" Exif data, but
Exif is a binary format. It'd be strange to store it in a zTXt chunk.
Strange or not - that's what I think is happening.

The application that created the PNG is digikam, and the file was the result of
digikam's JPEG-to-PNG conversion tool. What I suspect is happening is that
digikam is taking the binary EXIF data and storing it in a PNG zTXt chunk
with a keyword of "Raw profile type exif".
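
A hedged sketch of how such a text block might be turned back into bytes. It assumes the layout commonly written by ImageMagick-style tools for "Raw profile type" entries: a profile name line, a decimal byte-count line, then hex digits wrapped across several lines. Whether digikam writes exactly this layout is an assumption, and RawProfileSketch and its methods are hypothetical names, not Sanselan API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class; the text layout it expects is an assumption.
public class RawProfileSketch {

    /** Decodes "name / byte count / hex lines" text into the raw profile bytes. */
    static byte[] decodeRawProfile(String text) {
        // Split on newlines, swallowing padding around them.
        String[] lines = text.trim().split("\\s*\\n\\s*");
        if (lines.length < 3) {
            throw new IllegalArgumentException("expected name, length and hex lines");
        }
        // lines[0] is the profile name (e.g. "exif"); lines[1] is the byte count.
        int declared = Integer.parseInt(lines[1].trim());
        StringBuilder hex = new StringBuilder();
        for (int i = 2; i < lines.length; i++) {
            hex.append(lines[i].trim());
        }
        if (hex.length() != declared * 2) {
            throw new IllegalArgumentException(
                "declared " + declared + " bytes, found " + hex.length() / 2);
        }
        byte[] out = new byte[declared];
        for (int i = 0; i < declared; i++) {
            // Integer.parseInt accepts both upper- and lower-case hex digits.
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    /** Drops a leading "Exif\0\0" marker, if present, leaving the TIFF stream. */
    static byte[] stripExifPrefix(byte[] data) {
        byte[] marker = { 'E', 'x', 'i', 'f', 0, 0 };
        if (data.length >= marker.length
                && Arrays.equals(Arrays.copyOf(data, marker.length), marker)) {
            return Arrays.copyOfRange(data, marker.length, data.length);
        }
        return data;
    }

    /** True if the ASCII string occurs anywhere in the decoded bytes. */
    static boolean containsAscii(byte[] data, String needle) {
        return new String(data, StandardCharsets.ISO_8859_1).contains(needle);
    }
}
```

If the layout assumption holds, it may also explain why a search for "Canon" in hex found nothing: the hex digits are wrapped, so a match can straddle a line break, and the digits may be in a different case than the search string. Searching the decoded bytes, for example with containsAscii(bytes, "Canon"), avoids both problems. The bytes left after stripExifPrefix should begin with a TIFF header ("II" or "MM"), which is the form a TIFF/EXIF parser expects.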
Post by Charles Matthew Chen
Phil's exiftool documentation is in many ways the de facto standard
for image metadata practices, especially for these non-standardized
dark corners. In the absence of a standard, I'm not sure what you're
looking for or asking to be implemented.
Essentially I need a Java version of exiftool's capability to extract EXIF
info from a PNG. I am not sure that such a capability really falls within
Sanselan's responsibility.

Knowing that Sanselan is inflating (decompressing) the zTXt chunk as part of the
getText() method is very useful and helps illuminate the dark corners. It might
be nice to add a note to the Javadoc to that effect.

Thanks for the info!

-steve
