Manipulating JPEG EXIF Headers

Category Web Development
Posted July 3, 2013
by Jacob Emerick

Reading and writing EXIF headers in JPEGs is not something that PHP offers a lot of help with. Outside of one helper function for reading, exif_read_data, which returns a wide variety of metadata about the image (including basic non-EXIF properties of the file), there is no easy way to read or edit them. To make matters worse, one of the more accessible image libraries (PHP GD) drops all of the original EXIF headers. If you want to keep any information from the raw photo through to a GD-resized image, you're going to need to do some work.

First I wanted to take a look at just how much information some of my raw photos had. The amount was a bit staggering.


Make: Canon
Model: Canon PowerShot G9
Orientation: 1
XResolution: 180/1
YResolution: 180/1
DateTime: 2013:05:18 06:45:22 (note the annoying lack of timezone here)
YCbCrPositioning: 1
Thumbnail: (raw photo data)
... and then like 40 more fields ...
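Since the DateTime field carries no zone, any code that reads it has to pick one. Here is a minimal sketch of doing that, assuming a hypothetical helper name (parse_exif_datetime) and using 'UTC' purely as a placeholder for whatever zone the camera clock was actually set to.

```php
<?php
// Hypothetical helper: EXIF DateTime values look like "2013:05:18 06:45:22"
// and carry no zone, so the caller has to supply one.
function parse_exif_datetime($value, $timezone_name)
{
  $parsed = DateTimeImmutable::createFromFormat(
    'Y:m:d H:i:s',
    $value,
    new DateTimeZone($timezone_name));

  // createFromFormat returns false when the string does not match
  return $parsed === false ? null : $parsed;
}

// 'UTC' is an assumption, swap in the zone the camera was set to
$taken = parse_exif_datetime('2013:05:18 06:45:22', 'UTC');
echo $taken->format('c'), "\n"; // 2013-05-18T06:45:22+00:00
```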

The data I lost by using GD for resizing was depressing. Every single EXIF header was gone, as well as any IPTC or XMP that may have snuck in at some point. Not all of the metadata was important to me - like XResolution and YResolution - but there was enough to get me motivated to recover it. Also, I wanted to add some new fields, like copyright information and such. The first step was chunking up a raw image into separate pieces so I could view the raw headers and work with them.


// chunks up image to ease manipulation of segments
function chunk_image($filename)
{
  $image_array = array();
  
  $image_data = file_get_contents($filename);
  $image_data_length = strlen($image_data);
  
  // skip the two-byte SOI marker, then hop from one segment marker to the next
  $i = 2;
  while($i + 4 <= $image_data_length)
  {
    // every segment opens with 0xFF; anything else means we lost our place
    if(substr($image_data, $i, 1) !== "\xFF")
      break;
    
    // what kind of segment are we dealing with
    $segment_type = ord(substr($image_data, $i + 1, 1));
    
    // how big the segment is, stored after the marker (the size counts its own two bytes)
    $segment_size = substr($image_data, $i + 2, 2);
    $segment_size = unpack('n', $segment_size);
    $segment_size = array_pop($segment_size);
    
    // pull the data, housed after the size, using the length
    $segment_data = substr($image_data, $i + 4, $segment_size - 2);
    
    // store each header
    $image_array[] = array(
      'type' => $segment_type,
      'data' => $segment_data);
    
    // okay, we can advance the pointer past the marker and the segment
    $i += 2 + $segment_size;
    
    // oh, the last segment was the SOS, and the compressed image data is next
    if($segment_type == 0xDA)
    {
      // pull all of the image data to the end marker, searching from here so
      // the end marker of an embedded EXIF thumbnail is not matched first
      $end_of_image = strpos($image_data, "\xFF\xD9", min($i, $image_data_length));
      if($end_of_image === false)
        $end_of_image = $image_data_length;
      $raw_image = substr($image_data, $i, $end_of_image - $i);
      
      // not the best way to pass the image data, but it works
      $image_array[] = array(
        'type' => 'raw_image',
        'data' => $raw_image);
      
      break;
    }
  }
  
  return $image_array;
}

If you give this function an acceptable JPEG filename it will chew through the raw bytes looking for segment markers, then parse those and pull metadata about them. Each segment header defines a type of data, everything from generic comments to structured EXIF to Huffman tables, and includes size information followed by the raw data. There are some assumptions here, but I feel like everything is broken up and readable enough for easy debugging.
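To make the layout of a single segment concrete, here is a small sketch that reads one segment out of a hand-built byte string. The helper name (read_segment) and the sample bytes are made up for illustration; it assumes the offset already points at a marker that carries a size field.

```php
<?php
// Hypothetical helper: reads the segment whose 0xFF marker byte sits at $offset.
function read_segment($bytes, $offset)
{
  if(substr($bytes, $offset, 1) !== "\xFF")
    return null; // not positioned on a marker

  $type = ord(substr($bytes, $offset + 1, 1));

  // the two-byte big-endian size counts itself but not the marker
  $size = unpack('n', substr($bytes, $offset + 2, 2));
  $size = $size[1];

  return array(
    'type' => $type,
    'size' => $size,
    'data' => substr($bytes, $offset + 4, $size - 2),
    'next' => $offset + 2 + $size); // where the following marker should be
}

// SOI, then a COM segment holding "hello", then EOI
$bytes = "\xFF\xD8" . "\xFF\xFE" . pack('n', 7) . 'hello' . "\xFF\xD9";
$segment = read_segment($bytes, 2);

printf("type 0x%02X, size %d, data %s\n",
  $segment['type'], $segment['size'], $segment['data']);
// type 0xFE, size 7, data hello
```

The 'next' value is only there so a caller can hop to the following marker without redoing the arithmetic.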

Also, this is structured as a function for a reason. While most of the code examples on this blog are written as procedural code, and the production code that I work on is exclusively object-oriented, I felt this had to be a function. It's handy for running through the target image to do the initial chunking, as well as running through the original, pre-GD-resized image to pull the original EXIF headers. You could also use it to test the result after a run.
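Pulling the original EXIF out of a chunked image could look something like the sketch below. The helper name (find_exif_segment) is hypothetical, and it assumes the array has the same 'type' and 'data' keys that chunk_image returns, with integer types for real segments.

```php
<?php
// Hypothetical helper: returns the payload of the first APP1 segment that
// carries EXIF data, or null when the chunked image has none.
function find_exif_segment(array $image_array)
{
  foreach($image_array as $row)
  {
    // APP1 is also used for XMP, so check the "Exif\0\0" signature too
    if($row['type'] === 0xE1 && strncmp($row['data'], "Exif\x00\x00", 6) === 0)
      return $row['data'];
  }

  return null;
}

// a hand-built stand-in for what chunk_image would return
$sample = array(
  array('type' => 0xE0, 'data' => "JFIF\x00"),
  array('type' => 0xE1, 'data' => "Exif\x00\x00MM"),
  array('type' => 'raw_image', 'data' => ''));

var_dump(find_exif_segment($sample) === "Exif\x00\x00MM"); // bool(true)
```

With the function from this post that would be called as find_exif_segment(chunk_image($original_filename)), where $original_filename is whatever path the untouched photo lives at.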

Okay, once you have the initial headers it's time to get working on the EXIF. I defined my headers in an array and then just looped through it to create the segment. There are plenty of other headers that you can work with as listed here.


$new_exif_array = array(
  // TIFF expects IFD entries sorted by tag number, lowest first
  array(
    'type' => 270, // image description, could be title or alt
    'data' => 'Image Description'),
  array(
    'type' => 306, // datetime, which could be when it was taken
    'data' => date('Y:m:d H:i:s')),
  array(
    'type' => 315, // artist, your name, probably
    'data' => 'Image author'),
  array(
    'type' => 33432, // generic copyright information
    'data' => 'Image Copyright'));

// let's build the exif segment based on our desired headers
$exif = '';
$exif .= 'Exif';
$exif .= "\x00\x00";
$exif .= 'MM'; // motorola (big-endian) byte order
$exif .= pack('n', 42); // tiff id
$exif .= pack('N', 8); // an initial offset to the first ifd
$exif .= pack('n', count($new_exif_array));

$segment_length = 2 + count($new_exif_array) * 12 + 4; // how long the ifd will be
$segment_head = '';
$segment_body = '';

foreach($new_exif_array as $row)
{
  $segment_head .= pack('n', $row['type']);
  $segment_head .= pack('n', 2); // ascii data type
  
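  // ascii values end with a null byte, and the count includes it
  $data = $row['data'] . "\x00";
  
  $segment_head .= pack('N', strlen($data));
  
  // if the data is too long we need to append it at the end, not within the head
  if(strlen($data) > 4)
  {
    $offset = 8 + $segment_length + strlen($segment_body);
    $segment_head .= pack('N', $offset);
    $segment_body .= $data;
    
    // tiff expects each value to start on an even offset
    if(strlen($segment_body) % 2)
      $segment_body .= "\x00";
  }
  else
    $segment_head .= str_pad($data, 4, "\x00"); // short values sit in the head, padded to four bytes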
}

$exif .= $segment_head;
$exif .= pack('N', 0);
$exif .= $segment_body;

Here the assumptions start taking over. I use a Canon PowerShot G9 for most of my photos, which saves JPEGs using the Motorola byte order (big-endian, flagged by 'MM' in the TIFF header). If you are using a device that saves with the Intel byte order (little-endian, flagged by 'II') and want to match it, you'll have to change the 'pack' formats: 'v' instead of 'n' for 16-bit values and 'V' instead of 'N' for 32-bit values. Also, all of the headers I want are ASCII data types. If you are working with other headers, like Orientation (a SHORT) or XResolution (a RATIONAL), you'll have to change the data type and how the value is packed.
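As a rough illustration of both points, the sketch below packs a single non-ASCII entry (Orientation, tag 274, stored as one SHORT) in either byte order. The helper name (pack_short_entry) is made up, and it only covers the one-value SHORT case.

```php
<?php
// Hypothetical helper: packs one 12-byte IFD entry holding a single SHORT
// (TIFF data type 3), such as Orientation (tag 274).
function pack_short_entry($tag, $value, $little_endian = false)
{
  $short = $little_endian ? 'v' : 'n'; // 16-bit: intel vs motorola
  $long = $little_endian ? 'V' : 'N';  // 32-bit: intel vs motorola

  return pack($short, $tag)
    . pack($short, 3)       // data type: SHORT
    . pack($long, 1)        // one value
    . pack($short, $value)  // the value sits at the start of the 4-byte field
    . "\x00\x00";           // padding to fill the rest of the field
}

echo bin2hex(pack_short_entry(274, 1)), "\n";       // 011200030000000100010000
echo bin2hex(pack_short_entry(274, 1, true)), "\n"; // 120103000100000001000000
```

Whichever order you pick, the two-letter flag at the start of the TIFF header ('MM' or 'II') has to match the pack formats used for everything after it.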

Well, we can pull segments now and create the EXIF segment… It's time to manipulate the image! In my case I wanted to remove two segments, the COM and APP0/JFIF (both are inserted by GD, I think), and add a new one. If you want to remove other segments and/or add new ones, be careful not to touch any of the pieces needed to decode the image, like the quantization tables (DQT) or Huffman tables (DHT). Bad things happen if you touch those.


// okay, let's pull the current image data
$image_array = chunk_image('YOUR FILENAME HERE');

foreach($image_array as $key => $row)
{
  // we want to remove some header information, COM and APP0
  if($row['type'] == 0xE0 || $row['type'] == 0xFE)
    unset($image_array[$key]);
}

// prepend the APP1 (exif) onto the image info
array_unshift($image_array, array(
  'type' => 0xE1,
  'data' => $exif));

// okay, now it's time to put it all together
$new_image = "\xFF" . "\xD8";
foreach($image_array as $row)
{
  if($row['type'] == 'raw_image')
  {
    $compressed_image_data = $row['data'];
    continue;
  }
  
  $new_image .= sprintf("\xFF%c", $row['type']);
  $new_image .= pack('n', strlen($row['data']) + 2);
  $new_image .= $row['data'];
}

$new_image .= $compressed_image_data;
$new_image .= "\xFF" . "\xD9";

file_put_contents('YOUR FILENAME HERE', $new_image);

And that does it! If you need any help debugging, I'd recommend steering clear of exif_read_data(), as the data it returns is not 'pure' EXIF headers. Use other tools instead: there are Chrome extensions, a few (massive) PHP libraries, and probably countless others a simple Google search away.
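For a quick sanity check before reaching for a bigger tool, a tiny reader for the segment built earlier can help. This is only a sketch with a made-up name (dump_exif_ascii): it assumes the payload starts with "Exif\0\0", looks at the first IFD only, and skips anything that is not an ASCII entry.

```php
<?php
// Hypothetical helper: lists the ASCII entries of the first IFD in an APP1
// payload laid out as "Exif\0\0" + TIFF header + IFD + values.
function dump_exif_ascii($exif)
{
  $tiff = substr($exif, 6); // offsets in the IFD are relative to the TIFF header
  $little_endian = (substr($tiff, 0, 2) === 'II');
  $short = $little_endian ? 'v' : 'n';
  $long = $little_endian ? 'V' : 'N';

  $ifd_offset = unpack($long, substr($tiff, 4, 4))[1];
  $entry_count = unpack($short, substr($tiff, $ifd_offset, 2))[1];

  $entries = array();
  for($n = 0; $n < $entry_count; $n++)
  {
    $entry = substr($tiff, $ifd_offset + 2 + $n * 12, 12);
    $tag = unpack($short, substr($entry, 0, 2))[1];
    $type = unpack($short, substr($entry, 2, 2))[1];
    $count = unpack($long, substr($entry, 4, 4))[1];

    if($type != 2)
      continue; // only ascii entries are handled in this sketch

    if($count <= 4)
      $value = substr($entry, 8, $count); // short values live inside the entry
    else
    {
      $value_offset = unpack($long, substr($entry, 8, 4))[1];
      $value = substr($tiff, $value_offset, $count);
    }

    $entries[$tag] = rtrim($value, "\x00");
  }

  return $entries;
}

// a hand-built two-entry segment: one inline value, one stored by offset
$tiff = 'MM' . pack('n', 42) . pack('N', 8) . pack('n', 2)
  . pack('n', 270) . pack('n', 2) . pack('N', 3) . "Hi\x00\x00"
  . pack('n', 315) . pack('n', 2) . pack('N', 6) . pack('N', 38)
  . pack('N', 0) . "Jacob\x00";

print_r(dump_exif_ascii("Exif\x00\x00" . $tiff));
```

Running it against the $exif string from the build step should hand back the same tag numbers and strings that went into $new_exif_array.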

So, now that this is all done, there are plenty of other pieces just a few short hops away. I'd like to have a thumbnail embedded and some additional EXIF fields, as well as maybe some XMP and IPTC segments. None of these items are too difficult to add at this point; I just have to reuse some of the code above and maybe abstract a few pieces out a bit. Also, I'd like to get it to work with my image resizer, though I need to create a better workflow for that.

Comments (6)

Deniz Porsuk Sep 18, '14 Dear Jacob, while searching internet I reach to your post (thank you very much for that). It is working quite well.. But there is a small problem. When I add $new_exif_array = array( array( 'type' => 270, // image description, could be title or alt 'data' => 'Deniz Desc1'), in windows it shows dublicated. http://imgur.com/SI59l1A Do you have any solution for that. WR, Deniz
- Jacob Emerick Sep 29, '14 Hi Deniz - my best guess is that Windows is reading the same field for both entries here. To verify you could use a different program for reading exif, something more robust, like ImageMagick.
Arian Jul 19, '15 Hello Jacob, i have found this tutorial today and it is really good and helped me a lot, but i want to add the comment section that is shown in windows-explorer. I searched the web for hours and still cant add it, maybe you could help me? I think the tag for the comment section is 40092 (xpComment), but windows dont shows anything and on http://regex.info/exif.cgi xpComment exists, but it just shows this: 桷牥⁥牡⁥潹u. When i use windows to set the comment, the website schows the right text. Thanks, Arian
- Jacob Emerick Jul 23, '15 Hi Arian, ah, I honestly don't know. The best advice I can give you is to try inserting some unique text and then parsing the raw output to try to figure it out. I haven't tried using the exif-writing feature in Windows in a very long time.
Chris Mar 22, '20 Thank you so much Jacob! I nearly searched endless for a solution to add Exif to raw JPEG images and with your solution it finally worked. Thanks again!
Josip Sep 29, '21 Hi Jacob! Your script is working perfectly, thank you very much for code and explanation! But I can not find the way how to insert GPS data to exif. Any idea?