Parsing Google Calendar - Recurring Events

Dealing with normal events in a Google Calendar feed is not too tough. Looping through each event and pulling the basic chunks of information (title, description, times) is easy once you hook up the correct namespacing to the nodes. Once you get into recurring events, though, things get more difficult. Even Google seems to take the easy way out here and just throw all the repeating rules in iCal format, a solid blob of text in one of the XML elements. They leave it up to the developer to plod through the iCal, pull out the rules, and explode out each instance of the event.

So, first off, you need to pull out the data. Working off the groundwork in my earlier post, the loop through each entry needs some modifications. Recurring events have no gd:when node carrying the startTime and endTime attributes, so we also need to account for that missing piece of data.

    $events = array();
    foreach($feed_xml->entry as $event)
    {
        // namespace URIs are found in the attributes of the feed node
        $gd_nodes = $event->children('http://schemas.google.com/g/2005');
        $gcal_nodes = $event->children('http://schemas.google.com/gCal/2005');

        $id = (string) $gcal_nodes->uid->attributes()->value;
        $title = (string) $event->title;
        $description = (string) $event->content;
        $location = (string) $gd_nodes->where->attributes()->valueString;

        /* modified code */
        if(isset($gd_nodes->recurrence))
        {
            // recurring events carry their times as iCal text, one rule per line
            $ical = (string) $gd_nodes->recurrence;
            $ical = array_map('trim', explode("\n", trim($ical)));
            $ical = array_slice($ical, 0, 3); // DTSTART, DTEND, RRULE

            // the date value is whatever follows the last colon
            $start_time = explode(':', $ical[0]);
            $start_time = array_pop($start_time);
            $end_time = explode(':', $ical[1]);
            $end_time = array_pop($end_time);
            $repeating_rule = $ical[2];
        }
        else
        {
            $start_time = (string) $gd_nodes->when->attributes()->startTime;
            $end_time = (string) $gd_nodes->when->attributes()->endTime;
        }
        /* end modified piece */
    }

There are some annoying pieces here. iCal can be kind of messy, delimited with line breaks, colons, semicolons, and equal signs in different places, though every Google Calendar example has seemed fairly standard. You will need the first three rows: one for start_time, one for end_time, and one for the repeating rule. Also, since iCal defines timezones differently than the nodes in the feed, timezones will not match up. You may need to grab the timezone from the first two rows to make sure all your dates are lined up. Below is an example of what we're dealing with in the gd:recurrence node.

    DTSTART;TZID=America/Chicago:20130610T080000
    DTEND;TZID=America/Chicago:20130610T090000
    RRULE:FREQ=DAILY;COUNT=5
    BEGIN:VTIMEZONE
    TZID:America/Chicago
    X-LIC-LOCATION:America/Chicago
    BEGIN:DAYLIGHT
    TZOFFSETFROM:-0600
    TZOFFSETTO:-0500
    TZNAME:CDT
    DTSTART:19700308T020000
    RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
    END:DAYLIGHT
    BEGIN:STANDARD
    TZOFFSETFROM:-0500
    TZOFFSETTO:-0600
    TZNAME:CST
    DTSTART:19701101T020000
    RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
    END:STANDARD
    END:VTIMEZONE

Again, the only important lines are the first three. The rest describe the timezone and its daylight saving time rules, which hopefully you won't need to worry about.
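If the timezones do turn out to matter, the TZID parameter on the first two lines can be handed to PHP's DateTime classes. Here is a minimal sketch of that idea. The helper name ical_line_to_timestamp() and its $default_tz parameter are my own inventions for illustration, and it assumes the date-time form shown in the example (all-day events with a bare date will return false).

```php
<?php
// Hypothetical helper (not part of the original script): turn one
// DTSTART/DTEND line into a Unix timestamp, honoring TZID when present.
function ical_line_to_timestamp($line, $default_tz = 'UTC')
{
    // Split "DTSTART;TZID=America/Chicago" from "20130610T080000"
    list($head, $value) = explode(':', trim($line), 2);

    // Look for a TZID parameter among the semicolon-delimited pieces
    $tz_name = $default_tz;
    foreach (explode(';', $head) as $param) {
        if (strpos($param, 'TZID=') === 0) {
            $tz_name = substr($param, 5);
        }
    }

    // A trailing "Z" means the value is already expressed in UTC
    if (substr($value, -1) === 'Z') {
        $tz_name = 'UTC';
        $value = substr($value, 0, -1);
    }

    $date = DateTime::createFromFormat('Ymd\THis', $value, new DateTimeZone($tz_name));
    return $date === false ? false : $date->getTimestamp();
}

$start = ical_line_to_timestamp('DTSTART;TZID=America/Chicago:20130610T080000');
echo gmdate('Y-m-d H:i', $start), "\n"; // 2013-06-10 13:00 (08:00 CDT is 13:00 UTC)
```

The main scripts in this post stick with strtotime(), which interprets the value in PHP's default timezone. That is fine as long as the default timezone matches the calendar's.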

So this is great, but what about getting a separate instance for each occurrence of a repeating event? All the above script does is grab the start and end time of the first occurrence. Google Calendar has a lot of repeating options, each with its own quirks, and we need to handle each one.

Google Calendar Repeating Options

Daily
  - every n days
  - start date
  - ends never, after n occurrences, or on a certain date
  - example: RRULE:FREQ=DAILY;COUNT=5

Weekday
  - start date
  - ends never, after n occurrences, or on a certain date
  - example: RRULE:FREQ=WEEKLY;UNTIL=20130715T093000Z;BYDAY=MO,TU,WE,TH,FR

Monday, Wednesday, and Friday
  - start date
  - ends never, after n occurrences, or on a certain date
  - example: RRULE:FREQ=WEEKLY;BYDAY=MO,WE,FR

Tuesday, Thursday
  - start date
  - ends never, after n occurrences, or on a certain date
  - example: RRULE:FREQ=WEEKLY;COUNT=25;BYDAY=TU,TH

Weekly
  - every n weeks
  - multiple days per week
  - start date
  - ends never, after n occurrences, or on a certain date
  - example: RRULE:FREQ=WEEKLY;UNTIL=20130716T093000Z;BYDAY=SU,MO,TU

Monthly
  - every n months
  - day of month vs. day of week
  - start date
  - ends never, after n occurrences, or on a certain date
  - example: RRULE:FREQ=MONTHLY;INTERVAL=3;BYMONTHDAY=11

Yearly
  - every n years
  - start date
  - ends never, after n occurrences, or on a certain date
  - example: RRULE:FREQ=YEARLY;UNTIL=20180611T093000Z

Even though the number of options available is a bit overwhelming, you can start to see some patterns emerge. The same key names are used for similar fields across the different options, and some of the weekly repeating options use exactly the same format.

So, to deal with these, we first need to parse out the fields. Once the different variables are in a format that is easier to work with, there are several ways to 'explode' each event, especially since the options are so similar. Instead of being all tricksy and abstract, we could just handle each option separately. Let's do that.

    $events = array();
    foreach($feed_xml->entry as $event)
    {
        // namespace URIs are found in the attributes of the feed node
        $gd_nodes = $event->children('http://schemas.google.com/g/2005');
        $gcal_nodes = $event->children('http://schemas.google.com/gCal/2005');

        $id = (string) $gcal_nodes->uid->attributes()->value;
        $title = (string) $event->title;
        $description = (string) $event->content;
        $location = (string) $gd_nodes->where->attributes()->valueString;

        $repeat_array = array();
        if(isset($gd_nodes->recurrence))
        {
            $ical = (string) $gd_nodes->recurrence;
            $ical = array_map('trim', explode("\n", trim($ical)));
            $ical = array_slice($ical, 0, 3); // DTSTART, DTEND, RRULE

            $start_time = explode(':', $ical[0]);
            $start_time = array_pop($start_time);
            $end_time = explode(':', $ical[1]);
            $end_time = array_pop($end_time);

            // turn "RRULE:FREQ=DAILY;COUNT=5" into array('FREQ' => 'DAILY', 'COUNT' => '5')
            $repeating_rule = explode(':', $ical[2]);
            $repeating_rule = array_pop($repeating_rule);
            $repeating_rule = explode(';', $repeating_rule);
            foreach($repeating_rule as $row)
            {
                list($key, $value) = explode('=', $row);
                if($key == 'BYDAY')
                    $value = explode(',', $value); // BYDAY can list several days
                $repeat_array[$key] = $value;
            }
        }
        else
        {
            $start_time = (string) $gd_nodes->when->attributes()->startTime;
            $end_time = (string) $gd_nodes->when->attributes()->endTime;
        }

        /* start modified code */
        // the first (or only) occurrence
        $events[] = array(
            'id' => $id,
            'title' => $title,
            'description' => $description,
            'location' => $location,
            'start_time' => strtotime($start_time),
            'end_time' => strtotime($end_time));

        if(!empty($repeat_array))
        {
            if(isset($repeat_array['UNTIL']))
                $limit = array('UNTIL' => $repeat_array['UNTIL']);
            else if(isset($repeat_array['COUNT']))
                $limit = array('COUNT' => $repeat_array['COUNT']);
            else
                $limit = array('COUNT' => $repeating_event_limit); // to avoid infinite arrays

            $timestamp = strtotime($start_time);
            $elapsed_time = strtotime($end_time) - $timestamp;
            $count = 1;
            while(TRUE) // with multiple ending conditions I found this the best way to loop
            {
                // move $timestamp forward to the next occurrence
                switch($repeat_array['FREQ'])
                {
                    case 'DAILY' :
                        $interval = isset($repeat_array['INTERVAL']) ? $repeat_array['INTERVAL'] : 1;
                        $timestamp += 24 * 60 * 60 * $interval;
                        break;
                    case 'WEEKLY' :
                        // assumes BYDAY is present; INTERVAL is not applied to weekly rules
                        unset($next_day);
                        $day = date('w', $timestamp);
                        // look for the next listed weekday later in the same week
                        foreach($repeat_array['BYDAY'] as $repeat_day)
                        {
                            $repeat_day_index = array_search($repeat_day, $weekday_short_array);
                            if($repeat_day_index > $day)
                            {
                                $next_day = $repeat_day_index;
                                break;
                            }
                        }
                        if(isset($next_day))
                            $timestamp += 24 * 60 * 60 * ($next_day - $day);
                        else
                        {
                            // nothing left this week, so wrap to the first listed day of next week
                            $next_day = array_search($repeat_array['BYDAY'][0], $weekday_short_array);
                            $timestamp += 24 * 60 * 60 * ($next_day + 7 - $day);
                        }
                        break;
                    case 'MONTHLY' :
                        $interval = isset($repeat_array['INTERVAL']) ? $repeat_array['INTERVAL'] : 1;
                        if(isset($repeat_array['BYMONTHDAY']))
                            $timestamp = strtotime(date('c', $timestamp) . " +{$interval} month");
                        else
                        {
                            // BYDAY looks like "2MO": week number, then weekday
                            $by_day = $repeat_array['BYDAY'][0];
                            $by_day_week_number = substr($by_day, 0, 1);
                            $ordinal = $ordinal_array[$by_day_week_number];
                            $by_day_weekday = substr($by_day, 1);
                            $day_index = array_search($by_day_weekday, $weekday_short_array);
                            $dayname = $weekday_medium_array[$day_index];
                            $timestamp = strtotime(date('c', $timestamp) . " +{$interval} month");
                            $month = date('F', $timestamp);
                            $year = date('Y', $timestamp);
                            $time_of_day = date('H:i:s', $timestamp);
                            // e.g. "second Mon of July 2013"; this resolves to midnight,
                            // so the time of day is added back afterwards
                            $relative_timestamp = "{$ordinal} {$dayname} of {$month} {$year}";
                            $day_timestamp = strtotime($relative_timestamp);
                            $timestamp = strtotime(date('Y-m-d', $day_timestamp) . ' ' . $time_of_day);
                        }
                        break;
                    case 'YEARLY' :
                        $interval = isset($repeat_array['INTERVAL']) ? $repeat_array['INTERVAL'] : 1;
                        $timestamp = strtotime(date('c', $timestamp) . " +{$interval} year");
                        break;
                }

                // stop once we pass the end date or run out of occurrences
                $count++;
                if(
                    (isset($limit['UNTIL']) && $timestamp > strtotime($limit['UNTIL'])) ||
                    (isset($limit['COUNT']) && $count > $limit['COUNT']))
                {
                    break;
                }

                // same format as the first occurrence: Unix timestamps
                $events[] = array(
                    'id' => $id,
                    'title' => $title,
                    'description' => $description,
                    'location' => $location,
                    'start_time' => $timestamp,
                    'end_time' => $timestamp + $elapsed_time);
            }
        }
        /* end modified code */
    }

While it isn't too tricky, there is a lot of date manipulation that could go wrong in certain cases, like leap years and such. One thing that I had to remind myself several times is that this code is not calculating the 'correct' recurring dates but is trying to replicate Google Calendar's processing as closely as possible. After all, if you make a change to the source, you expect it to flow exactly to any dependent sites, regardless of how either one is reading the data.
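To make two of those edge cases concrete, here is a small sketch (not part of my project code) that steps a date across the 2013 end of daylight saving time in America/Chicago, and then adds a month to January 31. The dates and variable names are just illustrative choices.

```php
<?php
// Sketch: two date-stepping pitfalls, using PHP's DateTime classes.
$tz = new DateTimeZone('America/Chicago');
$start = new DateTime('2013-11-02 08:00:00', $tz);

// Fixed-seconds arithmetic, like the DAILY case in the loop above
$by_seconds = new DateTime('@' . ($start->getTimestamp() + 24 * 60 * 60));
$by_seconds->setTimezone($tz);

// Calendar arithmetic keeps the wall-clock time instead
$by_calendar = clone $start;
$by_calendar->modify('+1 day');

echo $by_seconds->format('Y-m-d H:i'), "\n";  // 2013-11-03 07:00
echo $by_calendar->format('Y-m-d H:i'), "\n"; // 2013-11-03 08:00

// "+1 month" from the 31st overflows past February entirely
$jan31 = new DateTime('2013-01-31 08:00:00', $tz);
$jan31->modify('+1 month');
echo $jan31->format('Y-m-d'), "\n";           // 2013-03-03
```

Whether either behavior matters depends on the events in your calendar. For what it's worth, the iCalendar spec (RFC 5545) says instances that land on an invalid date are skipped rather than rolled forward, so a monthly event on the 31st is a good test case to compare against what Google Calendar actually shows.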

I did have to add some configuration information for some of the logic near the top of my script…

    $repeating_event_limit = 300;
    $weekday_short_array = array('SU', 'MO', 'TU', 'WE', 'TH', 'FR', 'SA');
    $weekday_medium_array = array('Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat');
    $ordinal_array = array('zero', 'first', 'second', 'third', 'fourth', 'fifth');

And that pretty much does it! Again, it is important to test different cases. This script worked just fine for the scope of my project and the different tests we threw at it, but it may need some tweaks to work for yours. Also, it does not look at modified or deleted events from a repeating series. This is a nasty side effect of the way Google passes on the information: instead of changing the rule, it creates new entries with a cancelled status to mark holes in a recurring series. Just take a look at the gd:eventStatus node if this is a concern.
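If you do need to deal with those entries, a first step might look something like the sketch below. It runs against a made-up, trimmed-down feed rather than real Google output, and it assumes the status value ends in "event.canceled", so check that against your own feed before relying on it.

```php
<?php
// Hypothetical, trimmed-down feed: one normal entry and one cancelled instance.
$xml = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:gd="http://schemas.google.com/g/2005">
  <entry>
    <title>Standup</title>
    <gd:eventStatus value="http://schemas.google.com/g/2005#event.confirmed"/>
  </entry>
  <entry>
    <title>Standup</title>
    <gd:eventStatus value="http://schemas.google.com/g/2005#event.canceled"/>
  </entry>
</feed>
XML;

$feed_xml = simplexml_load_string($xml);
$kept = array();
foreach ($feed_xml->entry as $event) {
    $gd_nodes = $event->children('http://schemas.google.com/g/2005');
    $status = isset($gd_nodes->eventStatus)
        ? (string) $gd_nodes->eventStatus->attributes()->value
        : '';

    // Assumption: cancelled instances carry a status value ending in "event.canceled"
    if (substr($status, -strlen('event.canceled')) === 'event.canceled') {
        continue; // skip the placeholder entry
    }
    $kept[] = (string) $event->title;
}
echo count($kept), "\n"; // 1
```

This only keeps the placeholder entries out of the list. Removing the matching occurrence from the exploded series would still require comparing each cancelled entry's original start time against the generated instances.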