Slightly Shuffled

It's pretty easy to shuffle a list/array in PHP. There's a function, shuffle(), that will simply randomize the order of elements. It also assigns new keys to the array, possibly a side effect of the internal algorithm that it uses. However, if you want to partially shuffle an array, there is no native PHP function to help out.

Why would you want to partially shuffle an array? Well, maybe order is only kinda important and it may be easier to estimate proper placement. In my case I had triggered events, events that needed action taken (like a rudimentary queue), though the order was only loosely important. Another case could be found with popular articles or recent comments: you may want to let the 'popularity' or 'recency' weigh the display order but you don't want it to be hard-and-fast rule.

Without a built-in PHP function or a way to modify the native shuffle I started playing around with a few ideas. My first thought was chunking. I could break the main array into pieces of length tolerance (with tolerance being the max offset of any given element), shuffle each piece, and then put it back together. This didn't feel right. Even if the chunks were variable length, between 1 and tolerance long, it still felt like too hard of boundaries for something that was still kinda random.

After playing around with chunking for a while I started researching how shuffling algorithms worked. Thanks to Jeff Atwood I was introduced to the Fisher and Yates method (popularized by Knuth). Pick a random element in the array, pluck it out, and push it on the end of a new array. Continue until there are no elements left.

This algorithm does involve having two separate arrays, though, which is not necessary. If you push the new elements to the end of the current array and keep track of your pointer then there is no need for a new array. This idea was introduced Durstenfeld and is a great optimization for languages that require length definitions on lists for memory purposes.

So I started down this route. One of the key differences in this algorithm and my needs was the need to maintain a semblence of order: I had to limit the plucked element based on current position and tolerance, the plucked elements had to be placed in the right order, and there were special cases to watch out for. Well, two special cases.

Shuffling with tolerance of two

Shuffling with tolerance of two

This diagram shows how the modified logic works. First, I need to track the original key placement to make sure that the chosen element does not exceed the tolerance bounds. If the pointer is at the maximum bound (line 3) then that element is forced to be the choice. Oh, and if the pointer is at the top of the array than there is also no need to look around. Otherwise, pick a random key from (pointer to pointer - tolerance) and pluck. Second, place the plucked element at the beginning of the 'new array', or pointer + 1. And that's it.

The code I came up with is online at on my Github page. I had a little fun with namespaced functions and such; beyond that its fairly basic. One piece I'm a bit uncertain about is if it would make more sense to use an alernate looping procedure, if using next() and current() could simplify the logic in the main working loop, but this works. Oh, and just like the native function, it strips out keys. It makes things easier.