Sunday, November 16, 2008

Shuffling

Shuffling is a very interesting programming problem, Almost everybody can come up with a good algorithm using a simple rand() function, but it gets a little tricky when one has to perform a in place shuffle (i.e. w/o using any extra memory).

Knuth's Algorithm described in the Art of Computer programming, Vol 2 (which is based on Fisher Yates algorithm) is regarded as one the best known algorithm for the problem.

The description on wikipedia is a little clearer and goes like this:
1. Let A1 := 1, A2 := 2 and so on up to AN := N, and let n := N.
2. Pick a random number k between 1 and n inclusive.
3. If k ≠ n, swap the values of Ak and An.
4. Decrease n by one.
5. Repeat from step 2 until n is less than 2.


public static void shuffle (int[] array)
{
Random rng = new Random(); // i.e., java.util.Random.

// The number of items left to shuffle (loop invariant).
int n = array.length;

while (n > 1)
{
int k = rng.nextInt(n); // 0 <= k < n.
n--; // n is now the last pertinent index;
// swap array[n] with array[k] (does nothing if k == n).
int temp = array[n];
array[n] = array[k];
array[k] = temp;
}
}


Alternately, instead of going n to 1 we can do a forward pass like here:


int N = array.length;
for (int i = 0; i < N; i++)
{
int r = i + (int) (Math.random() * (N-i));
int t = array[r];
array[r] = array[i];
array[i] = t;
}


An alternative method is we can assign each element of the set to be shuffled a random number and the set is then sorted according to these numbers. This may be faster in practice, despite having worse asymptotic time complexity (O(n log n) vs. O(n)).

Like the Fisher-Yates shuffle, this method will also produce unbiased results if correctly implemented, and may be more tolerant of certain kinds of bias in the random numbers. However, care must be taken to ensure that the assigned random numbers are never duplicated, since sorting algorithms in general won't order elements randomly in case of a tie.

Wikipedia also defines a interesting algorithm very similar to the Knuth's algorithm called the Sattolo algorithm for generating uniformly distributed cyclic permutations. The only difference between the 2 algorithms is that in Sattolo's algorithm in step 2, the random number k is chosen from the range between 1 and n−1 (rather than between 1 and n) inclusive.

To turn the Java example above into an example of Sattolo's algorithm, simply replace rng.nextInt(n) with rng.nextInt(n-1) in the code. This simple change modifies the algorithm so that the resulting permutation always consists of a single cycle.

Reference:
1. TAOCP Vol 2 Section 3.4.5 by Donald Knuth
2. http://en.wikipedia.org/wiki/Knuth_shuffle

No comments:

Related Posts with Thumbnails