Randomly Shuffling

Wishlist for Shuffling N Elements

  1. Sample uniformly from all N! permutations
  2. It is in-place, meaning we don't need an auxiliary array to store things as we're shuffling
  3. We want it to be O(N) time complexity

Let's look at an example of something that satisfieds #2 and #3, but not #1

In [1]:
import numpy as np
import matplotlib.pyplot as plt
In [2]:
def swap(arr, i, j):
    temp = arr[i]
    arr[i] = arr[j]
    arr[j] = temp
In [20]:
def bad_shuffle(N):
    arr = np.arange(N)
    for i in range(N):
        j = np.random.randint(N)
        swap(arr, i, j)
In [6]:
def perm(arr, perm2num, idx = 0):
    """
    Recursively enumerate all N! permutations of the elements in
    an array as strings, and store them in a dictionary 
    that gives them an index
    
    Parameters
    ----------
    arr: list
        This is a list we're permuting in place
    perm2num: dictionary
        A dictionary which converts from a permutation, as expressed
        as a string, into an index
    idx: int
        What is the index of the element we are choosing in the array
    """
    if idx == len(arr)-1:
        s = "{}".format(arr)
        perm2num[s] = len(perm2num)
    else:
        # Assume that everything before idx is fixed, but 
        # we want to take something that's after idx and swap
        # into idx and keep going
        for i in range(idx, len(arr)):
            swap(arr, i, idx) # Swap in arr[i] for arr[idx]
            perm(arr, perm2num, idx+1) # Keep going recursively to figure out arr[idx+1]
            swap(arr, i, idx) # Swap back, and try something else

We can see experimentally when we do many samples that this method does not give an even sampling over all 5! = 120 permutations of length 5

In [ ]:
N = 5 # 120 permutations
perm2num = {}
perm(np.arange(N), perm2num)
trials = 100000
counts = np.zeros(len(perm2num))
counts2 = np.zeros(len(perm2num))
for trial in range(trials):
    arr = get_bad_shuffle(N)
    idx = perm2num["{}".format(arr)]
    counts[idx] += 1
    arr = get_shuffle(N)
    idx = perm2num["{}".format(arr)]
    counts2[idx] += 1
In [19]:
plt.figure(figsize=(16, 4))
plt.subplot(121)
plt.bar(np.arange(len(counts)), counts)
plt.title("Bad Shuffle")
plt.subplot(122)
plt.bar(np.arange(len(counts2)), counts2)
plt.title("Fisher-Yates")
Out[19]:
Text(0.5, 1.0, 'Fisher-Yates')

Fisher-Yates (Manual Hand Trace Example)

Let's suppose we started with

0 1 2 3 4 5

Iterate through i = 0 to 4

Choose a random index j between 0 and 5, inclusive

In [28]:
j = np.random.randint(6) # One past the biggest element we can possibly choose
print(j)
2

Swap what's at index 2 with what's at index 5

0 1 5 3 4 2

Moved up to i = 1

Choose a random index j between 0 and 4, inclusive

In [29]:
j = np.random.randint(5)
print(j)
0

Swap what's at index j = 0 with index 4

4 1 5 3 0 2

Move up to i = 2

Choose a random index between 0 and 3, inclusive

In [30]:
j = np.random.randint(4)
print(j)
1

Swap what's at index 1 with what's at index 3

4 3 5 1 0 2

Move up to i = 3

Choose a random index j between 0 and 2, inclusive

In [31]:
j = np.random.randint(3)
print(j)
1

Swap what's at index j = 1 with what's at index 2

4 2 3 1 0 2

Move up to i = 4

Pick a random index j between 0 and 1, inclusive

In [32]:
j = np.random.randint(2)
print(j)
0

Swap what's at index 0 with what's at index 1

2 4 3 1 0 2

In [ ]: