Assignment 6: Traveling Salesperson Art (35 Points)
Chris Tralie
Click here to see art contest results!
Overview / Logistics
For the last assignment in this class, you will explore an application of minimum spanning trees and depth first search to create computer generated art known as "traveling salesperson art."
Click here to download the starter code. You will have to write all of the code to do a traveling salesperson tour, but I have provided some image processing tools, as well example images.
Learning Objectives
 Implement depth first search in python
 Approximate traveling salesperson solutions using minimum spanning trees and the 2opt heuristic
 Create art programmatically
What To Submit
Submit a .py file or a notebook with your code to do approximate TSP, as well as your art contest submission, title, and name/pseudonym for the art contest
Background: Traveling Salesperson Art
There's a neat paper by Craig S. Kaplan and Robert Bosch on using the Traveling Salesperson Problem (TSP) to create art (Click here to read the paper). The first step is to use something called "Voronoi Stippling" (see paper here by Adrian Secord) to turn an image into a dot pattern. For instance, let's say we started with the following image of Emperor penguins, courtesy of Christopher Michel
And now let's say we ran the following code in the directory of the starter code
Then we would get this image:
Notice how the dots are more highly concentrated in darker regions of the image. This is what gives us the illusion of shading with just a distribution of dots. Also, as a programmatic note, note that X
is an N x 2 array that holds all of the data, with the x coordinates in the first column and the y coordinates in the second column. Hence the slices in plt.scatter(X[:, 0], X[:, 1], 2)
The traveling salesperson art paper goes one step further, though. If we then treat each point as a "city," we can then search for a Traveling salesperson tour (i.e. a permutation of the dots that minimizes the length of the distances between adjacent dots, and between the first and the last dot). Let's suppose we indexed such a permutation with an array tour
of indices, and then we ran the following code:
Then we get the following image:
If we use 10,000 points instead of 2000, we get an even prettier image, at the expense of more computation time:
Programming Tasks
This assignment will walk you through an approximation algorithm for the traveling salesperson problem. This problem is NP hard, so we can't expect to get a good exact solution. However, we will find an approximate solution to the problem which is provably within a factor of 2 of the best solution, and which is probalby much better in practice. And this will be good enough for artistic applications!
A 2Approximation with MST DFS (15 Points)
As we noted in a module, it is possible to come up with a 2approximation of the TSP by doing a depth first traversal of the minimum spanning tree of the points in 2D. As an example, here's what you should get on the penguins:
This is not bad for how quick the algorithm is! However, it's clearly not perfect, and in particular, we can see some places where the tour crosses over itself, which is a telltale sign that this is not optimal, because an optimal solution will have no crossings. We will exploit this in the next section to improve this solution.
NOTE: Your stipple/MST tour might look slightly different due to randomness of the stipple
Your Task
Create a method that takes in a stipple pattern as a matrix X
(as shown above), and which returns a tour
of X
which is a 2approximation of the best possible tour. You should
 Compute the Delaunay triangulation of the points in X to narrow down the edge set E. Refer to the video here on how to do this in python.
 Compute the MST of the graph (X, E) and store the resulting graph in an adjacency data structure
 Run depthfirst search on the graph from step 2, and fill in the tour by recording the order in which the nodes are visited.
An Improvement with The 2Opt Heuristic (15 Points)
 For a video explanation of the algorithm below, click here to view a video I have about this
One observation we can make to improve our initial guess of a tour is that an optimal solution will never contain a crossing because of the triangle inequality. This is most easily seen visually. The picture below shows how a crossing occurs between the edges between i
and i+1
and j
and j+1
But we can change this to another traveling salesperson tour by swapping the indices so that we use the edges from i
to j
and i+1
to j+1
instead of the edges between i
and i+1
and j
and j+1
. This improves the overall tour length because the distances
\[ d(P_i, P_j) + d(P_{i+1}, P_{j+1}) < d(P_i, P_{i+1}) + d(P_j, P_{j+1}) \]
where P_{k} is the point at index k of the current tourThis flip can be realized by creating a new tour out of the following three chunks, in order:
Swapping Order
 From index
0
to indexi
in the original tour 
From index
i+1
to indexj+1
from the original tour in reverse order 
From index
j+1
to the end in the original tour
If we repeatedly do this in the penguin example, we see that we improve from the initial MSTbased tour which has length 29020.836, to one which has a length of 22838.024, which is quite a big improvement! More importantly for our artistic application, this final tour has no crossings. The animation below shows this process step by step
Your Task
Implement the 2opt heuristic to improve your MSTbased solution. The pseudocode for this is as follows:

While an improvement is possible

Look through each pair of edges in the current tour until you find a pair i, j where
\[ d(P_i, P_j) + d(P_{i+1}, P_{j+1}) < d(P_i, P_{i+1}) + d(P_j, P_{j+1}) \]
Be sure that i and j are not the first or last points in the tour, as swapping those might mess things up.
 Create a new tour by swapping the edges

Look through each pair of edges in the current tour until you find a pair i, j where
You should write a helper method whose job it is to find the next two indices i
and j
whose edges can be swapped, which you call from within a loop while there are still indices to be swapped.
Another suggestion for speed is to pick up your search at the index i
that you swapped last time, and then loop around if necessary.
A Note on Nested Loop Speed And Numba
You should implement this in pure python/numpy first, but the nested loops you have will be slow in pure python. You should start by testing this on a stipple with only 100200 points. But to scale this up for the art contest, you should consider using numba with the decorator @jit(nopython=True)
above your helper method so that this code runs efficiently. For example, I used this in an older version of get_centroids
, a helper method in stipple.py
to speed up a doubly nested loop I had.
On the penguin example with a 2000 point stipple, using numba was the difference between a 5 second runtime and 2 minute runtime on my machine
If you are having trouble installing or running numba
on your own machine, you can use Google colab to run your code on the cloud. This will require you to have a Google account setup. Make a new notebook at https://colab.research.google.com/. Then, if you copy and paste the code from stipple.py
into your colab notebook and you put your images in a directory called CS271/HW6/images
, as shown in the image below:
then the following code should run:
(Of course, you can change the directory paths if you want to, but that's what I did in the above example)
Mandatory Art Contest (5 Points)
You just made a really cool program to create computer generated art! Use it to create a nice image, which will we show in a gallery for the class (you can use a pseudonym if you don't want your name to be up publicly). We will vote, and the winner will get extra credit! Here are a few suggestions to fine tune things:

The parameter
thresh
to stipple is a variable between 0 and 1 that determines the grayscale level above which to exclude samples. So make this closer to 0 if you only want to include darker parts of the image. 
If you want to play with how line drawings for the edges show up, try adjusting the
grad_sigma
andedge_thresh
parameters. A highergrad_sigma
will smooth the edges more, and a loweredge_thresh
will include more edges.  Usually using more points will lead to a prettier drawing, but it will take longer

If you're getting some undesirable points in the middle of an otherwise white region, you can filter them out with the
density_filter
method. For instance,X = density_filter(X, (X.shape[0]10)/X.shape[0])
will filter out the 10 points that are furthest from their nearest neighbor. 
You can change the colormap from
'magma_r'
to get different color pattern over time. Click here to see the plethora of options.