Ursinus CS 271: Data Structures And Algorithms, Fall 2023

Assignment 6: Traveling Salesperson Art (35 Points)

Click here to see art contest results!

Overview/Logistics
- Learning Objectives
- What To Submit
Background: Traveling Salesperson Art
Programming Tasks

Overview / Logistics

For the last assignment in this class, you will explore an application of minimum spanning trees and depth first search to create computer generated art known as "traveling salesperson art."

Click here to download the starter code. You will have to write all of the code to do a traveling salesperson tour, but I have provided some image processing tools, as well example images.

Learning Objectives

Implement depth first search in python
Approximate traveling salesperson solutions using minimum spanning trees and the 2-opt heuristic
Create art programmatically

What To Submit

Submit a .py file or a notebook with your code to do approximate TSP, as well as your art contest submission, title, and name/pseudonym for the art contest

Background: Traveling Salesperson Art

There's a neat paper by Craig S. Kaplan and Robert Bosch on using the Traveling Salesperson Problem (TSP) to create art (Click here to read the paper). The first step is to use something called "Voronoi Stippling" (see paper here by Adrian Secord) to turn an image into a dot pattern. For instance, let's say we started with the following image of Emperor penguins, courtesy of Christopher Michel

And now let's say we ran the following code in the directory of the starter code

Then we would get this image:

Notice how the dots are more highly concentrated in darker regions of the image. This is what gives us the illusion of shading with just a distribution of dots. Also, as a programmatic note, note that X is an N x 2 array that holds all of the data, with the x coordinates in the first column and the y coordinates in the second column. Hence the slices in plt.scatter(X[:, 0], X[:, 1], 2)

The traveling salesperson art paper goes one step further, though. If we then treat each point as a "city," we can then search for a Traveling salesperson tour (i.e. a permutation of the dots that minimizes the length of the distances between adjacent dots, and between the first and the last dot). Let's suppose we indexed such a permutation with an array tour of indices, and then we ran the following code:

Then we get the following image:

If we use 10,000 points instead of 2000, we get an even prettier image, at the expense of more computation time:

Programming Tasks

This assignment will walk you through an approximation algorithm for the traveling salesperson problem. This problem is NP hard, so we can't expect to get a good exact solution. However, we will find an approximate solution to the problem which is provably within a factor of 2 of the best solution, and which is probalby much better in practice. And this will be good enough for artistic applications!

A 2-Approximation with MST DFS (15 Points)

As we noted in a module, it is possible to come up with a 2-approximation of the TSP by doing a depth first traversal of the minimum spanning tree of the points in 2D. As an example, here's what you should get on the penguins:

This is not bad for how quick the algorithm is! However, it's clearly not perfect, and in particular, we can see some places where the tour crosses over itself, which is a telltale sign that this is not optimal, because an optimal solution will have no crossings. We will exploit this in the next section to improve this solution.

NOTE: Your stipple/MST tour might look slightly different due to randomness of the stipple

Your Task

Create a method that takes in a stipple pattern as a matrix X (as shown above), and which returns a tour of X which is a 2-approximation of the best possible tour. You should

Compute the Delaunay triangulation of the points in X to narrow down the edge set E. Refer to the video here on how to do this in python.
Compute the MST of the graph (X, E) and store the resulting graph in an adjacency data structure
Run depth-first search on the graph from step 2, and fill in the tour by recording the order in which the nodes are visited.

An Improvement with The 2-Opt Heuristic (15 Points)

For a video explanation of the algorithm below, click here to view a video I have about this

One observation we can make to improve our initial guess of a tour is that an optimal solution will never contain a crossing because of the triangle inequality. This is most easily seen visually. The picture below shows how a crossing occurs between the edges between i and i+1 and j and j+1

But we can change this to another traveling salesperson tour by swapping the indices so that we use the edges from i to j and i+1 to j+1 instead of the edges between i and i+1 and j and j+1. This improves the overall tour length because the distances

\[ d(P_i, P_j) + d(P_{i+1}, P_{j+1}) < d(P_i, P_{i+1}) + d(P_j, P_{j+1}) \]

where P_k is the point at index k of the current tour

This flip can be realized by creating a new tour out of the following three chunks, in order:

Swapping Order

From index 0 to index i in the original tour
From index i+1 to index j+1 from the original tour in reverse order
From index j+1 to the end in the original tour

If we repeatedly do this in the penguin example, we see that we improve from the initial MST-based tour which has length 29020.836, to one which has a length of 22838.024, which is quite a big improvement! More importantly for our artistic application, this final tour has no crossings. The animation below shows this process step by step

Your Task

Implement the 2-opt heuristic to improve your MST-based solution. The pseudocode for this is as follows:

While an improvement is possible
- Look through each pair of edges in the current tour until you find a pair i, j where
  \[ d(P_i, P_j) + d(P_{i+1}, P_{j+1}) < d(P_i, P_{i+1}) + d(P_j, P_{j+1}) \]
  
  Be sure that i and j are not the first or last points in the tour, as swapping those might mess things up.
- Create a new tour by swapping the edges

You should write a helper method whose job it is to find the next two indices i and j whose edges can be swapped, which you call from within a loop while there are still indices to be swapped.

Another suggestion for speed is to pick up your search at the index i that you swapped last time, and then loop around if necessary.

A Note on Nested Loop Speed And Numba

You should implement this in pure python/numpy first, but the nested loops you have will be slow in pure python. You should start by testing this on a stipple with only 100-200 points. But to scale this up for the art contest, you should consider using numba with the decorator @jit(nopython=True) above your helper method so that this code runs efficiently. For example, I used this in an older version of get_centroids, a helper method in stipple.py to speed up a doubly nested loop I had.

On the penguin example with a 2000 point stipple, using numba was the difference between a 5 second runtime and 2 minute runtime on my machine

If you are having trouble installing or running numba on your own machine, you can use Google colab to run your code on the cloud. This will require you to have a Google account setup. Make a new notebook at https://colab.research.google.com/. Then, if you copy and paste the code from stipple.py into your colab notebook and you put your images in a directory called CS271/HW6/images, as shown in the image below:

then the following code should run:

(Of course, you can change the directory paths if you want to, but that's what I did in the above example)

Mandatory Art Contest (5 Points)

You just made a really cool program to create computer generated art! Use it to create a nice image, which will we show in a gallery for the class (you can use a pseudonym if you don't want your name to be up publicly). We will vote, and the winner will get extra credit! Here are a few suggestions to fine tune things:

The parameter thresh to stipple is a variable between 0 and 1 that determines the grayscale level above which to exclude samples. So make this closer to 0 if you only want to include darker parts of the image.
If you want to play with how line drawings for the edges show up, try adjusting the grad_sigma and edge_thresh parameters. A higher grad_sigma will smooth the edges more, and a lower edge_thresh will include more edges.
Usually using more points will lead to a prettier drawing, but it will take longer
If you're getting some undesirable points in the middle of an otherwise white region, you can filter them out with the density_filter method. For instance, X = density_filter(X, (X.shape[0]-10)/X.shape[0]) will filter out the 10 points that are furthest from their nearest neighbor.
You can change the colormap from 'magma_r' to get different color pattern over time. Click here to see the plethora of options.