Saturday, October 9, 2010

Facebook Puzzles - Dance Battle

As I set out to solve the Dance Battle Facebook Engineering Puzzle (dancebattle), I realized that it was really just some sort of game (note: Facebook took down the original puzzle so I'm linking to another site with the description). It had alternating turns, repeatable moves, moves represented by non-negative integers, and was zero-sum. After some digging around, I realized that it was just a thinly-veiled version of a game of dominoes.

Naturally, it seemed fitting to present the solution to the puzzle in the form of the game it was modeled after. I had never played dominoes before, but I knew what the basic game pieces looked like since dominoes sets are among some of the more popular swag that certain companies like to give away at college career fairs. Nonetheless, I exercised due diligence and did some research on the game. Thus, I present to you Dance Battle, played out in the form of dominoes.

Dominoes!

A domino set consists of game pieces that are rectangular in shape, formed by joining two identically-sized, adjacent squares. Each square has a number on it, represented in the form of a die face. A traditional double-six domino set contains tiles with numbers ranging from zero to six, with a total of twenty eight unique tiles.

Consider a simple, modified, two-player game of dominoes using the traditional double-six set. The game starts with a player making an initial move by picking any of the twenty eight tiles, then placing it into play to form the line of play. The players alternatively extend the line of play by picking one of the remaining tiles to play so that adjacent tiles have equal values in their respective squares. In the variation, the line of play can only be extended in one direction.

For example, suppose the first player chooses to play the tile with numbers (3, 0) with the tile oriented such that 0 is at the tail of the line of play. Then the next player must pick one of the remaining tiles that contains 0, and place it such that the end with 0 is adjacent to the tail of the line of play. Players alternate until no more moves can be made; at that point, the person that played the last tile wins (and the player that is stuck loses).

Now, consider a generalized version of this variant where the numbers on the tiles can range from 0 to n. Given n and a list of tiles that have already been played, let's figure out how to determine the outcome if both players play optimally.

Exploring the Game Tree

In order to simulate optimal play, we need to look ahead in the game tree and determine the results of optimal play. Each level of the game tree represents the choices a player has at any point in the game. To play optimally, the player will seek the best possible result. Since the game is a two-player, zero-sum game, we can use the minimax algorithm to simulate optimal play.

The minimax algorithm performs an exhaustive search of the game tree space. At each level (height) of the game tree, it takes on the actions of a player and tries to maximize the outcome amongst all possible choices; equivalently, the algorithm will minimize the outcome for the other player. This can be done by calling the algorithm recursively at each level on all of the possible moves. When no more moves can be made, we reach an end game state, and a numerical value can returned to indicate the outcome of the game.

Before we actually implement the algorithm, we'll need to define a function that can generate the set of all possible tiles that can be played, given a list of tiles that have already been played, and the last tile that was played. A tile will be represented by a pair of integers.

def valid_tiles(n, played, last)
  valid = []

  # If no tiles have been played, all is fair game
  if last.nil?
    for i in 0..n-1
      for j in 0..n-1
        valid.push([i, j])
      end
    end

  # Valid tiles must have a value equal to the tail
  # value of the last tile
  else
    for i in 0..n-1
      t = [last[1], i]
      valid.push(t) if !played[t]
    end
  end

  return valid
end

Once we've defined a way for generating the set of valid moves, we can proceed with implementing the minimax algorithm. As our game of dominoes will always end up with a winner and a loser, the algorithm only has to return two values. To be consistent, we can return 1 in the case of a win for the player that goes first (a loss for the other player), and -1 in the case of a loss (a win for the other player).

def minimax(n, max, played, last)
  tiles = valid_tiles(n, played, last)

  # End game - return a value to indicate outcome
  return max ? -1 : 1 if tiles.empty?

  # Best possible outcome
  best = max ? -2 : 2

  tiles.each do |t|
    # Mark tile (and reverse, equivalent tile) as played
    rt = [t[1], t[0]]
    played[t] = played[rt] = true

    reply = minimax(n, !max, played, t)

    # Mark the tile as unplayed, to preserve game state
    # for further calls
    played.delete(t)
    played.delete(rt)

    best = reply if (max && reply > best) || (!max && reply < best)
  end

  return best
end

Note that in order to ensure that the first move and outcome is accepted when no moves have been explored yet, we set the initial value of the best possible outcome out of range. We now have an algorithm that will exhaustively explore the game tree space, and return the outcome of a game when both players play optimally.

This approach will always find the correct solution. However, since every possible game state needs to examined, the complexity of the algorithm grows exponentially with the depth of the tree. Furthermore, as the value of n increases, the number of possible tiles (and thus the branching factor) that can be played at each stage increases. It's easy to see that this approach will be too slow for even reasonable values of n. We should be able to do better.

Pruning

We can incorporate some simple techniques for pruning the game tree by stopping the tree traversal when we've encountered a state with our ideal outcome. Additionally, we can use alpha-beta pruning in order to cut down on the potential search space. Alpha-beta pruning allows us to stop the tree traversal when we discover a move that can, at best, only result in an outcome that is no better than the best one we've found so far. These pruning techniques allow us to significantly cut down on the search space.

def minimax_alpha_beta(n, max, played, last, alpha = -2, beta = 2)
  tiles = valid_tiles(n, played, last)
  return max ? -1 : 1 if tiles.empty?

  best = max ? alpha : beta
  tiles.each do |t|
    rt = [t[1], t[0]]
    played[t] = played[rt] = true

    reply = minimax_alpha_beta(n, !max, played, t, alpha, beta)

    played.delete(t)
    played.delete(rt)

    # We can stop once we've found the best possible outcome
    if max
      best = alpha = reply if reply > best
      return best if best >= 1
    else
      best = beta = reply if reply < best
      return best if best <= -1
    end

    # Can do no better than the best outcome so far
    return best if alpha >= beta
  end

  return best
end

Putting It Together

We can now put the functions we've defined together and simulate an optimal game of our variant of dominoes. For example, a game with n = 5, and a line of play of (0, 1), (1, 3) can be solved by the following program:

#!/usr/bin/ruby
n = 5
moves = [[0, 1], [1, 3]]

played = {}
moves.each { played[m] = played[[m[1], m[0]]] = true }
last = moves[moves.size - 1]
max = moves.size % 2 == 0

result = minimax_alpha_beta(n, max, played, last)
puts result == 1 ? 'Player 1 wins' : 'Player 2 wins'

If you enjoyed this post, you should follow me on Twitter @cloudkj. If you're interested in more puzzle fun, you should check out my posts about various other puzzles, such as Liar Liar, Gattaca, User Bin Crash, It's A Small World, and Refrigerator Madness.

Food For Thought
  • How can the program be modified to output the tile that the next player should play, given a line of play?
  • How can the algorithm be adapted to solve other zero-sum games, such as tic-tac-toe?
  • How would you handle the case of a tie game?
  • What is the run-time complexity of the program?

No comments:

Post a Comment