Wednesday, March 23, 2011

Facebook Puzzles - Refrigerator Madness

For software engineers, free drinks at a place of employment equates to a certain level of prestige. If a company cares about you enough to always keep the refrigerator fully stocked with caffeinated beverages, then it must be a great place to work. It's sort of silly, considering how bad most soft drinks are for your health. Add that to the fact that most software engineers nary get any semblance of exercise - or even much daylight for that matter - and you have a dangerous combination on your hands.

Nevertheless, free food and drinks, no matter how unhealthy, are always popular staples at hot technology companies. Facebook is no exception, which is why they released a programming puzzle to solicit help for optimizing productivity amongst their soda-drinking, caffeine-slinging engineers.

The Refrigerator Madness puzzle (fridgemadness) is interesting not because it deals with matching engineers with beverages, but rather because it presents the task at hand in the form of a classic mathematical and computer science problem. In fact, this problem has many real world applications, and has been used for things like pairing college students up with internship programs, or paring medical students up with residency programs.

Match Madness

The input of the Refrigerator Madness puzzle is simple: there is an even number of engineers, an arbitrary list of types of drinks, and a preference list for every engineer that indicates his or her preferred drinks of choice, in order. At its core, the puzzle is a matching problem, as the goal is to match each upper 50th percentile engineer to a lower 50th percentile engineer in the other set. The key insight to the puzzle is the requirement that the final matching between engineers must be "stable" for every pair of engineers. That is, every matching pair of engineers must not prefer another engineer with whom he or she is not already paired with. This problem is an instance of the stable marriage problem, and it's actually easier to think about the puzzle in this context.

In the stable marriage problem, the goal is to match N men up with N men. A stable matching is a set of matches where no arbitrary pair of a man or a woman would prefer each other over the person to whom they're already matched up with. That is, everyone is happy with their matches and none of the pairs need to change, and thus are stable. A prerequisite for the stable marriage problem is that all men and women should have a preference list which ranks the members of the opposite sex in order of preference.

Likewise, the Refrigerator Madness puzzle provides such a definition for the degree of preference between two engineers, using their drink preferences as a proxy. That is, an "engineer's preference to work with another engineer can be calculated by comparing the drink preferences of each engineer. An engineer prefers to work with one engineer over another if the preference score is higher." We can easily write a method for calculating the preference scores between a pair of engineers, where n is the number of distinct drinks:

def score(a, b, n)
  score_ab = score_ba = (n * (n + 1)) / 2
  for i in 0..n-1
    base_a = n - i
    for j in 0..n-1
      base_b = n - j
      if a[i] == b[j]
        if i < j
          score_ab += base_a
        elsif i > j
          score_ba += base_b
        else
          score_ab += base_a * base_a
          score_ba += base_b * base_b
        end
        score_ab -= base_a
        score_ba -= base_b
      end
    end
  end
  return [score_ab, score_ba]
end

Although the preferences are one-directional, we can take a few shortcuts and be more efficient in calculating the preference scores between two engineers. Once we have this method handy, we can tackle the heart of the puzzle.

I Choose You, You Choose Me

The primary motivation for calculating preference scores between pairs of engineers is to establish an ordered preference list for each engineer. It turns out that the actual drinks are irrelevant; we only need to know how many types of drinks are available. Let's assume that we have these preference lists readily available, in the form of arrays indexed by the drink id.

def scores(drink_prefs, num_engineers)
  # Median is the first engineer in lower 50th percentile
  median = num_engineers / 2
  scores = {}

  # Compute scores between all pairs of engineers
  for i in 0..median-1
    scores[i] ||= {}
    p1 = drink_prefs[i]
    n = p1.size
    for j in median.. num_engineers-1
      scores[j] ||= {}
      p2 = drink_prefs[j]
      scores[i][j], scores[j][i] = score(p1, p2, n)
    end
  end
  return scores
end

Since the number of engineers is always even and all have consecutive ids starting at 0, we can use a shortcut of taking the median id as the cut-off between the top 50th percentile and the bottom 50th percentile. Additionally, we know the total number of engineers, so we can use these pieces of information to calculate preference lists.

def prefs(scores, num_engineers)
  median = num_engineers / 2
  prefs = {}

  # Compute preference lists for all engineers
  for i in 0..median-1
    sorted = scores[i].sort do |a, b|
      v = b[1] <=> a[1]

      # Tie break with engineer effectiveness, which is their id
      v = b[0] <=> a[0] if v == 0
      v
    end
    prefs[i] = sorted.map { |s| s[0] }
  end
  return prefs
end

Matchmaker, Matchmaker

Note that we only calculate preference lists for the top 50th percentile of engineers. Later on, we'll see why we can utilize this optimization. Armed with the preference scores between engineers and the preference lists, we can start finding stable matchings. The Gale-Shapley algorithm is a method of establishing stable matchings by going through rounds of matchings. Since the algorithm is intended to match individuals from one set with individuals from another set, we naturally divide the engineers into the top 50th percentile and the bottom 50th percentile. To begin, the pool of engineers to be matched includes all the engineers in the top 50th percentile. We don't actually need to keep track of a pool of engineers from the bottom 50th percentile, since a matching by definition means that one engineer from each groups is selected.

In each iteration of the algorithm, we select an engineer (from the top 50th percentile) from the pool. We attempt to match the engineer with the candidate (an engineer in the bottom 50th percentile) highest on his or her preference list, and remove the candidate from the list. If the candidate is unmatched, then we establish the matching. If the candidate is already matched, we try to see if we can improve the matching. If the candidate happens to prefer this new engineer more than the engineer he or she is already matched to, then we match the candidate to the new engineer and put the old engineer back into the pool; otherwise, nothing changes. The algorithm goes through as many rounds as it needs until all engineers are matched, and guarantees that all engineers will eventually be matched.

def matches(scores, prefs, num_engineers)
  median = num_engineers / 2
  matches = {}

  # Initial pool of engineers
  pool = (0..median-1).to_a

  # Keep matching engineers until the pool is exhausted
  while !pool.empty?
    hi = pool.shift
    lo = prefs[hi].shift
    if existing = matches[lo]

      # Match exists; decide whether to break it up
      a = scores[lo][hi]
      b = scores[lo][existing]
      if a > b || (a == b && hi > existing)
        matches[lo] = hi
        pool.push(existing)
      else
        pool.push(hi)
      end
    else
      matches[lo] = hi
    end
  end

  # Pair up engineers based on matches
  pairs = []
  matches.each do |a, b|
    pairs.push(a < b ? [a, b] : [b, a])
  end
  return pairs
end

Technically, we could've also calculated the preference lists for the engineers from the bottom 50th percentile, and used these lists to whether or not to break up existing matches. However, since we have available to us the notion of preference scores between all pairs of engineers, we can be slightly more efficient by using this to do a constant time lookup to determine if an engineer prefers one over another.

Putting it all together, we just need to sort the pairs of engineers in ascending order, and we're done.

#!/usr/bin/ruby
num_engineers = 6
drink_prefs = {
  0 => [0, 2, 1],
  1 => [1, 2, 0],
  2 => [2, 1, 0],
  3 => [1, 0, 2],
  4 => [2, 0, 1],
  5 => [0, 1, 2],
}

scores = scores(drink_prefs, num_engineers)
prefs = prefs(scores, num_engineers)
pairs = matches(scores, prefs, num_engineers)

pairs.sort { |a, b| a[0] <=> b[0] }.each do |p|
  puts p.join(',')
end

All the engineers at Facebook can now work and co-exist in love, peace, and harmony, and enjoy their caffeine-fueled extreme programming sessions.

If you enjoyed this post, you should follow me on Twitter @cloudkj. If you're interested in more puzzle fun, you should check out my posts about various other puzzles, such as Liar Liar, Gattaca, Dance Battle, User Bin Crash, and It's A Small World.

Food For Thought
  • Does the algorithm achieve an optimal matching for all engineers?

No comments:

Post a Comment