Optimal Two-Sum

Inspired by Fizz Buzz in Tensorflow.


interviewer: Welcome! Thanks for coming down to our office.

me: Thanks for having me!

interviewer: Can I get you some coffee or tea?

me: I’ve had plenty of both already!

interviewer: Haha, alright. So, are you familiar with “two-sum”?

me: …

interviewer: Do I take that as a no?

me: It’s more of a “why does this seem familiar?”

interviewer: Okay, so you know the drill. We have a list of numbers and a target, and I would like you to write an algorithm that gives me the two numbers from the list which add up to the target.

l = [2, 7, 1, 4, 6, 9, 10, 11, 12, 22]
target = 9

me: [stares at whiteboard for a couple of minutes]

interviewer: Let me know if you want to discuss the problem together.

me: That would be great actually. Let’s start with the standard imports:

from docplex.cp.model import CpoModel

mdl = CpoModel('two-sum')

interviewer: What is that?

me: Our constraint programming model, of course. Our decision variable is a binary list, each value indicating whether we select the corresponding number from our list l.

x = mdl.binary_var_list(len(l), name='choices')

interviewer: …

me: We have two constraints: we must select exactly two numbers from the list, and those two numbers must sum to our target.

mdl.add(mdl.sum(x) == 2)
mdl.add(mdl.sum(x[i] * l[i] for i in range(len(l))) == target)

interviewer: Okay, are we done here?

me: You bet. All we need to do is solve and get the results.

msol = mdl.solve(TimeLimit=1)
print([l[i] for i in range(len(l)) if msol[x[i]] == 1])

me: [Quickly implements on laptop] This gives us [2, 7], as expected.

interviewer: I’m sorry, but this is not quite what I was looking for.

me: [Pauses] Hmm, okay. Let me try LP then?

interviewer: A what?

me: Linear programming.

interviewer: Linear? Okay, sounds great!

me: Now we’re talking. Again, let’s define our model, variables, and constraints.

from docplex.mp.model import Model

mdl = Model(name='two-sum')

# variables
x = mdl.binary_var_list(len(l), name='x')

# constraints
mdl.add_constraint(mdl.sum(x) == 2)
mdl.add_constraint(mdl.sum(x[i]*l[i] for i in range(len(l))) == target)

interviewer: …

me: This time we need to define an objective function and minimize it. For this problem, we can simply minimize how many numbers we choose from l, which is the sum over x. Since our first constraint already fixes that sum to 2, any feasible solution is optimal; the objective is mostly there to give the solver something to do.

mdl.minimize(mdl.sum(x))

msol = mdl.solve()

print([l[i] for i in range(len(l)) if x[i].solution_value == 1])

me: This should also give us the same answer. Is this what you were looking for?

interviewer: This seems a bit too complicated.

me: You think so? …hmm, on second thought you might be right. Maybe we should get rid of our second equality constraint.

\[\begin{align*} \min &\qquad \sum_w x_w& \\ \text{subject to:} &&\\ & \sum_w x_w = 2 & (1) \\ & \sum_w l_w x_w = 9 & (2) \\ & x_w \in \mathbb{B} & \forall w \end{align*}\]

me: Assuming our second constraint is the hard one, we can apply a Lagrangian relaxation and simplify the problem by moving that constraint into the objective function, where a multiplier penalizes how much a solution violates it.

\[\begin{align*} \min &\qquad \sum_w x_w + \mu(9 - \sum_w l_w x_w) & \\ \text{subject to:} &&\\ & \sum_w x_w = 2 & (1) \\ & x_w \in \mathbb{B} & \forall w \end{align*}\]
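
me: As a quick sanity check, if we happen to pick 2 and 7, the penalty term vanishes, so the relaxed objective agrees with the original one no matter what mu is:

\[\sum_w x_w + \mu \, (9 - (2 + 7)) = 2 + \mu \cdot 0 = 2\]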

me: So we just need to implement this subproblem and check how much a given solution violates the original constraint.

def lagrangian_relaxation(mu, l, target):
    mdl = Model(name='lr')

    # Variable
    x = mdl.binary_var_list(len(l), name='x')

    # Constraint
    mdl.add_constraint(mdl.sum(x) == 2)

    # Objective
    obj = mdl.sum(x) + mu * (target - mdl.sum(l[w] * x[w] for w in range(len(l))))

    # Solve
    mdl.minimize(obj)
    msol = mdl.solve()

    # How much does our solution violate the original constraint?
    x_solution = [x[w].solution_value for w in range(len(l))]
    relaxation = target - sum(l[w] * x_solution[w] for w in range(len(l)))

    return (mdl.objective_value, msol, x_solution, relaxation)

interviewer: …

me: Right, so what is mu, our Lagrangian multiplier, and how do we find it? We have a few options… how about the subgradient method? All we need to do is iteratively update mu based on how much the current solution violates the original constraint, and stop once that “loss” on the original constraint hits zero.

\[\mu_{t+1} = \mu_{t} + \eta (9 - \sum_w l_w x_w^t)\]

me: We’re left with implementing the loop and putting the pieces together.

def subgradient(mu, step, l, target):
    print('List of numbers:', l)
    print('Target:', target)
    iteration = 0

    while True:
        iteration += 1
        # Solve LR
        obj, msol, x_s, relaxation = lagrangian_relaxation(mu, l, target)

        # Check termination condition
        if relaxation == 0:
            print('Iter: {} | mu: {} | r: {} | x: {}'.format(
                iteration, mu, relaxation, [l[i] for i in range(len(l)) if x_s[i] == 1]))
            print('Final solution:', [l[i] for i in range(len(l)) if x_s[i] == 1])
            return x_s

        # Update mu
        mu = mu + step * relaxation
        print('Iter: {} | mu: {} | r: {} | x: {}'.format(
            iteration, mu, relaxation, [l[i] for i in range(len(l)) if x_s[i] == 1]))

me: Now how should we set mu and step initially? 10 for mu?

interviewer: [squints eyes]

me: Yeah, perhaps a bit too small. Let’s set mu to 20 and step to 1.0.

In [1]: subgradient(mu=20, step=1.0, l=l, target=target)
Out [1]:

List of numbers: [2, 7, 1, 4, 6, 9, 10, 11, 12, 22]
Target: 9
Iter: 1 | mu: -5.0 | r: -25.0 | x: [12, 22]
Iter: 2 | mu: 1.0 | r: 6.0 | x: [2, 1]
Iter: 3 | mu: -24.0 | r: -25.0 | x: [12, 22]
Iter: 4 | mu: -18.0 | r: 6.0 | x: [2, 1]
Iter: 5 | mu: -12.0 | r: 6.0 | x: [2, 1]
Iter: 6 | mu: -6.0 | r: 6.0 | x: [2, 1]
Iter: 7 | mu: 0.0 | r: 6.0 | x: [2, 1]
Iter: 8 | mu: 0.0 | r: 0.0 | x: [2, 7]
Final solution: [2, 7]
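
me: A quick sanity check on the first line of that log: with mu = 20, a large positive multiplier rewards a large sum, so the subproblem grabs the two biggest numbers, 12 and 22, which is exactly where the r: -25.0 and the updated mu: -5.0 come from.

\[r = 9 - (12 + 22) = -25, \qquad \mu_2 = 20 + 1.0 \cdot (-25) = -5\]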

me: That did the trick! We managed to get a relaxation of 0 and thus the solution for the original problem. I think we are done.

interviewer: [Takes a deep breath] Okay, you said linear programming. Does that mean the runtime of this algorithm is $O(n)$?

me: You mean the simplex algorithm behind mdl.solve()? Haha, I wish. Those binary variables make this an integer program, so the worst case is $O(2^n)$.

interviewer: [FLIPS TABLE]
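
For the record, the answer the interviewer was presumably fishing for is the textbook one-pass two-sum with a hash set, which runs in $O(n)$ time and $O(n)$ space. A minimal sketch (the function name two_sum is my own shorthand, not something anyone asked for by name):

def two_sum(l, target):
    # remember every number seen so far
    seen = set()
    for num in l:
        complement = target - num
        # if the complement was already seen, we found our pair
        if complement in seen:
            return [complement, num]
        seen.add(num)
    return None

print(two_sum(l, target))  # [2, 7]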


Indeed, this interview did not happen. It is an ode to the Joel Grus blog post I read when I first picked up Tensorflow five years ago (tf.placeholder, anyone?).

I had recently studied discrete optimization and was in the mood to apply it to something fun when I was somehow reminded of Fizz Buzz.

Thank you, Joel, for the inspiration!