An Interesting Exercise in Dynamic Programming

May 08, 2016

Given an array of build­ing heights (with unit width), cal­cu­late the area of the largest rec­tan­gle that “fits” within these build­ings.

For exam­ple, for an array A = [3,2,3,2,2,1]:

 x   x
 x x x x x
 x x x x x x

the cor­rect answer is 10.

You might want to attempt the ques­tion first before pro­ceed­ing fur­ther.

The brute force method takes O(n3) time by taking each pair and find­ing the min­i­mum, which is the “height”, and mul­ti­ply­ing it by the “width”.

How­ev­er, there’s actu­al­ly a linear time solu­tion.

Let us iter­ate through the build­ing array from left to right.

For some build­ing i, we want to find out the left and right bound­aries of the largest rec­tan­gle whose height is the height of build­ing i and which includes build­ing i itself.

Let’s talk about find­ing the left bound­ary first. To do this, we can, for each index i, iter­ate left­wards and check. This results in a qua­drat­ic time solu­tion. How­ev­er, we can do better than this. The insight to achiev­ing linear time is the fact that, when look­ing for the bound­ary of the rec­tan­gle, we can “throw away” build­ings to the left of i which are higher than build­ing i itself. In effect, we are only look­ing for the build­ing that will “pre­vent” the rec­tan­gle from extend­ing fur­ther. The reason we can do this safely is because, for future cal­cu­la­tions (of build­ings to the right of build­ing i), these build­ings won’t be con­sid­ered in any case because the (cur­rent) build­ing i is short­er than them and would be “bot­tle­neck­ing” them.

We can use a stack to do this. For a build­ing i, we push it onto the stack if it’s higher than the build­ing of the stack. If it’s not, we con­tin­u­ous­ly pop build­ings off the stack until the build­ing on the top of the stack is short­er than build­ing i. Since each build­ing is pushed on and popped off the stack at most once, this results in an amor­tized con­stant time check for each build­ing i. We repeat this linear-time pro­ce­dure twice, one in each direc­tion of the array, to obtain the left and right rec­tan­gle indices for each build­ing in the array.

At the end, we can cal­cu­late the largest rec­tan­gle by iter­a­tive­ly taking the dif­fer­ence in match­ing indices in both the left and right indices table and mul­ti­ply­ing it by the height of the build­ing.

# largest_rectangle :: [Int] -> Int
def largest_rectangle(arr):

    stack = []

    left_indices = []
    right_indices = []

    for i  in range(len(arr)):
        while stack and arr[stack[-1]] >= arr[i]:

        # if the stack is empty, it means we've extended the rectangle
        # all the way to the leftmost building
        # the left boundary index is set to -1, which means it includes building 0

        left_indices.append(-1 if not stack else stack[-1])

    # empty the stack for the right pass

    stack = []

    for i in range(len(arr) - 1, -1, -1):
        while stack and arr[stack[-1]] >= arr[i]:

        # if the stack is empty, it means we've extended the rectangle
        # all the way to the rightmost building
        # the right boundary index is set to len(arr), which means it includes building len(arr) - 1

        right_indices = [len(arr) if not stack else stack[-1]] + right_indices

    max_area = 0

    # arr[i] is the height of the current building
    # (right_indices[i] - left_indices[i] - 1) is the width

    for i in range(len(arr)):
        max_area = max(max_area, arr[i] * (right_indices[i] - left_indices[i] - 1))

    return max_area

Now, a dif­fer­ent ques­tion:

Given a 2-dimen­sion­al m by n matrix with only 0s and 1s, cal­cu­late the area of the largest rec­tan­gle that con­tains only 1s.

For exam­ple, for the matrix below:


The cor­rect answer is 8.

You can give it a try too, before pro­ceed­ing.

Sur­pris­ing­ly, there is also a linear time solu­tion for this prob­lem.

The insight to this prob­lem is two-fold. We can actu­al­ly make use of the tech­nique above, with some pre­pro­cess­ing.

The pre­pro­cess­ing step con­sists of cal­cu­lat­ing, for each A[i][j], the max­i­mum “height” of a down­ward exten­sion. If A[i][j] is 0, then the height is 0.

For the matrix above, the pre­processed height table would be:


This pre­pro­cess­ing can be done in O(mn), or linear time, if we start iter­at­ing from the last row upwards.

With this table, we can make use of the tech­nique above. By feed­ing each row of this height table into the largest_rectangle func­tion above, we can cal­cu­late the area of the largest rec­tan­gle whose top edge touch­es that row. If we do this for all the rows, we can cal­cu­late the largest pos­si­ble area for the entire matrix.

You can imag­ine it like this for the first row:

 x x   x
 x x

which yields an area of 4 cor­re­spond­ing the 2x2 rec­tan­gle in the upper-left corner of the matrix.

Or the over­all cor­rect solu­tion, which is the largest rec­tan­gle for the third row:

   x x x x
   x x x x

The full solu­tion in Python is below:

# largest_2d_subarray :: [[Int]] -> Int
def largest_2d_subarray(matrix):

    # we initialize the table to the same size as the matrix, containing all 0s

    max_height_table = [ [0] * len(matrix[0]) for i in range(len(matrix)) ]

    # we start preprocessing from the last row and work our way upwards

    for i in range(len(matrix) - 1, -1, -1):

        # special case for last row
        # we simply copy the last row of the matrix over to the height table
        # since the height can only be 0 or 1
        if i == len(matrix) - 1:
            max_height_table[len(matrix) - 1] = row

        for j, column in enumerate(matrix[i]):
            if column == 0:
            max_height_table[i][j] = max_height_table[i + 1][j] + 1

    # we can now feed the preprocessed table into our largest_rectangle function

    max_area = 0

    for i in range(len(matrix)):
        largest_subarray_area = largest_rectangle(max_height_table[i])
        max_area = max(max_area, largest_subarray_area)

    return max_area

As men­tioned, it takes linear time for the pre­pro­cess­ing. After that, it takes O(n) time to cal­cu­late the largest rec­tan­gle whose top is in con­tact with that row. For m rows, it takes O(mn) time. Thus, this algo­rithm runs in over­all linear time.