An Interesting Exercise in Dynamic Programming

Given an array of building heights (with unit width), calculate the area of the largest rectangle that “fits” within these buildings.

For example, for an array A = [3,2,3,2,2,1]:

 x   x
 x x x x x
 x x x x x x
[3,2,3,2,2,1]

the correct answer is 10.

You might want to attempt the question first before proceeding further.

The brute force method takes O(n^3) time: for each pair of buildings, we find the minimum height between them, which is the “height”, and multiply it by the number of buildings they span, which is the “width”.
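For reference, the brute force approach might look something like the sketch below. The function name largest_rectangle_brute_force is just a name picked for illustration, and the input is assumed to be a list of non-negative integers.

```python
def largest_rectangle_brute_force(arr):
    """Cubic-time reference: try every (left, right) pair of buildings."""
    best = 0
    for left in range(len(arr)):
        for right in range(left, len(arr)):
            # the shortest building in the span limits the rectangle's height
            height = min(arr[left:right + 1])
            width = right - left + 1
            best = max(best, height * width)
    return best


print(largest_rectangle_brute_force([3, 2, 3, 2, 2, 1]))  # 10
```

It is far too slow for large inputs, but it is handy for checking a faster solution on small arrays.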

However, there’s actually a linear time solution.

Let us iterate through the building array from left to right.

For some building i, we want to find out the left and right boundaries of the largest rectangle whose height is the height of building i and which includes building i itself.

Let’s talk about finding the left boundary first. To do this, we can, for each index i, iterate leftwards until we hit a building that is shorter than building i. This results in a quadratic time solution. However, we can do better than this. The insight to achieving linear time is that, when looking for the boundary of the rectangle, we can “throw away” buildings to the left of i which are at least as high as building i itself. In effect, we are only looking for the building that will “prevent” the rectangle from extending further. We can do this safely because, for future calculations (of buildings to the right of building i), these buildings won’t be considered in any case: the (current) building i is no taller than them and would be “bottlenecking” them.
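To make the quadratic version concrete, here is a minimal sketch of the leftward scan. The name left_boundaries_naive is made up for this example; a boundary of -1 is assumed to mean that the rectangle reaches the leftmost building.

```python
def left_boundaries_naive(arr):
    """For each i, the index of the nearest strictly shorter building to the left, or -1."""
    boundaries = []
    for i in range(len(arr)):
        j = i - 1
        # buildings at least as tall as building i cannot stop the rectangle
        while j >= 0 and arr[j] >= arr[i]:
            j -= 1
        boundaries.append(j)
    return boundaries


print(left_boundaries_naive([3, 2, 3, 2, 2, 1]))  # [-1, -1, 1, -1, -1, -1]
```

Only building 2 (height 3) is blocked, by building 1 (height 2). Every other building can extend all the way to the left edge.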

We can use a stack to do this. For a building i, we push it onto the stack if the stack is empty or if it’s higher than the building on top of the stack. If it’s not, we continuously pop buildings off the stack until the stack is empty or the building on the top of the stack is shorter than building i, and then push building i. Since each building is pushed on and popped off the stack at most once, this results in an amortized constant time check for each building i. We repeat this linear-time procedure twice, once in each direction of the array, to obtain the left and right boundary indices for each building in the array.
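As a rough illustration of the amortized argument, the sketch below runs only the left pass and counts how many pops happen in total. The function name left_boundaries_with_stack and the pop counter are additions for this example, not part of the solution itself.

```python
def left_boundaries_with_stack(arr):
    """Return (boundaries, pops): the left boundary per building and the total pop count."""
    stack = []       # indices of buildings; heights strictly increase from bottom to top
    boundaries = []
    pops = 0
    for i in range(len(arr)):
        # discard buildings that are at least as tall as building i
        while stack and arr[stack[-1]] >= arr[i]:
            stack.pop()
            pops += 1
        boundaries.append(stack[-1] if stack else -1)
        stack.append(i)
    return boundaries, pops


boundaries, pops = left_boundaries_with_stack([3, 2, 3, 2, 2, 1])
print(boundaries)  # [-1, -1, 1, -1, -1, -1]
print(pops)        # 5
```

With 6 buildings there are 6 pushes and only 5 pops. The number of pops can never exceed the number of pushes, which is why the whole pass is linear.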

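At the end, we can calculate the largest rectangle by iterating over the buildings. For each building, the width is the difference between its entries in the right and left indices tables, minus one, and the area is that width multiplied by the height of the building. The answer is the largest such area.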

# largest_rectangle :: [Int] -> Int
def largest_rectangle(arr):
 
    stack = []
 
    left_indices = []
    right_indices = []
 
    for i in range(len(arr)):
        while stack and arr[stack[-1]] >= arr[i]:
            stack.pop()
 
        # if the stack is empty, it means we've extended the rectangle
        # all the way to the leftmost building
        # the left boundary index is set to -1, which means it includes building 0
 
        left_indices.append(-1 if not stack else stack[-1])
        stack.append(i)
 
    # empty the stack for the right pass
 
    stack = []
 
    for i in range(len(arr) - 1, -1, -1):
        while stack and arr[stack[-1]] >= arr[i]:
            stack.pop()
 
        # if the stack is empty, it means we've extended the rectangle
        # all the way to the rightmost building
        # the right boundary index is set to len(arr), which means it includes building len(arr) - 1
 
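        # append here (amortized constant time) and reverse once after the loop;
        # prepending to a list copies it every time, which would make this pass quadratic
        right_indices.append(len(arr) if not stack else stack[-1])
        stack.append(i)

    # the boundaries were collected from right to left, so put them back in index order
    right_indices.reverse()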
 
    max_area = 0
 
    # arr[i] is the height of the current building
    # (right_indices[i] - left_indices[i] - 1) is the width
 
    for i in range(len(arr)):
        max_area = max(max_area, arr[i] * (right_indices[i] - left_indices[i] - 1))
 
    return max_area

Now, a different question:

Given a 2-dimensional m by n matrix with only 0s and 1s, calculate the area of the largest rectangle that contains only 1s.

For example, for the matrix below:

[1,1,0,1,0]
[1,1,1,0,0]
[0,1,1,1,1]
[0,1,1,1,1]

The correct answer is 8.

You can give it a try too, before proceeding.

Surprisingly, there is also a linear time solution for this problem, where linear means linear in the number of matrix entries, or O(mn).

The insight to this problem is that we can actually make use of the technique above, with some preprocessing.

The preprocessing step consists of calculating, for each A[i][j], the maximum “height” of a downward extension, that is, the number of consecutive 1s starting at A[i][j] and going down the column. If A[i][j] is 0, then the height is 0.

For the matrix above, the preprocessed height table would be:

[2,4,0,1,0]
[1,3,3,0,0]
[0,2,2,2,2]
[0,1,1,1,1]

This preprocessing can be done in O(mn), or linear time, if we start iterating from the last row upwards.
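The preprocessing on its own can be sketched as follows, which is a convenient way to check the table above. The helper name build_height_table is hypothetical, and the matrix is assumed to be non-empty and rectangular.

```python
def build_height_table(matrix):
    """heights[i][j] = number of consecutive 1s starting at matrix[i][j] and going down."""
    rows, cols = len(matrix), len(matrix[0])
    # an extra all-zero row at the bottom avoids a special case for the last row
    heights = [[0] * cols for _ in range(rows + 1)]
    for i in range(rows - 1, -1, -1):
        for j in range(cols):
            if matrix[i][j] == 1:
                heights[i][j] = heights[i + 1][j] + 1
    return heights[:rows]


matrix = [
    [1, 1, 0, 1, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 1],
    [0, 1, 1, 1, 1],
]
for row in build_height_table(matrix):
    print(row)
# [2, 4, 0, 1, 0]
# [1, 3, 3, 0, 0]
# [0, 2, 2, 2, 2]
# [0, 1, 1, 1, 1]
```

Each cell is visited exactly once and only looks at the cell directly below it, which is where the O(mn) bound comes from.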

With this table, we can make use of the technique above. By feeding each row of this height table into the largest_rectangle function above, we can calculate the area of the largest rectangle whose top edge touches that row. If we do this for all the rows, we can calculate the largest possible area for the entire matrix.

You can imagine it like this for the first row:

[2,4,0,1,0]
 x x   x
 x x
   x
   x

which yields an area of 4, corresponding to the 2x2 rectangle in the upper-left corner of the matrix.

Or the overall correct solution, which is the largest rectangle for the third row:

[0,2,2,2,2]
   x x x x
   x x x x

The full solution in Python is below:

# largest_2d_subarray :: [[Int]] -> Int
def largest_2d_subarray(matrix):
 
    # we initialize the table to the same size as the matrix, containing all 0s
 
    max_height_table = [ [0] * len(matrix[0]) for i in range(len(matrix)) ]
 
    # we start preprocessing from the last row and work our way upwards
 
    for i in range(len(matrix) - 1, -1, -1):
 
        # special case for last row
        # we simply copy the last row of the matrix over to the height table
        # since the height can only be 0 or 1
        if i == len(matrix) - 1:
            # copy the row so that the height table does not alias the input matrix
            max_height_table[i] = list(matrix[i])
            continue
 
        for j, column in enumerate(matrix[i]):
            if column == 0:
                continue
            max_height_table[i][j] = max_height_table[i + 1][j] + 1
 
    # we can now feed the preprocessed table into our largest_rectangle function
 
    max_area = 0
 
    for i in range(len(matrix)):
        largest_subarray_area = largest_rectangle(max_height_table[i])
        max_area = max(max_area, largest_subarray_area)
 
    return max_area

As mentioned, the preprocessing takes O(mn) time. After that, for each row of the height table, it takes O(n) time to calculate the largest rectangle whose top edge is in contact with that row. For m rows, that takes O(mn) time. Thus, this algorithm runs in O(mn) time overall, which is linear in the size of the matrix.
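If you want to sanity-check a fast solution on small inputs, one option is to compare it against a slow reference that simply tries every rectangle. The sketch below is such a reference. The name largest_all_ones_rectangle_brute_force is made up for this example, and the matrix is assumed to be non-empty and rectangular.

```python
def largest_all_ones_rectangle_brute_force(matrix):
    """Very slow reference: test every axis-aligned rectangle in the matrix."""
    rows, cols = len(matrix), len(matrix[0])
    best = 0
    for top in range(rows):
        for left in range(cols):
            for bottom in range(top, rows):
                for right in range(left, cols):
                    # the rectangle counts only if every cell inside it is 1
                    all_ones = all(
                        matrix[i][j] == 1
                        for i in range(top, bottom + 1)
                        for j in range(left, right + 1)
                    )
                    if all_ones:
                        area = (bottom - top + 1) * (right - left + 1)
                        best = max(best, area)
    return best


matrix = [
    [1, 1, 0, 1, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 1],
    [0, 1, 1, 1, 1],
]
print(largest_all_ones_rectangle_brute_force(matrix))  # 8
```

This is nowhere near linear, so it is only useful as a cross-check on small matrices.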