k-th Largest Number from a List of Digits

November 12, 2019

Given a list of digits (possibly containing duplicates) from 0-9, and a number k, find the k-th largest number that can be formed from these digits. Assume that k-th largest number can be formed.

When handling 0s, treat leading zeroes as ignored. E.g. for [0,1] and k = 1, the answer is 10, and for k = 2, the answer is 1.

My idea was to think of it as a decision tree. Each node in the tree represents a choice of picking a particular unique digit. Each node also has an ordered set of children, which represents the remaining unique digits in the list, ordered from smallest to largest. Then, the leaves of this tree, when viewed from left to right, represent the possible numbers, from smallest to largest. We could traverse this tree and find the correct number.

We can make use of some basic math to prune the tree. What we know is that for a node, we can count the number of leaves of the subtree rooted at that node. The number of leaves is equivalent to the number of permutations of the remaining digits at that node. For example, for the remaining digits 1,1,2, we know that the number of permutations is given by:

factorial(number of digits) divided by (product(factorial(cardinality of each number)))

So, in this case, the number is 3! divided by ((2 = number of 1s)! * (1 = number of 2s)!) = 6 / 2 = 3. This corresponds to 112, 121, and, 211.

So, for a given k and at a given node, we can calculate the size of the subtrees of each child from right to left, and skip over the children until we find the correct child that k falls in. We then recurse until we reach the end of the tree (given by the base case of len(digits) == 1).

No special handling is needed for leading zeroes. We simple preserve the result as strings until we are ready to convert the string into an integer, at the top of the recursion. Python will handling leading zeroes correctly for us. This is why I have a separate nth driver function that is used for recursion.

It’s not the most efficient implementation ever, but I feel it’s more convenient and readable. Time complexity is dominated by the recursion and the processing at each level, which involves sorting and calculating the permutations. At each of the n levels of recursion, we need to sort, and the worst case is that we need to calculate the permutations for every single child (this pathological case happens when we are looking for the smallest number), so the time complexity at each level is O(nlogn + n^2). The overall complexity is thus O(n^2logn + n^3).

from typing import List
from collections import Counter
from math import factorial
from functools import reduce
from operator import mul


def permutations(c: Counter) -> int:
    """
    Given a list of digits, return the number of possible permutations.
    """
    values = c.values()
    return factorial(sum(values)) / (reduce(mul, [factorial(x) for x in values]))


def nth(digits: List[int], k: int) -> str:
    # base case
    if len(digits) == 1:
        return str(digits[0])
    # get a list of unique digits, sorted in descending order
    unique_numbers = sorted(list(set(digits)), reverse=True)
    c = Counter(digits)
    # this pointer refers to the current digit we are taking out
    p = -1
    while True:
        p += 1
        # this has the effect of "taking out" a digit from the list of digits
        c[unique_numbers[p]] -= 1
        # check how many permutations the rest of the digits have
        current_permutations = permutations(c)
        if current_permutations >= k:
            break
        # iterate until we reach a "bucket" where the kth number falls into
        else:
            k -= current_permutations
            # put the current digit back
            c[unique_numbers[p]] += 1
    # once identified, recurse into the bucket
    rest = nth(list(c.elements()), k)
    return str(unique_numbers[p]) + rest


def nth_largest_number(digits: List[int], k: int) -> int:
    return int(nth(digits, k))

Here are some test cases that I’ve tested:

class KthLargestNumber(unittest.TestCase):
    def nth(self, x, y, z):
        self.assertEqual(nth_largest_number(x,y), z)

    def test_sanity(self):
        self.nth([1,2], 2, 12)
        self.nth([1,2], 1, 21)
        self.nth([1,2,3], 1, 321)
        self.nth([1,2,3], 2, 312)
        self.nth([1,2,3], 6, 123)
        self.nth([1,2,3,4], 18, 2134)
        self.nth([1,2,3,4], 24, 1234)
        self.nth([0,1], 2, 1)
        self.nth([0,1], 1, 10)
        self.nth([0,0,1], 1, 100)
        self.nth([0,0,1], 3, 1)
        self.nth([0,0,1], 2, 10)
        self.nth([0,0,0], 1, 0)
        self.nth([0], 1, 0)