k-th Largest Number from a List of Digits

November 12, 2019

Given a list of digits (pos­si­bly con­tain­ing dupli­cates) from 0-9, and a number k, find the k-th largest number that can be formed from these digits. Assume that k-th largest number can be formed.

When han­dling 0s, treat lead­ing zeroes as ignored. E.g. for [0,1] and k = 1, the answer is 10, and for k = 2, the answer is 1.

My idea was to think of it as a deci­sion tree. Each node in the tree rep­re­sents a choice of pick­ing a par­tic­u­lar unique digit. Each node also has an ordered set of chil­dren, which rep­re­sents the remain­ing unique digits in the list, ordered from small­est to largest. Then, the leaves of this tree, when viewed from left to right, rep­re­sent the pos­si­ble num­bers, from small­est to largest. We could tra­verse this tree and find the cor­rect number.

We can make use of some basic math to prune the tree. What we know is that for a node, we can count the number of leaves of the sub­tree rooted at that node. The number of leaves is equiv­a­lent to the number of per­mu­ta­tions of the remain­ing digits at that node. For exam­ple, for the remain­ing digits 1,1,2, we know that the number of per­mu­ta­tions is given by:

factorial(number of digits) divided by (product(factorial(cardinality of each number)))

So, in this case, the number is 3! divid­ed by ((2 = number of 1s)! * (1 = number of 2s)!) = 6 / 2 = 3. This cor­re­sponds to 112, 121, and, 211.

So, for a given k and at a given node, we can cal­cu­late the size of the sub­trees of each child from right to left, and skip over the chil­dren until we find the cor­rect child that k falls in. We then recurse until we reach the end of the tree (given by the base case of len(digits) == 1).

No spe­cial han­dling is needed for lead­ing zeroes. We simple pre­serve the result as strings until we are ready to con­vert the string into an inte­ger, at the top of the recur­sion. Python will han­dling lead­ing zeroes cor­rect­ly for us. This is why I have a sep­a­rate nth driver func­tion that is used for recur­sion.

It’s not the most effi­cient imple­men­ta­tion ever, but I feel it’s more con­ve­nient and read­able. Time com­plex­i­ty is dom­i­nat­ed by the recur­sion and the pro­cess­ing at each level, which involves sort­ing and cal­cu­lat­ing the per­mu­ta­tions. At each of the n levels of recur­sion, we need to sort, and the worst case is that we need to cal­cu­late the per­mu­ta­tions for every single child (this patho­log­i­cal case hap­pens when we are look­ing for the small­est number), so the time com­plex­i­ty at each level is O(nlogn + n2). The over­all com­plex­i­ty is thus O(n2logn + n3).

from typing import List
from collections import Counter
from math import factorial
from functools import reduce
from operator import mul

def permutations(c: Counter) -> int:
    Given a list of digits, return the number of possible permutations.
    values = c.values()
    return factorial(sum(values)) / (reduce(mul, [factorial(x) for x in values]))

def nth(digits: List[int], k: int) -> str:
    # base case
    if len(digits) == 1:
        return str(digits[0])
    # get a list of unique digits, sorted in descending order
    unique_numbers = sorted(list(set(digits)), reverse=True)
    c = Counter(digits)
    # this pointer refers to the current digit we are taking out
    p = -1
    while True:
        p += 1
        # this has the effect of "taking out" a digit from the list of digits
        c[unique_numbers[p]] -= 1
        # check how many permutations the rest of the digits have
        current_permutations = permutations(c)
        if current_permutations >= k:
        # iterate until we reach a "bucket" where the kth number falls into
            k -= current_permutations
            # put the current digit back
            c[unique_numbers[p]] += 1
    # once identified, recurse into the bucket
    rest = nth(list(c.elements()), k)
    return str(unique_numbers[p]) + rest

def nth_largest_number(digits: List[int], k: int) -> int:
    return int(nth(digits, k))

Here are some test cases that I’ve tested:

class KthLargestNumber(unittest.TestCase):
    def nth(self, x, y, z):
        self.assertEqual(nth_largest_number(x,y), z)

    def test_sanity(self):
        self.nth([1,2], 2, 12)
        self.nth([1,2], 1, 21)
        self.nth([1,2,3], 1, 321)
        self.nth([1,2,3], 2, 312)
        self.nth([1,2,3], 6, 123)
        self.nth([1,2,3,4], 18, 2134)
        self.nth([1,2,3,4], 24, 1234)
        self.nth([0,1], 2, 1)
        self.nth([0,1], 1, 10)
        self.nth([0,0,1], 1, 100)
        self.nth([0,0,1], 3, 1)
        self.nth([0,0,1], 2, 10)
        self.nth([0,0,0], 1, 0)
        self.nth([0], 1, 0)