# Longest Consecutive Sequence

October 16, 2019

Given an array of inte­gers, find the length of the longest sequence that con­tains con­sec­u­tive inte­gers.

The array is unsort­ed, and may con­tain dupli­cates and neg­a­tive inte­gers.

For exam­ple: the array `[100, 5, 1001, 4, 7, 6]` should return the answer 4, since the longest sequence is `[4, 5, 6, 7]`.

This is an exam­ple of a book­keep­ing prob­lem - there isn’t any­thing spe­cial to do except to keep track of the posi­tions of cur­rent sequences.

Con­sid­er the ele­ments as laid out on the inte­ger number line. As we con­sid­er each ele­ment from left to right, imag­ine mark­ing each ele­men­t’s posi­tion on the line. Note that we can ignore dupli­cates with­out loss of gen­er­al­i­ty, since dupli­cates do not con­tribute to the final answer.

There are four cases to con­sid­er. When mark­ing the cur­rent ele­ment on the number line:

1. nei­ther its left or right neigh­bour has been marked.
2. its right neigh­bour has been marked, but not its left neigh­bour.
3. its left neigh­bour has been marked, but not its right neigh­bour.
4. both neigh­bours have been marked.

In cases 2 and 3, the cur­rent ele­ment extends an exist­ing sequence by 1. In case 4, the cur­rent ele­ment con­nects two exist­ing sequences togeth­er.

``````Case 2: considering 2, while 3 and 4 have been marked previously
* ~ ~
1 2 3 4

Case 3: considering 3, while 1 and 2 have been marked previously
~ ~ *
1 2 3 4

Case 4: considering 3, while 1, 2, and 4 have been marked previously
~ ~ * ~
1 2 3 4``````

We main­tain a map whose keys are the inte­gers to be con­sid­ered, and a 2-tuple con­tain­ing the start­ing and ending inte­gers of a sequence. Since we are only con­sid­er­ing con­sec­u­tive sequences, we can infer the size of a sequence by its start and end.

``````def longest_consecutive_sequence(nums):
dic = {}
output = 0
for i in range(len(nums)):
curr = nums[i]
# ignore duplicates
if curr in dic:
continue
# case 1
if curr - 1 not in dic and curr + 1 not in dic:
dic[curr] = [curr, curr]
# case 2
elif curr - 1 not in dic:
right = dic[curr + 1]
dic[curr] = [curr, right]
right_edge = dic[right]
right_edge = curr
# case 3
elif curr + 1 not in dic:
left = dic[curr - 1]
dic[curr] = [left, curr]
left_edge = dic[left]
left_edge = curr
# case 4
else:
left = dic[curr - 1]
right = dic[curr + 1]
dic[curr] = [left, right]

left_edge = dic[left]
# we update the leftmost entry's ending integer
# to be the right neighbour's ending integer
left_edge = dic[curr]

right_edge = dic[right]
# we update the rightmost entry's starting integer
# to be the left neighbour's starting integer
right_edge = dic[curr]

# update the length of the longest sequence
# the length is just the end - start + 1, since it's consecutive
output = max(output, dic[curr] - dic[curr] + 1)
return output``````

`O(n)` in both time and space.

Alter­na­tive­ly, one can also solve it by sort­ing the array and main­tain­ing a con­stant number of vari­ables to keep track of the longest sequence seen so far in one pass from left to right. That will be `O(nlogn)` in time and `O(1)` space.