How to do a binary search for a range of the same value?

3k views Asked by At

I have a sorted list of numbers and I need to get it return the range of index that the number appears. My list is:

daysSick = [0, 0, 0, 0, 1, 2, 3, 3, 3, 4, 5, 5, 5, 6, 6, 11, 15, 24]

If I searched 0, I need to return (0, 3). Right now I can only get it to find the spot of one number! I know how to do the binary search, but I am stuck how to make it move up and down from the position to find the other same values!

low = 0
high = len(daysSick) - 1
while low <= high :
    mid = (low + high) // 2
    if value < daysSick[mid]:
        high = mid - 1
    elif value > list[mid]:
        low = mid + 1
    else:
        return mid
4

There are 4 answers

5
Alexander McFarlane On BEST ANSWER

I present a solution faster than the raw functions taken from the bisect library

Solution

With Optimised Binary Search

def search(a, x):
    right = 0
    h = len(a)
    while right < h:
        m = (right+h)//2
        if x < a[m]: h = m
        else: 
            right = m+1
    # start binary search for left element only 
    # including elements from 0 to right-1 - much faster!
    left = 0
    h = right - 1
    while left < h:
        m = (left+h)//2
        if x > a[m]: left = m+1
        else: 
            h = m
    return left, right-1

search(daysSick, 5)
(10, 12)

search(daysSick, 2)
(5, 5)

Comparision vs. Bisect

  • Using customised binary search...

    %timeit search(daysSick, 3)
    1000000 loops, best of 3: 1.23 µs per loop
    
  • Copying the raw code from the source from bisect into python...

    %timeit bisect_left(daysSick, 1), bisect_right(daysSick, 1)
    1000000 loops, best of 3: 1.77 µs per loop
    
  • Using default import is by far the fastest as I think it might be optimised behind the scenes ...

    from bisect import bisect_left, bisect_right
    %timeit bisect_left(daysSick, 1), bisect_right(daysSick, 1)
    1000000 loops, best of 3: 504 ns per loop
    

Extra

Without ext. libraries but not binary search

daysSick = [0, 0, 0, 0, 1, 2, 3, 3, 3, 4, 5, 5, 5, 6, 6, 11, 15, 24]

# using a function
idxL = lambda val, lst:  [i for i,d in enumerate(lst) if d==val]

allVals = idxL(0,daysSick)
(0, 3)
4
wim On

You may use Python's bisection routines from the stdlib:

>>> daysSick = [0, 0, 0, 0, 1, 2, 3, 3, 3, 4, 5, 5, 5, 6, 6, 11, 15, 24]
>>> from bisect import bisect_left, bisect_right
>>> bisect_left(daysSick, 3)
6
>>> bisect_right(daysSick, 3)
9
>>> daysSick[6:9]
[3, 3, 3]
0
Shashank On

Ok, here's another way that works by attempting to reduce the range first before doing bisect_left and bisect_right on half of the already-reduced range. I wrote this code because I think it is slightly more efficient than just calling bisect_left and bisect_right even though it has the same time complexity.

def binary_range_search(s, x):
    # First we will reduce the low..high range if possible
    # by using symmetric binary search to find an index pointing to x
    low, high = 0, len(s)
    while True:
        if low >= high:
            return None
        mid = (low + high) // 2
        mid_element = s[mid]
        if x == mid_element:
            break
        elif x < mid_element:
            high = mid
        else:
            low = mid + 1
    xindex = mid

    # Now we have found an index pointing to x called xindex
    # and potentially reduced the low..high range
    # now we can run bisect_left on low..xindex + 1

    lo, hi = low, xindex + 1
    while lo < hi:
        mid = (lo+hi)//2
        if x > s[mid]: lo = mid+1
        else: hi = mid
    first = lo

    # and also bisect_right on xindex..high

    lo, hi = xindex, high
    while lo < hi:
        mid = (lo+hi)//2
        if x < s[mid]: hi = mid
        else: lo = mid+1
    last = lo - 1

    return first, last

I think the time complexity is O(log n) just like the trivial solution, but I believe this is a bit more efficient regardless. I think it's worth noting that the second part where you do bisect_left and bisect_right can be parallelized for large data sets since they are independent operations that do not interact.

0
nirmal kumar ravi On
  • Find floor : index of number < key
  • Find ceiling : index of number > key
  • [floor + 1 , ceiling - 1] gives you the range.
# yields next highest or lowest number to the key
# isLessThan determines which way the pointer moves

def nextNumber(arr, key, isLessThan):
  lo, hi = 0, len(arr)-1
  
  while lo <= hi:
    mid = lo + (hi - lo) // 2

    if isLessThan(key, arr[mid]):
      hi = mid - 1
    else:
      lo = mid + 1
  
  return (lo, hi)

def ceiling(arr, key):
  lo,_ = nextNumber(arr, key, lambda x,y : x < y)
  return lo

def floor(arr, key):
  _,hi = nextNumber(arr, key, lambda x,y : x <= y)
  return hi


def find_range(arr, key):
  fl = floor(arr,key)

  # key not in array
  if fl+1 >= len(arr) or arr[fl+1] != key:
    return [-1,-1]
  
  cl = ceiling(arr,key)
  return [fl+1 , cl-1]