Finding highest value/lowest value in local clusters of CSV data


The challenge I am currently working through is as follows:

I am looking to find patterns in CSV data in a list that is updated every minute. I need to detect specific patterns, such as the following.

Example (left to right): 100-125-150-120-135-160-180-145-165-185-200-190-180

The data went from 100 to 125 to 150. Because it then dropped to 120, I would need to log 150 as the highest value in that local cluster and archive that number for later. After 120 it went to 135, then 160, then 180 before falling to 145, so I would then log 180 as the highest for that cluster while retaining the 150 from earlier. If this process repeats, it would log 200 as well, because that is the highest number in its local cluster before the values fall again.

I'm unsure of the precise methods to use, so I have not started coding it yet.

The techniques I've found through searching include machine learning models, but it's my understanding that machine learning is not needed for such a simple process. Pure logic should be sufficient.

There is 1 answer

Waqar

Yes, you are right that machine learning would likely be overkill; for this pattern, a simple for loop with if statements is enough. Below is the pseudocode, followed by an implementation in Python (the logic is almost the same in any other language).

Pseudocode:

Initialize variables:
highs = empty list
current_value = None
potential_high = None

We iterate through each value in the input data, assuming all the values are contained in data (which can come from our CSV file). current_value holds the previous value, and potential_high holds the top of the current rise, or None when the values are not rising.

FOR each value in data:

If the value is larger than the previous one, the data is still rising, so the value becomes the new potential high.

    IF current_value is not None AND value > current_value:
        # Value is increasing, update the potential high
        Set potential_high = value

If the value is smaller than the previous one and a potential high is pending, the rise has just ended, so we log the potential high and clear it to start a new cluster.

    ELSE IF current_value is not None AND value < current_value AND potential_high is not None:
        # Value is decreasing, log the local high
        Append potential_high to the highs list
        Set potential_high = None

Finally, update the position for the next iteration.

    Set current_value = value

The Python implementation is below, using the data that you gave, [100, 125, 150, 120, 135, 160, 180, 145, 165, 185, 200, 190, 180], which would be separated by commas in the CSV.

highs = []
current_value = None    # previous value seen
potential_high = None   # top of the current rise, or None when not rising
data = [100, 125, 150, 120, 135, 160, 180, 145, 165, 185, 200, 190, 180]


for value in data:
    if current_value is not None:
        if value > current_value:
            # Still rising: this value is the new candidate high
            potential_high = value
        elif value < current_value and potential_high is not None:
            # The rise just ended: log the high and start a new cluster
            highs.append(potential_high)
            potential_high = None
    # Update current value for next iteration
    current_value = value


print("Local Highs:", highs)

OUTPUT: Local Highs: [150, 180, 200]
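
Since the title also asks about lowest values and the data comes from CSV, here is a minimal sketch of how both could be handled together. It assumes the CSV holds a single row of comma-separated numbers; find_local_extrema and csv_text are hypothetical names used only for illustration, and the CSV content is inlined as a string so the example runs on its own.

```python
import csv
import io


def find_local_extrema(values):
    """Return (highs, lows): points where a rise turns into a fall, and vice versa."""
    highs, lows = [], []
    previous = None
    direction = 0  # +1 while rising, -1 while falling, 0 before any change
    for value in values:
        # Equal neighbours (plateaus) are skipped and do not change direction
        if previous is not None and value != previous:
            new_direction = 1 if value > previous else -1
            if direction == 1 and new_direction == -1:
                highs.append(previous)  # rise ended: previous was a local high
            elif direction == -1 and new_direction == 1:
                lows.append(previous)   # fall ended: previous was a local low
            direction = new_direction
        previous = value
    return highs, lows


# Hypothetical CSV content (assumption: one row of comma-separated numbers)
csv_text = "100,125,150,120,135,160,180,145,165,185,200,190,180\n"
row = next(csv.reader(io.StringIO(csv_text)))
data = [float(cell) for cell in row]

highs, lows = find_local_extrema(data)
print("Local highs:", highs)  # Local highs: [150.0, 180.0, 200.0]
print("Local lows:", lows)    # Local lows: [120.0, 145.0]
```

Only the previous value and the current direction are carried between iterations, so the same logic could be applied one value at a time as new data arrives each minute. A trailing rise or fall that has not yet reversed is not logged.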

Hope this helps!