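#!/usr/bin/env python

# Three parallel columns: values, first indexes, last indexes (-1 = not set yet).
listofAppearances = [[], [], []]


def extractIntervals(new_trace):
    # Forward pass: record each value the first time it is seen.
    for i in range(0,len(new_trace)-1,1):
        if new_trace[i] in listofAppearances[0]:
            continue
        listofAppearances[0].append(new_trace[i])
        listofAppearances[1].append(i)
        listofAppearances[2].append(-1)
    print(listofAppearances)

    # Backward pass: set the last index of every value still marked -1.
    for j in range(len(new_trace)-1,0,-1):
        for k in range(0,len(listofAppearances[0])-1,1):
            if (new_trace[j]==listofAppearances[0][k]) and (listofAppearances[2][k]==-1):
                listofAppearances[2][k] = j
    print(listofAppearances)


def main():
    new_trace = []  # the example trace behind the output below is not shown
    extractIntervals(new_trace)

if __name__ == "__main__":
    main()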

In the code above I am trying to extract intervals of appearance (delimited by the first and last appearance indexes of every number in the list). I parse the list once, and if a number does not yet exist in listofAppearances, I append it to the first column, append its index to the second column, and set the third column to -1.

I then parse the list again in reverse: every element is looked up in listofAppearances, and the corresponding third-column entry is changed to the current index if it is still set to -1.

This works, but the first iteration of the backward pass has an issue that I can't figure out. The result I get with this example list is:

[[1, 2, 3, 4, 5, 6, 7], [0, 1, 3, 6, 13, 14, 16], [-1, -1, -1, -1, -1, -1, -1]]
[[1, 2, 3, 4, 5, 6, 7], [0, 1, 3, 6, 13, 14, 16], [9, 8, 12, 27, 37, 36, -1]]

As you can see, the last element of the third list is still set to -1, and I don't understand why! I have inspected every inch of the code and I can't see why it behaves this way.

2 Answers

Enzo On

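Just change

for k in range(0, len(listofAppearances[0])-1, 1):

to

for k in range(0, len(listofAppearances[0]), 1):

in the inner loop of the backward pass. range() already excludes its stop value, so the extra -1 makes the loop skip the last entry of listofAppearances[0]. That is why the third-column entry for 7 stays at -1.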
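As a quick check, here are the indexes each version of the range visits. The three-element list is made up for illustration and only stands in for listofAppearances[0]:

```python
values = [1, 2, 3]  # made-up stand-in for listofAppearances[0]

# range() stops before its stop value, so len(values) - 1 never reaches the last index.
buggy = list(range(0, len(values) - 1, 1))
fixed = list(range(0, len(values), 1))

print(buggy)  # [0, 1]
print(fixed)  # [0, 1, 2]
```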

Edit: you can get the same result by:

def extractIntervals(new_trace):
    # distinct elements of new_trace (a set does not guarantee any particular order)
    values = list(set(new_trace))

    # index of the first occurrence in new_trace of each distinct element
    firsts = [new_trace.index(v) for v in values]

    # index of the last occurrence in new_trace of each distinct element,
    # found by searching the reversed list and mapping the position back
    reversed_trace = new_trace[::-1]
    lasts = [len(new_trace) - 1 - reversed_trace.index(v) for v in values]

    return [values, firsts, lasts]
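Note that each index() call rescans the list, so this can get slow for long traces with many distinct values. A single-pass alternative could look like the sketch below. The function name extract_intervals_single_pass and the sample trace are my own, chosen only for illustration, and it assumes the elements are hashable:

```python
def extract_intervals_single_pass(new_trace):
    """Return [values, first_indexes, last_indexes], ordered by first appearance."""
    first = {}  # value -> index of its first occurrence
    last = {}   # value -> index of its last occurrence
    for i, value in enumerate(new_trace):
        first.setdefault(value, i)  # stored only the first time the value is seen
        last[value] = i             # overwritten every time the value is seen
    values = list(first)            # dicts keep insertion order (Python 3.7+)
    return [values, [first[v] for v in values], [last[v] for v in values]]


print(extract_intervals_single_pass([1, 2, 1, 3, 2, 3, 3]))
# [[1, 2, 3], [0, 1, 3], [2, 4, 6]]
```

Unlike the set-based version, this keeps the values in the order they first appear, which matches the layout produced by the original two-pass loop.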

chepner On

Might I suggest processing a stream of values? First define a few helper functions, then use them to group each element with the positions at which it occurs.

from itertools import groupby
from operator import itemgetter

second = itemgetter(1)
first_and_last = itemgetter(0, -1)

def sort_and_group(seq, k):
    return groupby(sorted(seq, key=k), k)

def extract_intervals(new_trace):
    tmp1 = sort_and_group(enumerate(new_trace), second)
    tmp2 = [(val, *first_and_last([x for x,_ in positions])) for val, positions in tmp1]
    return zip(*tmp2)



tmp1 pairs each distinct element with the group of (position, element) pairs in which it occurs. Because sorted is stable, the positions within each group stay in ascending order.

tmp2 is a list of triples, consisting of a list element and the first and last position at which it occurs.

The call to zip "unzips" the list of triples into three tuples: the elements, the first positions, and the last positions.
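For instance, with a short made-up trace (the definitions are repeated so that the snippet runs on its own; the tuple unpacking inside tmp2 needs Python 3.5 or later):

```python
from itertools import groupby
from operator import itemgetter

second = itemgetter(1)
first_and_last = itemgetter(0, -1)

def sort_and_group(seq, k):
    # groupby only merges adjacent items, so sort by the same key first
    return groupby(sorted(seq, key=k), k)

def extract_intervals(new_trace):
    tmp1 = sort_and_group(enumerate(new_trace), second)
    tmp2 = [(val, *first_and_last([x for x, _ in positions])) for val, positions in tmp1]
    return zip(*tmp2)

# made-up sample trace, not the one from the question
values, firsts, lasts = extract_intervals([1, 2, 1, 3, 2, 3, 3])
print(values)  # (1, 2, 3)
print(firsts)  # (0, 1, 3)
print(lasts)   # (2, 4, 6)
```

One caveat: for an empty trace, zip has nothing to produce, so unpacking into three names as above would fail.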