How to remove the innermost level of nesting in a list of lists of varying lengths


I'm trying to remove the innermost level of nesting in a list of lists of single-element lists. Is there a relatively easy way (converting to NumPy arrays is fine) to get from:

[[[1], [2], [3], [4], [5]], [[6], [7], [8]], [[11], [12]]]

to this?:

[[1, 2, 3, 4, 5], [6, 7, 8], [11, 12]]

Also, the real lists I'm trying to do this for contain datetime objects rather than the ints in the example, and the sublists in the initial collection will be of varying lengths.

Alternatively, it would be fine if the original list were padded with nans so that every sublist has the same length, as long as the nans aren't present in the output list, i.e. to get from this:

[[[1], [2], [3], [4], [5]], 
 [[6], [7], [8], [nan], [nan]], 
 [[11], [12], [nan], [nan], [nan]]]

to this:

[[1, 2, 3, 4, 5], [6, 7, 8], [11, 12]]

There are 7 answers

0
juanpa.arrivillaga On BEST ANSWER

If the nesting is always consistent, then this is trivial:

In [2]: import itertools

In [3]: nested = [ [ [1],[2],[3],[4], [5] ], [ [6],[7],[8] ] , [ [11],[12] ] ]

In [4]: unnested = [list(itertools.chain(*sub)) for sub in nested]

In [5]: unnested
Out[5]: [[1, 2, 3, 4, 5], [6, 7, 8], [11, 12]]

Note that the solutions that concatenate lists with add (or sum) give you O(n^2) performance, where n is the number of sub-sublists being merged within each sublist, because every concatenation copies the list built so far.
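To make the cost difference concrete, here is a minimal sketch that puts the linear and the quadratic variants side by side. The variable names are illustrative only, and the comments describe how each approach builds its result rather than measured timings.

```python
from functools import reduce
from itertools import chain
from operator import add

nested = [[[1], [2], [3], [4], [5]], [[6], [7], [8]], [[11], [12]]]

# Linear per sublist: chain walks each one-element list exactly once.
linear = [list(chain.from_iterable(sub)) for sub in nested]

# Quadratic per sublist: each add (and each step of sum) creates a new list
# that copies everything accumulated so far.
via_reduce = [reduce(add, sub) for sub in nested]
via_sum = [sum(sub, []) for sub in nested]

print(linear)
print(linear == via_reduce == via_sum)
```

All three produce the same nested list; the difference only matters once the sublists get long.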

0
piRSquared On

Because this question looks fun!
I used a recursive function that unpacks a list if it only has one value.

def make_singular(l):
    try:
        if len(l) == 1:
            return l[0]
        else:
            return [make_singular(l_) for l_ in l]
    except TypeError:  # l has no len(), so it is a leaf value (int, datetime, ...)
        return l

nest = [ [ [1],[2],[3],[4], [5] ], [ [6],[7],[8] ] , [ [11],[12] ] ]
make_singular(nest)

[[1, 2, 3, 4, 5], [6, 7, 8], [11, 12]]
0
Yevhen Kuzmovych On

Try this:

l = [ [ [1],[2],[3],[4],[5] ],
      [ [6],[7],[8], [None],[None]] ,
      [ [11],[12],[None],[None],[None]] ]

l = [ [x[0] for x in s if x[0] is not None] for s in l]
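Since the question mentions datetime objects, here is a small sketch of the same comprehension applied to them. The dates are made up for illustration, and None is assumed as the padding value.

```python
from datetime import datetime

# Hypothetical data: one-element lists of datetimes, padded with [None].
padded = [
    [[datetime(2017, 1, 1)], [datetime(2017, 1, 2)], [datetime(2017, 1, 3)]],
    [[datetime(2017, 2, 1)], [None], [None]],
]

# Unwrap each one-element list and skip the padding in the same pass.
cleaned = [[x[0] for x in s if x[0] is not None] for s in padded]

print([len(s) for s in cleaned])
```

The identity test against None works for any element type, so nothing in the comprehension is specific to ints.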
0
kmaork On
>>> from functools import reduce  # reduce is not a builtin in Python 3
>>> from operator import add
>>> lists = [ [ [1],[2],[3],[4], [5] ],   [ [6],[7],[8] ] , [ [11],[12] ] ]
>>> [reduce(add, lst) for lst in lists]
[[1, 2, 3, 4, 5], [6, 7, 8], [11, 12]]

This is not very efficient, as it builds a new list each time add is called. Alternatively, you can use sum or a simple list comprehension, as seen in the other answers.

2
crackpotHouseplant On

How about np.squeeze?

Remove single-dimensional entries from the shape of an array.

>>> import numpy as np
>>> arr = [ [ [1],[2],[3],[4], [5] ], [ [6],[7],[8] ] , [ [11],[12] ] ]
>>> arr
[[[1], [2], [3], [4], [5]], [[6], [7], [8]], [[11], [12]]]
>>> [np.squeeze(i) for i in arr]
[array([1, 2, 3, 4, 5]), array([6, 7, 8]), array([11, 12])]

Note that np.squeeze removes every length-1 dimension, not necessarily just the innermost one. But your question specifies a "list of lists".
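One case where that difference shows up is a sublist that holds a single element. The sketch below uses made-up input to compare a plain squeeze with one restricted to the last axis via the axis argument. Converting back with tolist is an extra step added here, for when plain lists are wanted.

```python
import numpy as np

arr = [[[1], [2], [3]], [[11]]]

# Plain squeeze drops every length-1 axis, so [[11]] collapses to a 0-d array.
squeezed_all = [np.squeeze(np.asarray(i)) for i in arr]

# Restricting squeeze to the last axis keeps every result 1-D.
squeezed_last = [np.squeeze(np.asarray(i), axis=-1) for i in arr]

# Back to plain Python lists.
as_lists = [a.tolist() for a in squeezed_last]

print([a.shape for a in squeezed_all])
print([a.shape for a in squeezed_last])
print(as_lists)
```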

0
Moinuddin Quadri On

In your case, each innermost list has just one element, so you can access the value by index instead of using an additional function. For example:

>>> [[y[0] for y in x] for x in my_list]
[[1, 2, 3, 4, 5], [6, 7, 8], [11, 12]]

If there is a possibility that an innermost list has more than one element, you can flatten it instead:

>>> [[z for y in x for z in y] for x in my_list]
[[1, 2, 3, 4, 5], [6, 7, 8], [11, 12]]
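The two comprehensions only agree when every innermost list has exactly one element. A short sketch with made-up input where they differ:

```python
my_list = [[[1, 2], [3]], [[4], [5, 6, 7]]]

# Indexing keeps only the first element of each innermost list.
first_only = [[y[0] for y in x] for x in my_list]

# The nested loop keeps every element of each innermost list.
flattened = [[z for y in x for z in y] for x in my_list]

print(first_only)  # [[1, 3], [4, 5]]
print(flattened)   # [[1, 2, 3], [4, 5, 6, 7]]
```

The indexing version also raises IndexError on an empty innermost list, whereas the flattening version simply skips it.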
0
hpaulj On

If you know the level of nesting then one of the list comprehensions is easy.

In [129]: ll=[ [ [1],[2],[3],[4], [5] ], [ [6],[7],[8] ] , [ [11],[12] ] ]
In [130]: [[j[0] for j in i] for i in ll]        # simplest
Out[130]: [[1, 2, 3, 4, 5], [6, 7, 8], [11, 12]]

If the criterion is just to remove an inner layer of nesting, regardless of how deep it occurs, the code will require more thought. I'd probably try to write it as a recursive function.
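As a rough sketch of what such a depth-independent function could look like: the helper below (drop_innermost is a hypothetical name, not from any library) merges a list into its parent wherever all of the parent's children are lists that contain no further lists. It assumes the data is built from plain Python lists only.

```python
def drop_innermost(obj):
    """Merge lists of leaf values into their parent list, at any depth."""
    if not isinstance(obj, list):
        return obj  # leaf value (int, datetime, ...)
    # True when every child is a list that itself contains no lists.
    children_are_leaf_lists = bool(obj) and all(
        isinstance(child, list) and not any(isinstance(v, list) for v in child)
        for child in obj
    )
    if children_are_leaf_lists:
        return [v for child in obj for v in child]
    return [drop_innermost(child) for child in obj]


print(drop_innermost([[[1], [2], [3], [4], [5]], [[6], [7], [8]], [[11], [12]]]))
print(drop_innermost([[[[1], [2]]], [[3], [4]]]))
```

Unlike a length-1 test, this decides by structure, so it handles branches of different depths, but it also merges innermost lists that hold more than one element.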

The np.nan (or None) padding doesn't help with the list version; the padding just has to be filtered out again:

In [131]: lln=[ [ [1],[2],[3],[4],[5] ], [ [6],[7],[8],[np.nan],[np.nan]] , [ [11],[12],[np.nan],[np.nan],[np.nan] ] ]
In [132]: [[j[0] for j in i if j[0] is not np.nan] for i in lln]
Out[132]: [[1, 2, 3, 4, 5], [6, 7, 8], [11, 12]]
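One caveat about the identity test above: it only filters padding that is the very same nan object. A NaN that came from elsewhere, for example a computation or float("nan"), is a different object and slips through. A minimal sketch, where is_nan is a hypothetical helper and the data is made up:

```python
import math

a_nan = float("nan")
another_nan = float("nan")  # a distinct object, as a computed NaN would be

lln = [[[1], [2]], [[6], [a_nan]], [[11], [another_nan]]]


def is_nan(value):
    # Only floats can be NaN; this leaves ints, datetimes, etc. untouched.
    return isinstance(value, float) and math.isnan(value)


by_identity = [[j[0] for j in i if j[0] is not a_nan] for i in lln]
by_value = [[j[0] for j in i if not is_nan(j[0])] for i in lln]

print(len(by_identity[2]))  # 2: the second NaN object was not filtered out
print(by_value)             # [[1, 2], [6], [11]]
```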

The padding does let us make a 3d array, which can then easily be squeezed:

In [135]: arr = np.array(lln)
In [136]: arr.shape
Out[136]: (3, 5, 1)
In [137]: arr = arr[:,:,0]
In [138]: arr
Out[138]: 
array([[  1.,   2.,   3.,   4.,   5.],
       [  6.,   7.,   8.,  nan,  nan],
       [ 11.,  12.,  nan,  nan,  nan]])

but then there's a question of how to remove those nan and create ragged sublists.

Masked arrays might let you work with a 2d array without being bothered by these nan:

In [141]: M = np.ma.masked_invalid(arr)
In [142]: M
Out[142]: 
masked_array(data =
 [[1.0 2.0 3.0 4.0 5.0]
 [6.0 7.0 8.0 -- --]
 [11.0 12.0 -- -- --]],
             mask =
 [[False False False False False]
 [False False False  True  True]
 [False False  True  True  True]],
       fill_value = 1e+20)
In [144]: M.sum(axis=1)      # e.g. sublist sums
Out[144]: 
masked_array(data = [15.0 21.0 23.0],
             mask = [False False False],
       fill_value = 1e+20)

Removing the nan from arr is probably easiest with a list comprehension. The values are float because np.nan is float.

In [153]: [[i for i in row if ~np.isnan(i)] for row in arr]
Out[153]: [[1.0, 2.0, 3.0, 4.0, 5.0], [6.0, 7.0, 8.0], [11.0, 12.0]]

So the padding doesn't help.

If the padding were None instead, the array would have object dtype, which is closer to a nested list in character.

In [163]: lln
Out[163]: 
[[[1], [2], [3], [4], [5]],
 [[6], [7], [8], [None], [None]],
 [[11], [12], [None], [None], [None]]]
In [164]: arr=np.array(lln)[:,:,0]
In [165]: arr
Out[165]: 
array([[1, 2, 3, 4, 5],
       [6, 7, 8, None, None],
       [11, 12, None, None, None]], dtype=object)
In [166]: [[i for i in row if i is not None] for row in arr]
Out[166]: [[1, 2, 3, 4, 5], [6, 7, 8], [11, 12]]

Another array approach is to count the number of valid elements at the 2nd level; flatten the whole thing, and then split.
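A minimal sketch of that count, flatten and split idea, assuming nan padding and numeric values (it would need a different validity test for None padding or datetime objects):

```python
import numpy as np

nan = np.nan
lln = [[[1], [2], [3], [4], [5]],
       [[6], [7], [8], [nan], [nan]],
       [[11], [12], [nan], [nan], [nan]]]

arr = np.array(lln)[:, :, 0]      # shape (3, 5), float dtype
valid = ~np.isnan(arr)            # True where a real value sits
counts = valid.sum(axis=1)        # valid elements per row: 5, 3, 2
flat = arr[valid]                 # 1-D in row-major order, padding dropped

# Split the flat array at the cumulative row boundaries.
pieces = np.split(flat, np.cumsum(counts)[:-1])
ragged = [p.tolist() for p in pieces]

print(ragged)
```

The values come back as floats because the nan padding forces a float array.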

A recursive function:

def foo(alist):
    if len(alist)==1:
        return alist[0]
    else:
        return [r for r in map(foo, alist) if r is not None]  # recurse once per item

In [200]: ll=[ [ [1],[2],[3],[4], [5] ], [ [6],[7],[8] ] , [11], [[[12],[13]]]] 
In [201]: foo(ll)
Out[201]: [[1, 2, 3, 4, 5], [6, 7, 8], 11, [[12], [13]]]
In [202]: lln=[ [ [1],[2],[3],[4],[5] ], [ [6],[7],[8],[None],[None]] , [ [11],[12],[None],[None],[None] ] ]
In [203]: foo(lln)
Out[203]: [[1, 2, 3, 4, 5], [6, 7, 8], [11, 12]]

It recurses down to the level where lists have length 1. It is still fragile, and misbehaves if the nesting levels vary. Conceptually it's quite similar to @piRSquared's answer.