Pandas DataFrame constructor sorts rows, even with OrderedDict as input

I create an OrderedDict:

from collections import OrderedDict

od = OrderedDict([((2, 9), 0.5218),
  ((2, 0), 0.3647),
  ((3, 15), 0.3640),
  ((3, 8), 0.3323),
  ((2, 28), 0.3310),
  ((2, 15), 0.3281),
  ((2, 10), 0.2938),
  ((3, 9), 0.2719)])

Then I feed that into the pandas DataFrame constructor:

import pandas as pd

df = pd.DataFrame({'values': od})

The result is this:

[Image: the resulting DataFrame, with its rows sorted by index]

Instead, I expected it to give this:

[Image: the expected DataFrame, with its rows in the insertion order of the OrderedDict]

What is going on here that I don't understand?

P.S.: I am not looking for an alternative way to solve the problem (though you are welcome to post one if you think it would help the community). All I want is to understand why this doesn't work. Is it a bug, or is there some logic to it? This is also not a duplicate of this link, because I am specifically using an OrderedDict and not a normal dict.

There is 1 answer

Answer by RichieV

If you want the DataFrame rows in the same order as your dictionary, you can pass the values as data and the keys as the index:

df = pd.DataFrame(od.values(), index=od.keys(), columns=['values'])

Output

      values
2 9   0.5218
  0   0.3647
3 15  0.3640
  8   0.3323
2 28  0.3310
  15  0.3281
  10  0.2938
3 9   0.2719

The only mention of OrderedDict in the pandas frame source code is in an example for df.to_dict(), so it gives no hint about how the constructor treats one.

It seems that even though you pass an ordered structure, it gets parsed and re-ordered by default once you wrap it in a regular dictionary, {'values': od}. In that case pandas derives the index from the keys of the OrderedDict instead of using them in the order given.
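
Whether the rows come back re-ordered may depend on the pandas version, so it can help to check directly. The sketch below is not part of the original answer: it builds the frame as in the question, compares the resulting index with the key order of the OrderedDict, and then restores the insertion order with reindex. The names expected_order, order_preserved and df_ordered are made up for this illustration.

```python
from collections import OrderedDict

import pandas as pd

od = OrderedDict([((2, 9), 0.5218),
                  ((2, 0), 0.3647),
                  ((3, 15), 0.3640),
                  ((3, 8), 0.3323),
                  ((2, 28), 0.3310),
                  ((2, 15), 0.3281),
                  ((2, 10), 0.2938),
                  ((3, 9), 0.2719)])

# Same construction as in the question.
df = pd.DataFrame({'values': od})

# Keys in the order they were inserted into the OrderedDict.
expected_order = list(od.keys())

# True if this pandas version kept the insertion order, False if it re-ordered.
order_preserved = list(df.index) == expected_order
print(pd.__version__, order_preserved)

# Reindexing on an explicitly built MultiIndex restores the insertion order
# in either case.
df_ordered = df.reindex(pd.MultiIndex.from_tuples(expected_order))
print(df_ordered)
```

The reindex step does not explain the behavior, but it makes the result independent of whatever ordering the constructor applies.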

The re-ordering does not seem to happen if you put the column labels inside the dictionary as well (à la JSON) and transpose the result:

od = OrderedDict([
    ((2, 9), {'values':0.5218}),
    ((2, 0), {'values':0.3647}),
    ((3, 15), {'values':0.3640}),
    ((3, 8), {'values':0.3323}),
    ((2, 28), {'values':0.3310}),
    ((2, 15), {'values':0.3281}),
    ((2, 10), {'values':0.2938}),
    ((3, 9), {'values':0.2719})
])
df = pd.DataFrame(od).T
print(df)

Output

      values
2 9   0.5218
  0   0.3647
3 15  0.3640
  8   0.3323
2 28  0.3310
  15  0.3281
  10  0.2938
3 9   0.2719
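
If the nested dictionary is not required, another route is to build a Series first and convert it to a one-column frame. This is a sketch rather than part of the original answer; it relies on the documented behavior that a Series built from a dict keeps the dict's key order on recent pandas and Python versions.

```python
from collections import OrderedDict

import pandas as pd

od = OrderedDict([((2, 9), 0.5218),
                  ((2, 0), 0.3647),
                  ((3, 15), 0.3640),
                  ((3, 8), 0.3323),
                  ((2, 28), 0.3310),
                  ((2, 15), 0.3281),
                  ((2, 10), 0.2938),
                  ((3, 9), 0.2719)])

# The tuple keys become a two-level MultiIndex; the Series name becomes
# the column label once the Series is converted to a DataFrame.
s = pd.Series(od, name='values')
df = s.to_frame()
print(df)
```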