I need to make a pandas DataFrame that has a column filled with hyphenated numbers. The only way I could think of to do this was to use strings. This all worked fine, until I needed to sort them to get them back into order after a regrouping. The problem is that strings sort like this:
['100-200','1000-1100','1100-1200','200-300']
This is clearly not how I want it sorted. I want it sorted numerically. How would I get this to work? I am willing to change anything. Keeping the hyphenated string as an integer or float would be the best, but I am unsure how to do that.
You could use `sorted` to construct a new ordering for the index, and then perform the sort (reordering) using `df.take`.
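The original code for this step did not survive, so here is a minimal sketch of the idea rather than the answer's exact code. The sample data, the column name `range`, and the helper `range_key` are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data; the column names are assumptions.
df = pd.DataFrame({
    "range": ["100-200", "1000-1100", "1100-1200", "200-300"],
    "value": [1, 2, 3, 4],
})

def range_key(text):
    """Parse '100-200' into the tuple (100, 200) so it compares numerically."""
    return tuple(int(part) for part in text.split("-"))

# Sort the row positions; the strings are only consulted through the key.
order = sorted(range(len(df)), key=lambda i: range_key(df["range"].iloc[i]))

# Reorder the rows by position.
result = df.take(order)
print(result)
```

The column itself stays a string here; only the ordering is computed from the parsed numbers.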
This is similar to @275365's solution, but note that the sorting is done on `range(len(df))`, not on the strings. The strings are only used in the `key` parameter to determine the order in which `range(len(df))` should be rearranged.
Using `sorted` works fine if the DataFrame is small. You can get better performance when the DataFrame is of moderate size (for example, a few hundred rows on my machine) by using `numpy.argsort` instead.
Alternatively, you could split your string column into two integer-valued columns, and then use `df.sort` (in current pandas versions, `df.sort_values`).
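Both alternatives can be sketched as follows. This is an illustration under assumptions, not the answer's original code: the sample data and the column names `range`, `start`, and `end` are hypothetical, the `numpy.argsort` variant orders by the lower bound only, and `sort_values` stands in for the older `df.sort`.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column name is an assumption.
df = pd.DataFrame({"range": ["100-200", "1000-1100", "1100-1200", "200-300"]})

# Variant 1: parse the lower bound of each range and let numpy.argsort
# compute the row order. kind="stable" keeps tied rows in their original order.
lower = np.array([int(text.split("-")[0]) for text in df["range"]])
by_argsort = df.take(np.argsort(lower, kind="stable"))

# Variant 2: split the string column into two integer columns and sort on them.
bounds = df["range"].str.split("-", expand=True).astype(int)
split_df = df.assign(start=bounds[0], end=bounds[1])
by_columns = split_df.sort_values(["start", "end"])

print(by_argsort)
print(by_columns)
```

The second variant keeps the numeric bounds in the frame, so later regrouping and sorting no longer depend on parsing the strings again.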