How can I apply a sorted value counts to bins [cut() function]

I had posted this question but was advised to restructure the question to make it clearer. First, this is a portion of the data that I am working with:

          Country                   Continent          % Renewable
0   China                      Asia              (15.754, 29.228]
1   United States              North America     (2.213, 15.754]
2   Japan                      Asia              (2.213, 15.754]
3   United Kingdom             Europe            (2.213, 15.754]
4   Russian Federation         Europe            (15.754, 29.228]

I need to return a Series with a MultiIndex grouped by Continent that, within each Continent group, lists every bin along with the number of countries that fall in it. The result should look something like this:

Asia      (2.213, 15.754]       3
          (15.754, 29.228]      1
          (29.228, 42.702]      2
          (56.176, 69.65]       0
          (42.702, 56.176]      0
Europe    (2.213, 15.754]       2
          (15.754, 29.228]      2
          (29.228, 42.702]      0
          (56.176, 69.65]       1
          (42.702, 56.176]      0
>>....and so on

This is the line of code I tried:

groups = renew.groupby(['Continent', pd.cut(renew['% Renewable'], 5)])

But...I get the following error:

TypeError: can only concatenate str (not "float") to str

I would like to create a Series whose index has 'Continent' as the first level and the '% Renewable' bin as the second, with each value being the count of countries in that bin for that continent (even if the count is 0).

Answer by Andrej Kesely

IIUC, you can use DataFrame.pivot_table:

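# Bin the numeric column into 5 equal-width intervals
renew["tmp"] = pd.cut(renew["% Renewable"], 5)
# observed=False keeps empty bins, so they show up with a count of 0
out = renew.pivot_table(index=["Continent", "tmp"], values=["Country"], aggfunc="count", observed=False)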

print(out)

Prints (using your limited input dataframe):

                              Country
Continent     tmp                    
Asia          (13.986, 16.8]        1
              (16.8, 19.6]          0
              (19.6, 22.4]          0
              (22.4, 25.2]          0
              (25.2, 28.0]          1
Europe        (13.986, 16.8]        1
              (16.8, 19.6]          0
              (19.6, 22.4]          0
              (22.4, 25.2]          0
              (25.2, 28.0]          1
North America (13.986, 16.8]        1
              (16.8, 19.6]          0
              (19.6, 22.4]          0
              (22.4, 25.2]          0
              (25.2, 28.0]          0
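
If you need a Series rather than a one-column DataFrame, a groupby with size() may also work. The sketch below is only an illustration under a few assumptions: the numeric "% Renewable" values are made up (the question only shows the binned column), and the pd.to_numeric step assumes the TypeError comes from the column holding strings rather than floats, which the question does not confirm.

```python
import pandas as pd

# Hypothetical sample data: the numeric values are invented for illustration.
# They are stored as strings on purpose, to mimic the suspected cause of the TypeError.
renew = pd.DataFrame({
    "Country": ["China", "United States", "Japan", "United Kingdom", "Russian Federation"],
    "Continent": ["Asia", "North America", "Asia", "Europe", "Europe"],
    "% Renewable": ["19.75", "11.57", "10.23", "10.60", "17.29"],
})

# pd.cut needs numeric input, so convert the column first.
renew["% Renewable"] = pd.to_numeric(renew["% Renewable"])

# Five equal-width bins over the whole column.
bins = pd.cut(renew["% Renewable"], 5)

# observed=False keeps every bin for every continent, so empty bins get a count of 0.
counts = renew.groupby(["Continent", bins], observed=False).size()

print(counts)
```

The result is a Series with a two-level MultiIndex (Continent, then bin), holding one row per continent and bin combination.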