Pandas function operations

Question

9.4k views Asked by Vivek At 09 January 2017 at 14:28

Data is from the United States Census Bureau. Counties are political and geographic subdivisions of states in the United States. This dataset contains population data for counties and states in the US from 2010 to 2015.

Which state has the most counties in it? (hint: consider the sumlevel key carefully! You'll need this for future questions too...)

I cannot get the state name out of my code. Please help.

my code:

import pandas as pd
import numpy as np
census_df = pd.read_csv('census.csv')
census_df.head()
def answer_five():
    return census_df.groupby('STNAME').COUNTY.sum().max()



answer_five()

Original Q&A

There are 9 answers

**Aishwarya Kanchan** · Answer 1 · 2020-04-12T16:17:26+00:00

def answer_five():
    new_df = census_df[census_df['SUMLEV'] == 50]
    x = new_df.groupby('STNAME')
    return x.count()['COUNTY'].idxmax()


answer_five()

**dfadeeff** · Answer 2 · 2017-01-20T22:12:49+00:00

Here is the answer that worked for me:

def answer_five():
    return census_df.groupby(["STNAME"],sort=False).sum()["COUNTY"].idxmax()

The first part creates the aggregated DataFrame:

census_df.groupby(["STNAME"],sort=False).sum()

The second part selects the column you need and calls idxmax() on it:

["COUNTY"].idxmax()

idxmax() returns the index label (here, the state name) of the row that holds the maximum value.

**Silvis Sora** · Answer 3 · 2019-03-24T02:46:00+00:00

Actually, you can just count the county rows at the state level instead of looking into the county details.

And this should work:

census_df[census_df['SUMLEV']==50].groupby(['STNAME']).size().idxmax()
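To make the difference between counting rows with size() and counting values with count() concrete, here is a minimal sketch. It assumes a small made-up frame (toy_df, with invented state and county names) in place of the real census.csv; only the column names SUMLEV, STNAME and CTYNAME come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for census.csv (not the real file).
toy_df = pd.DataFrame({
    "SUMLEV": [40, 50, 50, 50, 40, 50],
    "STNAME": ["Alpha", "Alpha", "Alpha", "Alpha", "Beta", "Beta"],
    "CTYNAME": ["Alpha", "A1 County", np.nan, "A3 County", "Beta", "B1 County"],
})

# Keep only the county-level rows (SUMLEV == 50).
counties = toy_df[toy_df["SUMLEV"] == 50]

# size() counts rows per group, whether or not values are missing.
rows_per_state = counties.groupby("STNAME").size()

# count() counts non-null values in the selected column, so NaN is skipped.
names_per_state = counties.groupby("STNAME")["CTYNAME"].count()

print(rows_per_state)
print(names_per_state)
print(rows_per_state.idxmax())
```

With no missing values the two agree, so either works on a clean file; size() simply does not depend on which column you pick.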

**Jay Mulani** · Answer 4 · 2020-06-09T13:33:55+00:00

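import pandas as pd

def answer_five():
    # Aggregate each state's rows with sum(), then keep the state whose
    # COUNTY total equals the maximum.
    sums = census_df.groupby('STNAME').sum()
    max_sum = sums['COUNTY'].max()
    return sums[sums['COUNTY'] == max_sum].index[0]

answer_five()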

**Anand Krishnan** · Answer 5 · 2019-02-19T11:59:24+00:00

We can also answer this question using the sum() function:

def answer_five():
    return census_df.groupby(["STNAME"]).sum()["COUNTY"].idxmax()

sum() adds up all the values in the COUNTY column for each state, and idxmax() then returns the state with the largest total. Because COUNTY holds county codes rather than a count, this total is only a proxy for the number of counties.
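As a hedged illustration of how summing COUNTY can differ from counting county rows, here is a small sketch. It assumes COUNTY is a numeric county code with 0 on state-level rows; the frame toy_df and its codes are made up for the example and are not taken from census.csv.

```python
import pandas as pd

# Hypothetical toy data: "Alpha" has three counties with small codes,
# "Beta" has two counties with large codes.
toy_df = pd.DataFrame({
    "SUMLEV": [40, 50, 50, 50, 40, 50, 50],
    "STNAME": ["Alpha"] * 4 + ["Beta"] * 3,
    "COUNTY": [0, 1, 3, 5, 0, 510, 520],
})

# Summing the codes rewards large code values, not the number of rows.
code_sums = toy_df.groupby("STNAME")["COUNTY"].sum()

# Counting county-level rows gives the actual number of counties.
county_counts = toy_df[toy_df["SUMLEV"] == 50].groupby("STNAME").size()

print(code_sums.idxmax())      # state with the largest sum of codes
print(county_counts.idxmax())  # state with the most county rows
```

On this toy frame the two approaches pick different states, which is why counting the SUMLEV == 50 rows is the safer reading of the question.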

**yogs** · Answer 6 · 2019-02-19T22:41:32+00:00

def answer_five():
    county = census_df[census_df['SUMLEV']==50]
    county = county.groupby(['STNAME']).count()

    return county['SUMLEV'].idxmax(axis=0)

answer_five()

**jasonlcy91** · Answer 7 · 2018-03-05T16:19:19+00:00

Just a correction to your entire code.

First, according to the source, SUMLEV of 50 means the row is a county. Two ways to answer this.

Thought process (think of it like in Excel): You want to count the number of "county rows" in each state group. First, you create the mask/condition to select all SUMLEV == 50 ("county rows"). Then group them by STNAME. Then use .size() to count the number of rows in each grouping.

# this is it!
def answer_five():
    mask = (census_df.SUMLEV == 50)
    max_index = census_df[mask].groupby('STNAME').size().idxmax()
    return max_index

# not so elegant
def answer_five():
    census_df['Counts'] = 1
    mask = (census_df.SUMLEV == 50)
    max_index = census_df[mask].groupby('STNAME')['Counts'].sum().idxmax()
    return max_index

You are welcome. https://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.groupby.GroupBy.size.html
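The mask, group, and size steps described above can be tried on a tiny frame without the real file. This is only a sketch: toy_df, its state and county names, and the helper state_with_most_counties are hypothetical, and it also shows what the SUMLEV filter changes.

```python
import pandas as pd

# Hypothetical toy data: one state-level row (SUMLEV 40) per state,
# followed by that state's county-level rows (SUMLEV 50).
toy_df = pd.DataFrame({
    "SUMLEV": [40, 50, 50, 40, 50, 50, 50],
    "STNAME": ["Alpha", "Alpha", "Alpha", "Beta", "Beta", "Beta", "Beta"],
    "CTYNAME": ["Alpha", "A1 County", "A2 County",
                "Beta", "B1 County", "B2 County", "B3 County"],
})


def state_with_most_counties(df):
    """Return the STNAME with the most county-level (SUMLEV == 50) rows."""
    mask = df["SUMLEV"] == 50                   # select the county rows
    sizes = df[mask].groupby("STNAME").size()   # rows per state
    return sizes.idxmax()                       # label of the largest group


# Without the mask, every state is over-counted by its own summary row.
unfiltered = toy_df.groupby("STNAME").size()
filtered = toy_df[toy_df["SUMLEV"] == 50].groupby("STNAME").size()

print(unfiltered)
print(filtered)
print(state_with_most_counties(toy_df))
```

Since the unfiltered count is off by the same one row for every state here, the winner does not change, but the counts themselves are only right after filtering.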

**Nathan** · Answer 8 · 2018-02-25T18:52:53+00:00

It's the change from .max() to idxmax() that returns the correct value for the STNAME rather than a large integer.
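A minimal sketch of that difference, using a made-up Series of per-state counts (the names and numbers are hypothetical):

```python
import pandas as pd

# Hypothetical per-state counts, indexed by state name.
counts = pd.Series({"Alpha": 2, "Beta": 5, "Gamma": 3})

largest_value = counts.max()     # the maximum value itself
largest_label = counts.idxmax()  # the index label where the maximum occurs

print(largest_value)
print(largest_label)
```

max() answers "how many?", while idxmax() answers "which state?", and the question asks for the latter.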

**Terk** · Answer 9 · 2017-03-25T21:21:22+00:00

def answer_five():
    return census_df.groupby('STNAME')['CTYNAME'].count().idxmax()
