Python Loop to generate multiple pivot tables from one pandas dataframe

83 views Asked by cMX5 At 19 January 2024 at 06:41

I've selected a set of records from a larger data set that represents a specific organization. I need to iterate over this data and produce a pivot table that compares the demographic information to how that demographic answered the survey questions. There are 5 demographic questions, each with multiple choices for categorization. A pivot table provides a count of how each demographic category answered a survey question. I need to produce 5 pivot tables for each of the remaining 17 questions and store them to a dataframe. I'll stack the resulting dataframes later. The problem I'm having is trying to iterate over the main original dataframe, replacing the two column names for the crosstab. I can easily do one manually with the following code:

table18 = pd.pivot_table(org1_df, values='Business size?', index=['Business type?'], 
                              columns=['Survey Question 1'], 
                              aggfunc=['count'], fill_value=0)

cross = pd.crosstab(org1_df["Business type?"], org1_df["Survey Question 2"])
cross

Here's the pivot table I was able to create:

The point is to determine how a small business answered the survey questions compared with a large business' answers. And so on for the other dozen+ categories we have on this survey.`

Respondents	Business Type?	Business Size?	Survey Question 1	Survey Question 2
Resp_1	2	2	3	2
Resp_2	2	1	4	3

Sample data above.

I stored the demographic column names in a list:

column_names = ["Business type?", "Business size?"]

then added those to a dictionary:

d = {}
for name in column_names:
    d[name] = pd.DataFrame()

I tried a dictionary comprehension based on a similar question and answer:

dfs = {i[1]: pd.pivot_table(org1_df, index=i, values='Business type?', 
                            aggfunc='count', fill_value=0) for i, org1_df in zip(column_names, d)}

I received this error: AttributeError: 'str' object has no attribute 'columns'

I would greatly appreciate help on this task. As an aside, I'm not sure why when I leave values='Business type?' it doesn't matter for the resulting pivot table. The answers I'm getting matched what I get in Excel.

I'm working my way to a diverging stacked bar chart using matplotlib, but I have to get the data organized first.

UPDATE: org1_df output Based on the mock data above, I would produce 4 pivot tables from that.

Survey Question 1 - a count of the answers for "Business type?"
Survey Question 1 - a count of the answers for "Business size?"
Survey Question 2 - a count of the answers for "Business type?"
Survey Question 4 - a count of the answers for "Business size?"

Original Q&A

TechQA.

Python Loop to generate multiple pivot tables from one pandas dataframe

There are 0 answers

Related Questions in PYTHON

Related Questions in LOOPS

Related Questions in PIVOT-TABLE

Related Questions in SURVEYMONKEY

Popular Questions

Trending Questions