Python pandas: if a column value is a list, create new column(s) with the individual list values

I'm using pandas to create a dataframe from a SaaS REST API JSON response and have hit a minor blocker in cleansing the data for visualization and analysis.

I need to tweak the Python script by adding a conditional function: if a value is a list, remove the brackets, separate the values into new columns, and name the new columns [original column name + value list order].

In the similar questions already posted, the function is applied to one specified column, whereas I need the check to run on all 1,400+ columns in the dataframe. Basically, it is Excel's Text to Columns, with each new column header named [original column name + value list order].

Current: [image]

Need: [image]

Here's the script that creates the dataframe from the .json response:

import json

import pandas as pd
import requests

# TAP_URL, SIZE, WORKFLOW_NAME, WORKFLOW_FIELDS and get_tap_token() are defined elsewhere in the script.

def get_tap_dashboard():
    # Comma-separated field names (built here, but not used in the request below)
    use_fields = ','.join(WORKFLOW_FIELDS.keys())
    dashboard_head = {'Authorization': 'Bearer {}'.format(get_tap_token()), 'Content-Type': 'application/json'}
    dashboard_url = (
        TAP_URL + "api/v1/workflows/all?pageSize={}&page=1".format(SIZE)
        + "&advancedFilter=__WorkflowDescription__~eq~'{}'".format(WORKFLOW_NAME)
        + "&configurationId={}".format("1128443a-f7a7-4a90-953d-c095752a97a2")
    )
    dashboard = json.loads(requests.get(url=dashboard_url, headers=dashboard_head).text)

    # Column names come from the first item, with ResultSetId prepended
    all_columns = ['ResultSetId'] + [col['Name'] for col in dashboard['Items'][0]['Columns']]

    rows = []
    for row in dashboard['Items']:
        add_row_values = [row['ResultSetId']]
        for col in row['Columns']:
            if col['Value'] == '-- Select One --':  # dtype issue
                add_row_values.append([''])
            else:
                add_row_values.append(col['Value'])
        rows.append(add_row_values)
    # Build the frame once from all rows; DataFrame.append was removed in pandas 2.0
    tap_dashboard = pd.DataFrame(rows, columns=all_columns)
    return tap_dashboard.rename(columns=WORKFLOW_FIELDS).reset_index(drop=True)

df = get_tap_dashboard()

Any help would be much appreciated. Thanks, all!

PS: I have a Tableau Creator license, if it makes more sense to do this in Tableau or Tableau Prep Builder.

There is 1 answer

mccandar

Could this be what you need?

from collections import defaultdict

import pandas as pd

output = defaultdict(list)

def count(x):
    # Only list cells are collected; each key is built from the item and its position in the list
    if isinstance(x, list):
        for i, item in enumerate(x):
            output[f'{item}_{i}'].append(item)

# 'df_column_name' is a placeholder for the column to process
df['df_column_name'].apply(count)

print(pd.DataFrame.from_dict(output, orient='index').T)
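
Since the question asks for the check to run on every column and for the new headers to be [original column name + value list order], here is a rough sketch of one way that could look. The function name expand_list_columns and the sample frame are made up for illustration. It assumes column names are unique, treats a non-list cell in a list column as a one-element list, and fills missing positions with None.

```python
import pandas as pd


def expand_list_columns(df):
    """Split every column that contains lists into <column>_<position> columns.

    Hypothetical helper: columns without any list values are kept as they are.
    """
    out = {}
    for name in df.columns:
        values = df[name].tolist()
        if not any(isinstance(v, list) for v in values):
            out[name] = values  # no lists in this column, keep it unchanged
            continue
        # Wrap non-list cells so every row can be indexed the same way
        as_lists = [v if isinstance(v, list) else [v] for v in values]
        width = max(len(v) for v in as_lists)
        for i in range(width):
            # Rows with fewer than i + 1 items get None in this position
            out[f"{name}_{i}"] = [v[i] if i < len(v) else None for v in as_lists]
    return pd.DataFrame(out, index=df.index)


# Made-up sample data, only to show the resulting column names
sample = pd.DataFrame({
    "id": [1, 2],
    "tags": [["a", "b"], ["c"]],
    "status": [[""], "open"],
})
wide = expand_list_columns(sample)
print(wide.columns.tolist())  # ['id', 'tags_0', 'tags_1', 'status_0']
```

On the dataframe from the question this would be called as expand_list_columns(df). With 1,400+ columns it loops in plain Python, so it may be slow on large frames, and a column whose lists are all empty produces no output columns at all.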