DataFrame column values in specific format

Question

154 views Asked by Brijesh Chaurasia At 01 November 2023 at 14:32

Here is my script:

import pandas as pd

data = {'Col1': [132, 148, 149], 'Col2': [232, 248, 249], 'Col3': [312, 308, 309], 'Col4': [1500, 1550, 1600], 'Col5': [530, 590, 568]}
df = pd.DataFrame(data)
print(df)

def lrgst(df, cols, n):
    for key, col in df.set_index('Col4')[cols].items():
        x = col.nlargest(n)
        J=', '.join(f'{a}[{b}]' for a,b in zip(x.index, x))
        return J
print(f"{lrgst(df, ['Col1'], 2)}")

It produces this output:

1600[149], 1550[148]

Now I want another function that also fetches the value of a designated column (say Col3) from the same rows in which the Col1 values were found, and places it inside the brackets of the output above. The output should look like this:

1600[149,309], 1550[148,308]

I would like the new function to look something like this:

def lrgst(df, cols, sp_col, n):
    for key, col in df.set_index('Col4')[cols].items():
        x = col.nlargest(n)
        J=', '.join(f'{a}[{b},{c}]' for a,b,c in zip(x.index, x, idx.sp_col))
        return J
print(f"{lrgst(df, ['Col1'], ['Col3'], 2)}")

Could anyone please help?

There are 2 answers

Pavel Nekrasov On 01 November 2023 at 14:45

Here's an updated version of your function lrgst that incorporates the additional requirement:

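def lrgst(df, cols, sp_col, n):
    indexed = df.set_index('Col4')
    for key, col in indexed[cols].items():
        x = col.nlargest(n)
        # Look up sp_col by the Col4 labels in x.index so the values follow the nlargest order
        J = ', '.join(f'{a}[{b},{c}]' for a, b, c in zip(x.index, x, indexed.loc[x.index, sp_col]))
        return J  # as in the original, returns after the first column in cols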

Here's how you can use this updated function:

print(f"{lrgst(df, ['Col1'], 'Col3', 2)}")

This will give you the desired output:

1600[149,309], 1550[148,308]

Let me know if you need any further assistance!
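One point worth checking when pairing a second column with the result of nlargest is row order. The sketch below is only an illustration on a reduced copy of the question's data (the names `indexed`, `by_label` and `by_mask` are mine, not from the post): a label-based lookup with `.loc[x.index]` follows the descending order of `nlargest`, while a boolean mask built with `isin` keeps the frame's original row order.

```python
import pandas as pd

# Reduced copy of the question's data (only the columns used here)
df = pd.DataFrame({
    'Col1': [132, 148, 149],
    'Col3': [312, 308, 309],
    'Col4': [1500, 1550, 1600],
})

indexed = df.set_index('Col4')
x = indexed['Col1'].nlargest(2)  # index is [1600, 1550], descending by Col1

# Label-based lookup: values come back in the order of x.index
by_label = indexed.loc[x.index, 'Col3'].tolist()

# Boolean mask: values come back in the frame's original row order
by_mask = df.loc[df['Col4'].isin(x.index), 'Col3'].tolist()

print(by_label)  # [309, 308]
print(by_mask)   # [308, 309]
```

With the mask, zipping against `x` would pair 1600 with 308 and 1550 with 309, so the label-based lookup is the safer choice here.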

**Ömer Sezer** · Accepted Answer · 2023-11-01T14:43:01+00:00

Please add:

sp_col_values = df.set_index('Col4')[sp_cols].loc[x.index]
J = ', '.join(f'{a}[{b},{c}]' for a, b, c in zip(x.index, x, sp_col_values.iloc[:, 0]))

Full function:

def lrgst(df, cols, sp_cols, n):
    results = []
    for key, col in df.set_index('Col4')[cols].items():
        x = col.nlargest(n)
        sp_col_values = df.set_index('Col4')[sp_cols].loc[x.index]
        J = ', '.join(f'{a}[{b},{c}]' for a, b, c in zip(x.index, x, sp_col_values.iloc[:, 0]))
        results.append(J)
    return results

print(', '.join(lrgst(df, ['Col1'], ['Col3'], 2)))

Output:

1600[149,309], 1550[148,308]

EDIT:

To call the function the same way as in your post: print(f"{lrgst(df, ['Col1'], ['Col3'], 2)}")

New function:

def lrgst(df, cols, sp_cols, n):
    result = []
    for key, col in df.set_index('Col4')[cols].items():
        x = col.nlargest(n)
        sp_col_values = df.set_index('Col4')[sp_cols].loc[x.index]
        J = ', '.join(f'{a}[{b},{c}]' for a, b, c in zip(x.index, x, sp_col_values.iloc[:, 0]))
        result.append(J)
    return ', '.join(result)

print(f"{lrgst(df, ['Col1'], ['Col3'], 2)}")

Another way:

def lrgst(df, cols, sp_cols, n):
    return ', '.join(
        f'{a}[{b},{c}]'
        for key, col in df.set_index('Col4')[cols].items()
        for a, b, c in zip(col.nlargest(n).index, col.nlargest(n).values, df.set_index('Col4')[sp_cols].loc[col.nlargest(n).index][sp_cols[0]])
    )

print(f"{lrgst(df, ['Col1'], ['Col3'], 2)}")
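If only one value column and one special column are needed, a shorter variant might skip set_index altogether. This is a minimal sketch, not the method from the answers above: `lrgst_rows` and its `label_col` parameter are hypothetical names, and it takes plain column names rather than lists. It relies on `DataFrame.nlargest`, which returns whole rows, so the label, the value and the special value stay aligned.

```python
import pandas as pd

def lrgst_rows(df, col, sp_col, n, label_col='Col4'):
    # Hypothetical helper: take the n rows with the largest values in `col`,
    # then format each row as label[value,special_value].
    top = df.nlargest(n, col)
    return ', '.join(
        f'{label}[{value},{special}]'
        for label, value, special in zip(top[label_col], top[col], top[sp_col])
    )

data = {'Col1': [132, 148, 149], 'Col2': [232, 248, 249], 'Col3': [312, 308, 309],
        'Col4': [1500, 1550, 1600], 'Col5': [530, 590, 568]}
df = pd.DataFrame(data)

print(lrgst_rows(df, 'Col1', 'Col3', 2))  # 1600[149,309], 1550[148,308]
```

Because every field is read from the same selected rows, no second lookup by index is needed.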
