How to highlight non-main diagonal elements


I was wondering how to highlight diagonal elements of a pandas DataFrame using df.style methods.

I already found out how to do it for the main diagonal, but I can't manage to highlight the diagonal that starts from the second column. For example:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a':[1,2,3,4],'b':[1,3,5,7],'c':[1,4,7,10],'d':[1,5,9,11]})

def style_diag(data):
    # Build the styles frame from the function argument rather than the global df
    diag_mask = pd.DataFrame("", index=data.index, columns=data.columns)
    min_axis = min(diag_mask.shape)
    diag_mask.iloc[range(min_axis), range(min_axis)] = 'background-color: yellow'
    return diag_mask

df.style.apply(style_diag, axis=None)

This gives the following output:

table with highlighting along the main diagonal

(but actually I don't really get the magic in this function)

And I'd like to have a yellow highlight across the diagonal elements 1, 4, 9.

How can I do that?


There are 2 answers

Henry Ecker On BEST ANSWER

There are several options here, depending on the exact need. One approach is to create a Boolean mask with the same shape as your DataFrame, set to True along the diagonal at the desired offset, and use that mask to apply styles conditionally.

The approach and usage

def style_diag(df_: pd.DataFrame, offset: int = 0) -> pd.DataFrame:
    # Create empty styles DataFrame
    style_df = pd.DataFrame('', index=df_.index, columns=df_.columns)

    # Create a 2D False mask
    mask = np.zeros(df_.shape, dtype=bool)

    # Find diagonal indices at an offset and replace values with True
    rows, cols = np.indices(mask.shape)
    mask[np.diag(rows, k=offset), np.diag(cols, k=offset)] = True

    # Set diagonal styles using mask
    style_df[mask] = 'background-color:yellow'
    return style_df

This can be used like:

df.style.apply(style_diag, offset=1, axis=None)

Which produces the following results:

Styled DataFrame with offset 1 diagonal highlighted yellow

Similarly this can be used without an offset to produce the original output:

df.style.apply(style_diag, axis=None)

Styled DataFrame with main diagonal highlighted yellow

Or even with negative offsets:

df.style.apply(style_diag, offset=-2, axis=None)

Styled DataFrame with offset -2 diagonal highlighted yellow

How it works

We start with an all-False mask of the same shape as our DataFrame:

mask = np.zeros(df_.shape, dtype=bool)

# array([[False, False, False, False],
#        [False, False, False, False],
#        [False, False, False, False],
#        [False, False, False, False]])

From here we need to find the diagonal indices in order to replace the values on the diagonal with True. There is a function np.diag_indices_from, but unfortunately it does not directly support offset diagonals.

Let's instead grab the row and column indices for this mask using np.indices:

rows, cols = np.indices(mask.shape)

# rows
# array([[0, 0, 0, 0],
#        [1, 1, 1, 1],
#        [2, 2, 2, 2],
#        [3, 3, 3, 3]])
# cols
# array([[0, 1, 2, 3],
#        [0, 1, 2, 3],
#        [0, 1, 2, 3],
#        [0, 1, 2, 3]])

We can now apply np.diag to both rows and cols. Unlike np.diag_indices_from, np.diag natively supports offsets through its k parameter. (For this example, offset is 1.)

np.diag(rows, k=offset)
# array([0, 1, 2])

np.diag(cols, k=offset)
# array([1, 2, 3])

We can use the results from np.diag as indexers to update our mask:

mask[np.diag(rows, k=offset), np.diag(cols, k=offset)] = True

# array([[False,  True, False, False],
#        [False, False,  True, False],
#        [False, False, False,  True],
#        [False, False, False, False]])
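The same indexing trick should also carry over to non-square frames, since np.diag simply returns a shorter diagonal when it runs off an edge. Here is a minimal sketch to check that; diagonal_positions is a hypothetical helper name used only for illustration, not part of the answer above:

```python
import numpy as np


def diagonal_positions(shape, offset=0):
    """Return the (row, col) pairs on the diagonal at `offset` for a 2D shape.

    Hypothetical helper: it repeats the np.indices + np.diag steps shown above.
    """
    rows, cols = np.indices(shape)
    r = np.diag(rows, k=offset)  # row index of every cell on the diagonal
    c = np.diag(cols, k=offset)  # column index of every cell on the diagonal
    return [(int(i), int(j)) for i, j in zip(r, c)]


wide = diagonal_positions((3, 5), offset=2)    # [(0, 2), (1, 3), (2, 4)]
tall = diagonal_positions((5, 3), offset=-1)   # [(1, 0), (2, 1), (3, 2)]
empty = diagonal_positions((3, 5), offset=5)   # [] (offset is past the last column)
```

An offset that falls completely outside the frame just yields no positions, so nothing is highlighted and no error is raised.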

Now we have a well-formed mask that can be used to easily apply style strings:

style_df[mask] = 'background-color:yellow'

#   a                        b                        c                        d
# 0    background-color:yellow                                                  
# 1                             background-color:yellow                         
# 2                                                      background-color:yellow
# 3                                                                             
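As an aside, the mask can probably be built more directly with np.eye, which accepts a row count, a column count and a diagonal offset k. This is an alternative sketch, not the method described above; style_diag_eye is a made-up name, and the style string is the same one used in the walkthrough:

```python
import numpy as np
import pandas as pd


def style_diag_eye(df_: pd.DataFrame, offset: int = 0) -> pd.DataFrame:
    # np.eye(N, M, k=...) puts ones on the k-th diagonal; dtype=bool turns
    # that into a True/False mask with the same shape as the DataFrame
    mask = np.eye(*df_.shape, k=offset, dtype=bool)

    # Style string where the mask is True, empty string everywhere else
    styles = np.where(mask, 'background-color:yellow', '')
    return pd.DataFrame(styles, index=df_.index, columns=df_.columns)


df = pd.DataFrame({
    'a': [1, 2, 3, 4],
    'b': [1, 3, 5, 7],
    'c': [1, 4, 7, 10],
    'd': [1, 5, 9, 11]
})
styles = style_diag_eye(df, offset=1)
```

It would be passed to the Styler in the same way, i.e. df.style.apply(style_diag_eye, offset=1, axis=None).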

Complete working example with imports and version numbers used

import numpy as np  # v1.26.2
import pandas as pd  # v2.1.4

df = pd.DataFrame({
    'a': [1, 2, 3, 4],
    'b': [1, 3, 5, 7],
    'c': [1, 4, 7, 10],
    'd': [1, 5, 9, 11]
})


def style_diag(df_: pd.DataFrame, offset: int = 0) -> pd.DataFrame:
    style_df = pd.DataFrame('', index=df_.index, columns=df_.columns)
    mask = np.zeros(df_.shape, dtype=bool)
    rows, cols = np.indices(mask.shape)
    mask[np.diag(rows, k=offset), np.diag(cols, k=offset)] = True
    style_df[mask] = 'background-color:yellow'
    return style_df


df.style.apply(style_diag, offset=1, axis=None)

Styled DataFrame with offset 1 diagonal highlighted yellow

wjandrea On

You could use math to calculate the indexes to highlight, but I think it's easier to just .shift().

df.style.apply(lambda d: style_diag(d).shift(axis=1), axis=None)

table with highlighting along the diagonal starting from the second column
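The shift idea can likely be stretched to other offsets by passing periods, with fill_value='' so the vacated cells hold empty style strings instead of NaN. A rough sketch under those assumptions; main_diag_styles and shifted_diag_styles are hypothetical names, and main_diag_styles is only a stand-in for the question's style_diag so that the block runs on its own:

```python
import numpy as np
import pandas as pd


def main_diag_styles(df_: pd.DataFrame) -> pd.DataFrame:
    # Stand-in for the question's style_diag: style the main diagonal only
    styles = pd.DataFrame('', index=df_.index, columns=df_.columns)
    for i in range(min(styles.shape)):
        styles.iat[i, i] = 'background-color: yellow'
    return styles


def shifted_diag_styles(df_: pd.DataFrame, periods: int = 1) -> pd.DataFrame:
    # Positive periods move the styles right, negative periods move them left;
    # cells that open up are filled with '' (no style)
    return main_diag_styles(df_).shift(periods=periods, axis=1, fill_value='')


df = pd.DataFrame({
    'a': [1, 2, 3, 4],
    'b': [1, 3, 5, 7],
    'c': [1, 4, 7, 10],
    'd': [1, 5, 9, 11]
})
right_one = shifted_diag_styles(df, periods=1)
left_two = shifted_diag_styles(df, periods=-2)
```

One caveat: this only moves cells that were already on the main diagonal, so for a frame with more rows than columns a negative shift can highlight fewer cells than a true offset diagonal would.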