I want to plot all columns in my dataframe against one column in the same df: totCost. The following code works fine:

for i in range(0, len(df.columns), 5):

The problem is that output.png only contains the last 3 graphs (there are 18 in total). The same happens if I de-dent that line. How do I write all 18 to a single image?

1 Answer

warped

The problem with using pairplot the way you do is that every iteration of the loop creates a new figure and assigns it to g.

If you move your last line of code, g.savefig('output.png'), outside of the loop, only the last version of g is saved to disk, and that is the figure containing only the last three subplots.

If you put that line inside your loop, every figure gets saved to disk, but all under the same name, so each one overwrites the previous one. What remains is again the last figure, with only three subplots in it.
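The overwriting described above can be reproduced without seaborn. The following is only a minimal sketch: plt.subplots stands in for the pairplot call (an assumption, since the full loop body is not shown in the question), and the temporary directory and the variable names are made up for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt

out_dir = tempfile.mkdtemp()                 # hypothetical scratch directory
path = os.path.join(out_dir, "output.png")

n_columns = 18        # same column count as in the question
panel_counts = []     # number of subplots in each figure that gets created

for start in range(0, n_columns, 5):
    # Stand-in for pairplot: every iteration builds a brand-new figure.
    n_panels = min(5, n_columns - start)
    fig, axes = plt.subplots(1, n_panels, squeeze=False)
    panel_counts.append(len(fig.axes))
    fig.savefig(path)  # same filename each time, so the earlier image is replaced
    plt.close(fig)

n_figures = len(panel_counts)
n_files = len(os.listdir(out_dir))
print(n_figures, "figures created,", n_files, "file on disk")
print("subplots per figure:", panel_counts)
```

Four figures are created, but only one file exists at the end, and it holds the last figure with its three subplots.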

A way around this is to create a single figure, add each subplot to it as you go, and then save that one figure to disk:

import matplotlib.pyplot as plt

import pandas as pd
import numpy as np
import seaborn as sns

# generate random data, with 18 columns
dic = {str(a): np.random.randint(0,10,10) for a in range(18)}
df = pd.DataFrame(dic)

# rename first column of dataframe
df.rename(columns={'0':'totCost'}, inplace=True)

# instantiate figure
fig = plt.figure()

# loop through all columns, create subplots in 5 by 5 grid along the way,
# and add them to the figure
for i in range(len(df.columns)):
    ax = fig.add_subplot(5,5,i+1)
    ax.scatter(df['totCost'], df[df.columns[i]])

# save the single figure, with all subplots in it, to disk
fig.savefig('output.png')
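As a variation on the loop above, the grid can be sized from the number of columns instead of being fixed at 5 by 5. This is only a sketch under a few assumptions that are not in the original answer: totCost itself is left out of the panels, the figure size of 3 inches per panel is an arbitrary choice, and the file is written to a temporary directory rather than the working directory.

```python
import math
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Random sample data with 18 columns, as in the answer above
rng = np.random.default_rng(0)
df = pd.DataFrame({str(a): rng.integers(0, 10, 10) for a in range(18)})
df = df.rename(columns={"0": "totCost"})

# Assumption: plotting totCost against itself is not useful, so skip it
y_columns = [c for c in df.columns if c != "totCost"]

ncols = 5
nrows = math.ceil(len(y_columns) / ncols)  # just enough rows for all panels

fig, axes = plt.subplots(nrows, ncols,
                         figsize=(3 * ncols, 3 * nrows), squeeze=False)
flat_axes = axes.ravel()

# One scatter plot per column, all inside the same figure
for ax, col in zip(flat_axes, y_columns):
    ax.scatter(df["totCost"], df[col])
    ax.set_xlabel("totCost")
    ax.set_ylabel(col)

# Hide the grid cells that were left without data
for ax in flat_axes[len(y_columns):]:
    ax.set_visible(False)

fig.tight_layout()  # keep axis labels from overlapping neighbouring panels
out_path = os.path.join(tempfile.mkdtemp(), "output.png")
fig.savefig(out_path)
```

Labelling each axis makes it possible to tell the panels apart in the saved image, which the bare scatter calls do not do.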