Matplotlib: Legend for marker and color in a scatterplot


I am trying to draw a scatter plot for the data frame posted below. I want the different stages shown as different markers (circle, square, etc.) and the different products shown as different colors (red, blue, etc.). So far I have done that, but I am having a hard time showing a legend that depicts this.

This is what I wrote:

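import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import LabelEncoder

df = pd.DataFrame([[1500,24,'open','drive'],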
                   [2900, 30, 'open', 'walk'],
                   [1200, 50, 'closed', 'drive'],
                   [4000, 80, 'open', 'air'],
                   [8000, 70, 'ongoing', 'air'],
                   [6100, 40, 'ongoing', 'walk'],
                   [7200, 85, 'closed', 'drive'],
                   [3300, 25, 'closed', 'drive'],
                   [5400, 45, 'open', 'walk'],
                   [5900, 53, 'open', 'air']])
df.columns = ['Cost','Duration','Stage','Product']

label_encoder = LabelEncoder()
markers = {0: 'o', 1: 's', 2: '^'}
df['Product_encoded'] = label_encoder.fit_transform(df['Product'])
df['Stage_encoded'] = label_encoder.fit_transform(df['Stage'])
df['Stage_encoded']= df['Stage_encoded'].map(markers)
colors = ('red', 'blue', 'lightgreen', 'gray', 'cyan')

X = np.array(df)
for idx, cl in enumerate(np.unique(df['Stage_encoded'])):
    mask = df['Stage_encoded'] == cl
    plt.scatter(x=X[mask, 0], y=X[mask, 1], marker=cl,
                c=[colors[i] for i in X[mask, 4]])
    plt.legend()

This shows the plot (image below) and gives each point the appropriate color and marker, but I want to show a legend for both the marker and the color.

My code gives this output


There is 1 answer

Tino D

I adjusted the for loop to make it a bit easier to handle:

for stage in df["Stage"].unique(): # for every unique stage
    subD = df[df["Stage"]==stage] # get the rows for that specific stage
    plt.scatter(x=subD["Cost"], # cost on the x axis
                y=subD["Duration"], # duration on the y axis
                marker=subD["Stage_encoded"].unique()[0], # marker from what you defined
                # the color changes automatically per call, no need to specify it in this case
                label=subD['Stage_encoded'].unique()[0]) # add a label!
plt.legend() # call legend once, outside the loop, not on every iteration

Here's the plot, with the legend ;)

[Image: scatter]
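The loop above labels the markers only, while the question asks for a legend covering both the marker (stage) and the color (product). One possible way to get that is with proxy handles and two separate legends. This is a minimal sketch, not the only approach: the `stage_markers` and `product_colors` dictionaries are assumptions introduced here for illustration, chosen to match the alphabetical codes that `LabelEncoder` would assign in the question (closed, ongoing, open and air, drive, walk).

```python
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.lines import Line2D

df = pd.DataFrame(
    [[1500, 24, "open", "drive"], [2900, 30, "open", "walk"],
     [1200, 50, "closed", "drive"], [4000, 80, "open", "air"],
     [8000, 70, "ongoing", "air"], [6100, 40, "ongoing", "walk"],
     [7200, 85, "closed", "drive"], [3300, 25, "closed", "drive"],
     [5400, 45, "open", "walk"], [5900, 53, "open", "air"]],
    columns=["Cost", "Duration", "Stage", "Product"],
)

# Assumed lookups (not from the original post): one marker per stage,
# one color per product.
stage_markers = {"closed": "o", "ongoing": "s", "open": "^"}
product_colors = {"air": "red", "drive": "blue", "walk": "lightgreen"}

fig, ax = plt.subplots()

# scatter() accepts only one marker per call, so plot one stage at a time;
# the colors can vary per point within a call.
for stage, group in df.groupby("Stage"):
    ax.scatter(group["Cost"], group["Duration"],
               marker=stage_markers[stage],
               c=group["Product"].map(product_colors).tolist())

# Proxy handles: empty lines that exist only to describe the legend entries.
stage_handles = [Line2D([], [], linestyle="none", marker=m, color="gray", label=s)
                 for s, m in stage_markers.items()]
product_handles = [Line2D([], [], linestyle="none", marker="o", color=c, label=p)
                   for p, c in product_colors.items()]

# A second ax.legend() call replaces the first legend, so the first one
# is re-added to the axes explicitly.
stage_legend = ax.legend(handles=stage_handles, title="Stage", loc="upper left")
ax.add_artist(stage_legend)
ax.legend(handles=product_handles, title="Product", loc="lower right")

ax.set_xlabel("Cost")
ax.set_ylabel("Duration")
```

Call `plt.show()` or `fig.savefig(...)` afterwards to view the result. The stage legend uses gray markers so that it reads as "shape only", and the product legend uses circles so that it reads as "color only"; the legend positions are arbitrary and may need adjusting to avoid covering points.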