I want to create a 3D scatter plot from a dataframe that has the following format:
import pandas as pd

df = pd.DataFrame({"Date": ['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04'],
                   "A_x1": [1, 2, 2, 2],
                   "A_x2": [9, 2, 2, 3],
                   "A_x3": [1, 3, 2, 9],
                   "B_x1": [1, 8, 2, 3],
                   "B_x2": [3, 8, 9, 3],
                   "B_x3": [2, 4, 5, 5],
                   "C_x1": [2, 6, 5, 2],
                   "C_x2": [4, 8, 1, 3],
                   "C_x3": [6, 9, 5, 7]})
Date | A_x1 | A_x2 | A_x3 | B_x1 | B_x2 | B_x3 | C_x1 | C_x2 | C_x3 | D_x1 |
---|---|---|---|---|---|---|---|---|---|---|
2021-01-01 | 1 | 9 | 1 | 1 | 3 | 2 | 2 | 4 | 6 | ... |
2021-01-02 | 2 | 2 | 3 | 8 | 8 | 4 | 6 | 8 | 9 | ... |
2021-01-03 | 2 | 2 | 2 | 2 | 9 | 5 | 5 | 1 | 5 | ... |
2021-01-04 | 2 | 3 | 9 | 3 | 3 | 5 | 2 | 3 | 7 | ... |
As you might guess, the three axes of the 3D scatter plot should be x1, x2 and x3. So I have 3 variables for 3 axes, but several groups of values in each row. I want to plot the values of A_x1/2/3, B_x1/2/3 etc. as one point each and color them by group (e.g. A = red, B = green, C = blue etc.).
I tried matplotlib and plotly, but I'm open to any other library. To get a dataframe or array of all x_1 values I use the following code.
df_x_1 = df.filter(like='1') #df x_1
x_1 = df_x_1.to_numpy() #arr_x_1
This is the simplest scatter plot in plotly, and it works fine:
import plotly.express as px
fig = px.scatter_3d(df,
x='A_x1',
y='A_x2',
z='A_x3',
#color='species'
)
fig.show()
Part of the problem, which has been solved by @Ynjxsjmh, is kept in the spoiler below:
>!But this obviously plots only the x1, x2, x3 values for A (3 columns); I want all columns to be included. I wanted to do something like the code below, but I got various errors. I tried with both dataframes and arrays.
fig = px.scatter_3d(x=df.filter(like='1').values.ravel('F'),
y=df.filter(like='2').values.ravel('F'),
z=df.filter(like='3').values.ravel('F'),
color = ( df.filter(like='3').values.ravel('F')*df.filter(like='2').values.ravel('F')*df.filter(like='1').values.ravel('F') )**(1/3)
)
fig.show()
This code works now. The data points (e.g. A_x1, A_x2, A_x3) are presented at the correct spots. What is still unclear is the coloring.
Right now I'm coloring the data points by the geometric mean of their coordinates, using color = (x_1 * x_2 * x_3)^(1/3).
What I want: color the data points according to the column name, the first row of the dataframe, or something similar (I would have to add such a row, but that should not be a problem).
Any ideas? Thank you!
`x`, `y` and `z` of `plotly.express.scatter_3d()` should be str or int or Series or array-like. `df.filter(like='1')` returns a dataframe. You can use `numpy.ravel()` to flatten the values in column direction.
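
For the coloring by group, one possible sketch (not necessarily the only or best way) builds on the same column-wise flattening. `ravel('F')` emits all of A's values first, then all of B's, and so on, so a label array built with `np.repeat` lines up with the flattened coordinates. The helper names `flatten_axis` and `make_figure`, the regex-based column selection, and the color mapping are assumptions made for illustration; the sketch also assumes the x1, x2 and x3 columns list the groups in the same order.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Date": ['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04'],
                   "A_x1": [1, 2, 2, 2], "A_x2": [9, 2, 2, 3], "A_x3": [1, 3, 2, 9],
                   "B_x1": [1, 8, 2, 3], "B_x2": [3, 8, 9, 3], "B_x3": [2, 4, 5, 5],
                   "C_x1": [2, 6, 5, 2], "C_x2": [4, 8, 1, 3], "C_x3": [6, 9, 5, 7]})


def flatten_axis(frame, suffix):
    """Hypothetical helper: stack every column ending in `_<suffix>`
    column by column (all A values, then all B values, ...)."""
    return frame.filter(regex=f"_{suffix}$").to_numpy().ravel("F")


x = flatten_axis(df, "x1")
y = flatten_axis(df, "x2")
z = flatten_axis(df, "x3")

# Group names taken from the column prefixes: ['A', 'B', 'C'].
groups = [col.split("_")[0] for col in df.filter(regex="_x1$").columns]

# Repeat each group name once per row so the labels follow the same
# column-major order as the flattened coordinates.
labels = np.repeat(groups, len(df))


def make_figure():
    """Hypothetical helper: build the 3D scatter plot colored by group."""
    # Imported here so the array-building part runs without plotly installed.
    import plotly.express as px
    return px.scatter_3d(
        x=x, y=y, z=z,
        color=labels,  # string labels give one discrete color per group
        color_discrete_map={"A": "red", "B": "green", "C": "blue"},
    )
```

Calling `make_figure().show()` should then render the plot with one legend entry per group. Selecting columns with `regex="_x1$"` instead of `like='1'` is a precaution in case other column names happen to contain the digit.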