I have a NumPy array of 100 predicted values called first_100. If I convert it to a DataFrame, the rows are indexed 0, 1, 2, etc. However, the predictions are for rows whose index labels are in no particular order (66, 201, 32, etc.). I want to put the actual values and the predictions in the same DataFrame, but I'm really struggling.

The real values are in a dataframe called first_100_train. I've tried the following:

pd.concat([first_100, first_100_train], axis=1)

This doesn't work: for some reason it returns the entire dataframe, indexed from 0, so there are lots of NaNs...

first_100_train['Prediction'] = first_100[0]

This is almost what I want, but again because the indexes are different the data doesn't match up. I'd really appreciate any suggestions.

EDIT: After managing to join the dataframes I now have this:

[screenshot of the joined dataframe]

I'd like to be able to drop the final column...

Here is first_100.head()

[screenshot of first_100.head()]

and first_100_train.head()

[screenshot of first_100_train.head()]

The problem is that index 2 from first_100 actually corresponds to index 480 of first_100_train

1 Answer

jezrael On Best Solutions

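pd.concat with axis=1 aligns rows by index label, not by position, which is why the mismatched indexes produce NaNs. Reset both indexes to the default with DataFrame.reset_index and drop=True so the rows align correctly: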

pd.concat([first_100.reset_index(drop=True), 
           first_100_train.reset_index(drop=True)], axis=1)
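
To see why the reset is needed, here is a minimal sketch with three rows instead of 100. The column names "Prediction" and "Actual" and all values are made up for illustration; only the index labels 66, 201, 32 come from the question.

```python
import pandas as pd

# Hypothetical stand-ins for the question's data (3 rows instead of 100).
first_100 = pd.DataFrame({"Prediction": [0.1, 0.2, 0.3]})  # default index 0, 1, 2
first_100_train = pd.DataFrame({"Actual": [1.0, 2.0, 3.0]},
                               index=[66, 201, 32])

# axis=1 aligns on index labels; no label is shared, so every row
# gets a NaN in one of the two columns.
misaligned = pd.concat([first_100, first_100_train], axis=1)

# After resetting both indexes, rows are paired by position.
aligned = pd.concat([first_100.reset_index(drop=True),
                     first_100_train.reset_index(drop=True)], axis=1)

print(misaligned)
print(aligned)
```

Note that this pairs row i of one frame with row i of the other, so it is only correct if both are in the same row order, and the original labels of first_100_train are discarded.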

Or, if the first DataFrame already has a default RangeIndex, the solution simplifies to:

pd.concat([first_100, 
           first_100_train.reset_index(drop=True)], axis=1)
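
If the original index labels of first_100_train should be kept instead, one possible alternative (not part of the answer above) is to assign the predictions as a plain NumPy array, which pandas places by position rather than by label. The sketch below also drops an unwanted column by name, as asked in the question's edit. It assumes the predictions are in the same row order as first_100_train; the column names "Actual" and "Extra" and all values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data: predictions in row order, real values with shuffled labels.
first_100 = pd.DataFrame(np.array([0.1, 0.2, 0.3]))  # single column named 0
first_100_train = pd.DataFrame({"Actual": [1.0, 2.0, 3.0],
                                "Extra": [7, 8, 9]},
                               index=[66, 201, 32])

result = first_100_train.copy()

# to_numpy() strips the 0, 1, 2 index, so assignment is positional
# and the labels 66, 201, 32 are preserved.
result["Prediction"] = first_100[0].to_numpy()

# Drop a column that is no longer wanted ("Extra" is a placeholder name).
result = result.drop(columns=["Extra"])

print(result)
```

Assigning first_100[0] directly, as in the question, aligns on labels instead, which is why the values did not match up there.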