Appending to a pandas DataFrame with a two-level index


An empty pandas DataFrame is created:

    results = pd.DataFrame(columns=['age', 'timestamp', 'score']).set_index(['age', 'timestamp'])

and more DataFrames will be appended to the initial results DataFrame.

    result = pd.DataFrame({'age': age,
                          'timestamp': timestamp,
                          'score': score
                            }).set_index(['age', 'timestamp'])

    # error then occurs at this point

    results.append(result)

and we get the error:

    ValueError: If using all scalar values, you must pass an index

What's the proper way to append the second DataFrame?

There is 1 answer

Jianxun Li (best answer)

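Try this. The ValueError comes from the pd.DataFrame constructor, not from the append: when every value in the dict is a scalar, pandas cannot infer how many rows to create unless you pass an index. Since each new record has only one row, building a separate DataFrame for it also adds overhead. In your case you can pass the dict directly to the existing DataFrame via .loc.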

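Note that adding records one by one is not performance-efficient. But if it is an unavoidable part of your code logic, .loc will perform far better than calling DataFrame.append() or pd.concat() once per record. (DataFrame.append() also returns a new DataFrame rather than modifying results in place, and it was removed in pandas 2.0.)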
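If the row-by-row growth can be avoided, one common alternative is to collect the records in a plain Python list and build the DataFrame once at the end. The following is a minimal sketch under that assumption; the sample records are hypothetical and stand in for whatever your loop produces.

```python
import datetime as dt

import pandas as pd

# Hypothetical sample records; in real code these would be produced in a loop.
records = [
    {'age': 23, 'timestamp': dt.datetime(2015, 1, 1), 'score': 98},
    {'age': 31, 'timestamp': dt.datetime(2015, 1, 2), 'score': 75},
]

# Build the DataFrame once from the list of dicts,
# then move 'age' and 'timestamp' into a two-level index.
results = pd.DataFrame(records).set_index(['age', 'timestamp'])
```

Appending to a list is cheap, and the two-level index from the question is only set once, after all rows are known.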

import datetime as dt
import pandas as pd

# create an empty df
results = pd.DataFrame(columns=['age', 'timestamp', 'score'])
# Out[71] (from the original session):
# Empty DataFrame
# Columns: [age, timestamp, score]
# Index: []

# write the new record as a dict; make sure the keys match the df column names
new_record = {'age': 23, 'timestamp': dt.datetime(2015, 1, 1), 'score': 98}
# use .loc to enlarge the current df; len(results) is the next unused integer label
results.loc[len(results)] = new_record

# Out[73] (from the original session):
#    age  timestamp  score
# 0   23 2015-01-01     98
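If you would rather keep the one-DataFrame-per-result structure from the question, the sketch below shows one way it might be made to work. It assumes that wrapping each scalar in a list is acceptable, which gives the constructor a row count, and that the pieces can be combined with a single pd.concat call. The helper name make_result and the sample values are hypothetical.

```python
import datetime as dt

import pandas as pd


def make_result(age, timestamp, score):
    """Hypothetical helper: a one-row frame indexed by (age, timestamp)."""
    # Wrapping each scalar in a list gives the constructor a length, which
    # avoids "If using all scalar values, you must pass an index".
    frame = pd.DataFrame({'age': [age],
                          'timestamp': [timestamp],
                          'score': [score]})
    return frame.set_index(['age', 'timestamp'])


# Hypothetical sample values standing in for the question's variables.
frames = [
    make_result(23, dt.datetime(2015, 1, 1), 98),
    make_result(31, dt.datetime(2015, 1, 2), 75),
]

# pd.concat returns a new DataFrame, so the result must be assigned.
results = pd.concat(frames)
```

Because every piece already carries the same two-level index, the concatenated result keeps it without a further set_index call.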