I have 5000 JSON data points that I iterate over and store in a DataFrame.
I first collect the data as a list of Series and then add that list to the DataFrame, using the code below:

1. (5000 times) pd.Series([trading_symbol, instrument_token], index=stock_instrument_token_df.columns)

then:

2. (once) stock_instrument_token_df.append(listOfSeries, ignore_index=True)

Step 1 takes around 700-800 ms and step 2 around 200-300 ms,
so the whole process takes about 1 second.

Before this, I iterate through another set of 50,000 JSON data points and add them to a Python dict. That takes only around 300 ms.

Is there a faster way to insert rows into a DataFrame?
Is there something wrong with the way I am looping through the data or inserting it into the DataFrame?
More generally, is there a faster way to do this kind of work with a DataFrame?

Complete code as requested, in case it helps:

    import pandas as pd

    # Empty frame that defines the two target columns.
    stock_instrument_token_df = pd.DataFrame(columns=['first', 'second'])
    listOfSeries = []
    for data in api_response:
        trading_symbol = data[Constants.tradingsymbol]
        instrument_token = data[Constants.instrument_token]
        # Step 1: one pd.Series per record (about 5000 of them).
        listOfSeries.append(
            pd.Series([trading_symbol, instrument_token], index=stock_instrument_token_df.columns))
    # Step 2: append all rows at once.
    stock_instrument_token_df = stock_instrument_token_df.append(listOfSeries, ignore_index=True)
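
For comparison, here is a minimal sketch of one possible alternative, not a measured fix: collect plain tuples in a list and construct the DataFrame in a single call, so no per-row pd.Series objects are created. The sample api_response and the two key constants are hypothetical stand-ins for the real response and for Constants.tradingsymbol / Constants.instrument_token. It also avoids DataFrame.append, which was removed in pandas 2.0.

```python
import pandas as pd

# Hypothetical stand-in for the real API response (assumed shape: list of dicts).
api_response = [
    {"tradingsymbol": f"SYM{i}", "instrument_token": i} for i in range(5000)
]

# Hypothetical key names; the original code reads them from Constants.
TRADINGSYMBOL_KEY = "tradingsymbol"
INSTRUMENT_TOKEN_KEY = "instrument_token"

# Collect plain tuples instead of building one pd.Series per record.
rows = [
    (data[TRADINGSYMBOL_KEY], data[INSTRUMENT_TOKEN_KEY])
    for data in api_response
]

# Build the DataFrame once from the collected rows.
stock_instrument_token_df = pd.DataFrame(rows, columns=["first", "second"])
```

Whether this is actually faster on the real data would need to be timed, for example with time.perf_counter around each variant.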
