I have an Excel sheet (Bloomberg Data License output) that I read in with
import pandas as pd
raw_data = pd.read_excel('my-file.xlsx')
There is one column (START-OF-FILE) and a varying number of rows, depending on the amount of data returned.
I am interested in the data between two rows, specifically START-OF-DATA and END-OF-DATA. The rows in the column look like
19 START-OF-DATA
20 WTS Equity|0|6|WTS|50545|54.440000|54.000000|5...
21 XOM Equity|0|6|XOM|6555175|84.950000|85.300000...
22 SUP Equity|0|6|SUP|27405|19.250000|19.200000|1...
23 END-OF-DATA
with a varying number of rows (not always 20 to 22). How can I filter the rows in the column to only the data between those two cells, i.e. something like raw_data['START-OF-FILE']['START-OF-DATA' : 'END-OF-DATA'], and then use str.split('|') to separate the pipe-delimited data into separate columns for a new dataframe?
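For reference, here is a minimal sketch of the kind of result I am after. It assumes each marker appears exactly once in the column. The label slice above cannot work as written, because the markers are cell values rather than index labels, so the sketch looks up their positions instead. The small frame below is a hypothetical stand-in for my file: the header and footer rows are placeholders, and the data rows are cut down to the fields visible above.

```python
import pandas as pd

# Hypothetical stand-in for the frame returned by pd.read_excel('my-file.xlsx').
# Header/footer rows are placeholders; data rows are truncated to the fields
# visible in the question.
raw_data = pd.DataFrame({
    "START-OF-FILE": [
        "placeholder header row",
        "START-OF-DATA",
        "WTS Equity|0|6|WTS|50545|54.440000|54.000000",
        "XOM Equity|0|6|XOM|6555175|84.950000|85.300000",
        "SUP Equity|0|6|SUP|27405|19.250000|19.200000",
        "END-OF-DATA",
        "placeholder footer row",
    ]
})

# Work on a clean string version of the single column.
col = raw_data["START-OF-FILE"].astype(str).str.strip()

# Positions of the first row matching each marker
# (assumes both markers are present exactly once).
start = (col == "START-OF-DATA").to_numpy().argmax()
end = (col == "END-OF-DATA").to_numpy().argmax()

# Keep only the rows strictly between the two markers.
block = col.iloc[start + 1 : end]

# Split each pipe-delimited string into separate columns of a new dataframe.
data = block.str.split("|", expand=True).reset_index(drop=True)

print(data)
```

The resulting columns are labelled 0, 1, 2, ... since the file excerpt carries no field names; they would have to be assigned separately.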