I have a csv file with the following structure:
A, 10
B, 11
C, 8
D, 12
A, 21
B, 7
D, 22
D, 15
C, 111
D, 50
A, 41
B, 32
C, 19
D, 11
I want to read the entire file, and save the data on the second column, if the row is like A, B, C, D format. I have a list:
my_list = [A, B, C, D]
And I check each 4 row, if it is in my_list
format, then read and save to the database like:
with open('csv_file.csv', 'rb') as csvfile:
the_file = csv.reader(csvfile.read().splitlines())
for idx, row in enumerate(islice(zip(*[the_file]*4), 100)):
my_model = Django_model()
if row[0][0] == my_list[0]:
if row[0][0] == my_list[0] and row[1][0] == my_list[1] and \
row[2][0] == my_list[2] and row[3][0] == my_list[3]:
my_model.a_field = row[0][1]
my_model.b_field = row[1][1]
my_model.c_field = row[2][1]
my_model.d_field = row[3][1]
my_model.save()
The fact is that it is working if and ONLY if the structure of the row
is the same as my_list
. But when it gets to the messy part (A, B, D, D, C, D) it will not read the rows properly, and therefore many rows are skipped.
The question is that, how can I jump to the next interesting row (which follows my_list
format) and read it? Meanwhile save the the skipped rows in another list?
I have heard that Pandas could help, but I have been through the documentation, and I couldn't find a way to solve this case.
You can extract the pattern and corresponding values like this:
Output: